home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Source Code 1992 March
/
Source_Code_CD-ROM_Walnut_Creek_March_1992.iso
/
usenet
/
altsrcs
/
3
/
3147
/
README
< prev
next >
Wrap
Text File
|
1991-03-28
|
38KB
|
1,004 lines
To: oesterle@wpi.wpi.edu
Subject: Re: Binary Tree Re-balancing
Newsgroups: comp.lang.c
In-Reply-To: <10403@wpi.wpi.edu>
Organization: Harris Computer Systems, Fort Lauderdale, FL
Cc:
Bcc:
In article <10403@wpi.wpi.edu> you write:
>
>Greetings!
>
> I would like to make a request for some references or C code
>for balancing binary trees after insertion/deletion. I have been
>looking at two books (_Algorithms + Data Structures = Programs_,
>by Nicklaus Wirth; and _Algorithms: Their Complexity and
>Efficiency_ by Lydia Kronsj:o). Both the books' code is in
>PASCAL; I would more or less want to compare to see if I am
>interpreting the code correctly, since both books implement the
>balancing differently and I still want to implement it
>differently than both of the books. I plan to keep the balancing
>information in structures for speed. Simplicity is much
>desirable, recursive is great.
>
I have spent a couple of years doing exactly this. I have implemented
what I feel is a very elegant and easy to understand solution and the
result is a C library for handling AVL trees as an abstract type. If
you want me to mail my code, then let me know! I will give a bit of a
discussion though.
First of all, I use a balance factor that ranges from -2 .. 2,
Many texts Ive seen use -1 .. 1 but I feel -2 .. 2 is easier to
understand and provides for more simple balance updating.
It is also UNNECESSARY to use separate routines to do left rotations,
and right rotations, one routine that takes the direction as an
extra parameter will do.
CALCULATING NEW BALANCES AFTER A ROTATION:
==========================================
To calculate the new balances after a single left rotation; assume we have
the following case:
A B
/ \ / \
/ \ / \
a B ==> A c
/ \ / \
/ \ / \
b c a b
The left is what the tree looked like BEFORE the rotation and the right
is what the tree looks like after the rotation. Capital letters are used
to denote single nodes and lowercase letters are used to denote subtrees.
The "balance" of a tree is the height of its right subtree less the
height of its left subtree. Therefore, we can calculate the new balances
of "A" and "B" as follows (ht is the height function):
NewBal(A) = ht(b) - ht(a)
OldBal(A) = ht(B) - ht(a)
= ( 1 + max (ht(b), ht(c)) ) - ht(a)
subtracting the second equation from the first yields:
NewBal(A) - OldBal(A) = ht(b) - ( 1 + max (ht(b), ht(c)) )
+ ht(a) - ht(a)
canceling out the ht(a) terms and adding OldBal(A) to both sides yields:
NewBal(A) = OldBal(A) - 1 - (max (ht(b), ht(c)) - ht(b) )
Noting that max(x, y) - z = max(x-z, y-z), we get:
NewBal(A) = OldBal(A) - 1 - (max (ht(b) - ht(b), ht(c) - ht(b)) )
But ht(c) - ht(b) is OldBal(B) so we get:
NewBal(A) = OldBal(A) - 1 - (max (0, OldBal(B)) )
= OldBal(A) - 1 - max (0, OldBal(B))
Thus, for A, we get the equation:
NewBal(A) = OldBal(A) - 1 - max (0, OldBal(B))
To calculate the Balance for B we perform a similar computation:
NewBal(B) = ht(c) - ht(A)
= ht(c) - (1 + max(ht(a), ht(b)) )
OldBal(B) = ht(c) - ht(b)
subtracting the second equation from the first yields:
NewBal(B) - OldBal(B) = ht(c) - ht(c)
+ ht(b) - (1 + max(ht(a), ht(b)) )
canceling, and adding OldBal(B) to both sides gives:
NewBal(B) = OldBal(B) - 1 - (max(ht(a), ht(b)) - ht(b))
= OldBal(B) - 1 - (max(ht(a) - ht(b), ht(b) - ht(b))
But ht(a) - ht(b) is - (ht(b) - ht(a)) = -NewBal(A), so ...
NewBal(B) = OldBal(B) - 1 - max( -NewBal(A), 0)
Using the fact that min(x,y) = -max(-x, -y) we get:
NewBal(B) = OldBal(B) - 1 + min( NewBal(A), 0)
So, for a single left rotation we have shown the the new balances
for the nodes A and B are given by the following equations:
NewBal(A) = OldBal(A) - 1 - max(OldBal(B), 0)
NewBal(B) = OldBal(B) - 1 + min(NewBal(A), 0)
Now let us look at the case of a single right rotation. The case
we will use is the same one we used for the single left rotation
only with all the left and right subtrees switched around so that
we have the mirror image of the case we used for our left rotation.
A B
/ \ / \
/ \ / \
B a ==> c A
/ \ / \
/ \ / \
c b b a
If we perform the same calculations that we made for the left rotation,
we will see that the new balances for a single right rotation are given
by the following equations:
NewBal(A) = OldBal(A) + 1 - min(OldBal(B), 0)
NewBal(B) = OldBal(B) + 1 + max(NewBal(A), 0)
Hence, the C code for single left and right rotations would be:
#define LEFT 0
#define RIGHT 1
#define MAX(a,b) ( (a) > (b) ? (a) : (b) )
#define MIN(a,b) ( (a) < (b) ? (a) : (b) )
typedef struct avl_node {
DATA_TYPE data;
short bal;
struct avl_node *subtree[2];
} AVL_NODE, *AVL_TREE; /* avl_node */
void rotate_left (tree)
AVL_TREE tree;
{
AVL_TREE old_root = tree;
/* perform rotation */
tree = tree->subtree[RIGHT];
old_root->subtree[RIGHT] = tree->subtree[LEFT];
tree->subtree[LEFT] = old_root;
/* update balances */
old_root->bal -= ( 1 + MAX(tree->bal, 0) );
tree->bal -= ( 1 - MIN(old_root->bal, 0) );
}/* rotate_left */
void rotate_right (tree)
AVL_TREE tree;
{
AVL_TREE old_root = tree;
/* perform rotation */
tree = tree->subtree[LEFT];
old_root->subtree[LEFT] = tree->subtree[RIGHT];
tree->subtree[RIGHT] = old_root;
/* update balances */
old_root->bal += ( 1 - MIN(tree->bal, 0) );
tree->bal += ( 1 + MAX(old_root->bal, 0) );
}/* rotate_right */
We can make this code more compact however by using only ONE
rotate routine which takes an additional parameter: the direction
in which to rotate. Notice that I have defined LEFT, and RIGHT to
be mnemonic constants to index into an array of subtrees. I can
pass the constant LEFT or RIGHT to the rotation routine and it can
calculate the direction opposite the given direction by subtracting
the given direction from the number one.
It does not matter whether LEFT is 0 or RIGHT is 0 as long as one
of them is 0 and the other is 1. If this is the case, then:
1 - LEFT = RIGHT
and
1 - RIGHT = LEFT
Using this and the same type definitions as before (and the same
macros), the C source for a single rotation becomes:
#define OTHER_DIR(x) ( 1 - (x) )
typedef short DIRECTION
void rotate (tree, dir)
AVL_TREE tree;
DIRECTION dir;
{
AVL_TREE old_root = tree;
DIRECTION other_dir = OTHER_DIR(dir);
/* rotate */
tree = tree->subtree[other_dir];
old_root->subtree[other_dir] = tree->subtree[dir];
tree->subtree[dir] = old_root;
/* update balances */
if ( dir == LEFT ) {
old_root->bal -= ( 1 + MAX(tree->bal, 0) );
tree->bal -= ( 1 - MIN(old_root->bal, 0) );
}/* if */
else /* dir == RIGHT */ {
old_root->bal += ( 1 - MIN(tree->bal, 0) );
tree->bal += ( 1 + MAX(old_root->bal, 0) );
}/* else */
}/* rotate */
We can compact this code even further if we play around with the
equations for updating the balances. Let us use the fact that
max(x,y) = -min(-x,-y):
for a left rotation
------------------------------------------------
old_root->bal -= ( 1 + MAX(tree->bal, 0) );
tree->bal -= ( 1 - MIN(old_root->bal, 0) );
for a right rotation
------------------------------------------------
old_root->bal += ( 1 - MIN(tree->bal, 0) );
tree->bal += ( 1 + MAX(old_root->bal, 0) );
Using the above rule to change all occurrences of "MIN" to "MAX"
these equations become:
for a left rotation
------------------------------------------------
old_root->bal -= ( 1 + MAX( +(tree->bal), 0) );
tree->bal -= ( 1 + MAX( -(old_root->bal), 0) );
for a right rotation
------------------------------------------------
old_root->bal += ( 1 + MAX( -(tree->bal), 0) );
tree->bal += ( 1 + MAX( +(old_root->bal), 0) );
Note that the difference between updating the balances for our right
and left rotations is only the occurrence of a '+' where we would
like to see a '-' in the assignment operator, and the sign of the
first argument to the MAX macro. If we had a function that would
map LEFT to +1 and RIGHT to -1 we could multiply by the result
of that function to update our balances. Such a function is
f(x) = 1 - 2x
"f" maps 0 to 1 and maps 1 to -1. This function will NOT map LEFT
and RIGHT to the same value regardless of which is 1 and which is
0 however. If we wish our function to have this property then we
can multiply (1 - 2x) by (RIGHT - LEFT) so that the result "switches"
signs accordingly depending upon whether LEFT is 0 or RIGHT is 0.
This defines a new function "g":
g(x) = (1 - 2x)(RIGHT - LEFT)
If LEFT = 0 and RIGHT = 1 then:
g(LEFT) = (1 - 2*0)(1 - 0) = 1*1 = 1
g(RIGHT) = (1 - 2*1)(1 - 0) = (-1)*1 = -1
If LEFT = 1 and RIGHT = 0 then:
g(LEFT) = (1 - 2*1)(0 - 1) = (-1)*(-1) = 1
g(RIGHT) = (1 - 2*0)(0 - 1) = 1*(-1) = -1
So, as desired, the function "g" maps LEFT to +1 and RIGHT to -1
regardless of which is 0 and which is 1.
Now, if we introduce a new variable called "factor" and assign
it the value "g(dir)", we may update the balances in our rotation
routine without using a conditional statement:
for a rotation in the "dir" direction
------------------------------------------------------------
old_root->bal -= factor * ( 1 + MAX( factor * tree->bal, 0) );
tree->bal += factor * ( 1 + MAX( factor * old_root->bal, 0) );
Using this, the new code for our rotation routine becomes:
#define OTHER_DIR(x) ( 1 - (x) )
typedef short DIRECTION
void rotate (tree, dir)
AVL_TREE tree;
DIRECTION dir;
{
AVL_TREE old_root = tree;
DIRECTION other_dir = OTHER_DIR(dir);
short factor = (RIGHT - LEFT) * (1 - (2 * dir));
/* rotate */
tree = tree->subtree[other_dir];
old_root->subtree[other_dir] = tree->subtree[dir];
tree->subtree[dir] = old_root;
/* update balances */
old_root->bal -= factor * ( 1 + MAX( factor * tree->bal, 0) );
tree->bal += factor * ( 1 + MAX( factor * old_root->bal, 0) );
}/* rotate */
However, although this second version of "rotate" is more compact and
doesn't require the use of a conditional test on the variable "dir",
It may actually run slower than our first version of "rotate" because
the time required to make the "test" may well be less than the time
required to perform the additional multiplications and subtractions.
Now a double rotation can be implemented as a series of single rotations:
/*
* rotate_twice() -- rotate a given node in the given direction
* and then in the opposite direction
* to restore the balance of a tree
*/
void rotate_twice( rootp, dir )
AVLtree *rootp;
DIRECTION dir;
{
DIRECTION other_dir = OPPOSITE( dir );
rotate_once( &( (*rootp) -> subtree[ other_dir ] ), other_dir );
rotate_once( rootp, dir );
}/* rotate_twice */
ANOTHER METHOD FOR CALCULATING BALANCES AFTER ROTATION:
=======================================================
One may use a different method than the one described above which
is perhaps simpler. Note however that the method for updating balances
described above works regardless of what numbers the balance factor
may contain (as long as they are correct -- it works, no matter how
imbalanced). If we take into account some of conditions that cause
a rotation, we have more information to work with (such that the
node to be rotated has a balance of +2 or -2 etc..)
For a single LL rotation we have one of two possibilities:
A B
/ \ / \
/ \ / \
a B ==> A c
/ \ / \
/ \ / \
b c a b
==============================================================
BALANCE FACTORS
BEFORE ROTATION AFTER ROTATION
------------------ ----------------
case 1) A = +2 ; B = +1 A = 0 ; B = 0
case 2) A = +2 ; B = 0 A = +1 ; B = -1
==============================================================
so in either case NewB = OldB -1 and newA = -newB so we get
A = - (--B) for a single left rotation.
For a single RR rotation the possibilities are (The picture is a
mirror image (swap all right and left kids of each node) of the LL one)
==============================================================
BALANCE FACTORS
BEFORE ROTATION AFTER ROTATION
------------------ ----------------
case 1) A = -2 ; B = -1 A = 0 ; B = 0
case 2) A = -2 ; B = 0 A = -1 ; B = +1
==============================================================
so in either case NewB = OldB +1 and newA = -newB so we get
A = - (++B) for a single left rotation.
This means that we can use the following to update balances:
/* Directional Definitions */
typedef short DIRECTION;
#define LEFT 0
#define RIGHT 1
#define OPPOSITE(x) ( 1 - (x) )
/* return codes used by avl_insert(), avl_delete(), and balance() */
#define HEIGHT_UNCHANGED 0
#define HEIGHT_CHANGED 1
/*
* rotate_once() -- rotate a given node in the given direction
* to restore the balance of a tree
*/
short rotate_once( rootp, dir )
AVLtree *rootp;
DIRECTION dir;
{
DIRECTION other_dir = OPPOSITE( dir ); /* opposite direction to "dir" */
AVLtree old_root = *rootp; /* copy of original root of tree */
short ht_unchanged; /* true if height unchanged */
/* Here we need to take into account the special case that occurs when a
** single rotation was made but the HEIGHT of the rotated tree did NOT
** change as a result of the rotation (we will need this later)
*/
ht_unchanged = ( (*rootp) -> subtree[ other_dir ] -> bal ) ? FALSE : TRUE;
/* assign new root */
*rootp = old_root -> subtree[ other_dir ];
/* new-root exchanges it's "dir" subtree for it's parent */
old_root -> subtree[ other_dir ] = (*rootp) -> subtree[ dir ];
(*rootp) -> subtree[ dir ] = old_root;
/* update balances */
old_root -> bal = -( dir == LEFT ? --( (*rootp) -> bal )
: ++( (*rootp) -> bal ) );
return ht_unchanged;
}/* rotate_once */
We get an even nicer scenario when we look at LR and RL rotations.
For a double LR rotation we have one of three possibilities:
A B
/ \ / \
/ \ / \
a C ==> A C
/ \ / \ / \
/ \ / | | \
B c a b1 b2 c
/ \
/ \
b1 b2
==============================================================
BALANCE FACTORS
BEFORE ROTATION AFTER ROTATION
------------------------ -----------------------
A = +2 ; C = -1 ; B = +1 A = -1 ; B = 0 ; C = 0
A = +2 ; C = -1 ; B = 0 A = 0 ; B = 0 ; C = 0
A = +2 ; C = -1 ; B = -1 A = 0 ; B = 0 ; C =+1
==============================================================
So we get, in all three cases:
newA = -max( oldB, 0 )
newC = -min( oldB, 0 )
newB = 0
Now for a double RL rotation we have the following possibilities (again, the
picture is the mirror image of the LR case):
==============================================================
BALANCE FACTORS
BEFORE ROTATION AFTER ROTATION
------------------------ -----------------------
A = -2 ; C = +1 ; B = +1 A = -1 ; B = 0 ; C = 0
A = -2 ; C = +1 ; B = 0 A = 0 ; B = 0 ; C = 0
A = -2 ; C = +1 ; B = -1 A = 0 ; B = 0 ; C =+1
==============================================================
So we get, in all three cases:
newA = -max( oldB, 0 )
newC = -min( oldB, 0 )
newB = 0
This is EXACTLY what we had for the LR case (isnt that nice!!!) so now we can
code up a double rotation as follows:
/*
* rotate_twice() -- rotate a given node in the given direction
* and then in the opposite direction
* to restore the balance of a tree
*/
void rotate_twice( rootp, dir )
AVLtree *rootp;
DIRECTION dir;
{
DIRECTION other_dir = OPPOSITE( dir );
AVLtree old_root = *rootp,
old_other_dir_subtree = (*rootp) -> subtree[ other_dir ];
/* assign new root */
*rootp = old_root -> subtree[ other_dir ] -> subtree[ dir ];
/* new-root exchanges it's "dir" subtree for it's grandparent */
old_root -> subtree[ other_dir ] = (*rootp) -> subtree[ dir ];
(*rootp) -> subtree[ dir ] = old_root;
/* new-root exchanges it's "other-dir" subtree for it's parent */
old_other_dir_subtree -> subtree[ dir ] = (*rootp) -> subtree[ other_dir ];
(*rootp) -> subtree[ other_dir ] = old_other_dir_subtree;
/* update balances */
(*rootp) -> subtree[ LEFT ] -> bal = -MAX( (*rootp) -> bal, 0 );
(*rootp) -> subtree[ RIGHT ] -> bal = -MIN( (*rootp) -> bal, 0 );
(*rootp) -> bal = 0;
}/* rotate_twice */
Now isnt that special! Now that we have the rotation routines written,
we just need to worry about when to call them. One help is a routine called
balance which is called when a node gets too heavy on a particular side.
Here we need to take into account the special case that occurs when a
single rotation was made but the HEIGHT of the rotated tree did NOT change
as a result of the rotation (discussed in more detail in the next section):
/* return codes used by avl_insert(), avl_delete(), and balance() */
#define HEIGHT_UNCHANGED 0
#define HEIGHT_CHANGED 1
/* Balance Definitions */
#define LEFT_HEAVY -1
#define BALANCED 0
#define RIGHT_HEAVY 1
#define LEFT_IMBALANCE(nd) ( (nd)->bal < LEFT_HEAVY )
#define RIGHT_IMBALANCE(nd) ( (nd)->bal > RIGHT_HEAVY )
/*
* balance() -- determines and performs the sequence of rotations needed
* (if any) to restore the balance of a given tree.
*
* Returns 1 if tree height changed due to rotation; 0 otherwise
*/
short balance( rootp )
AVLtree *rootp;
{
short special_case = FALSE;
if ( LEFT_IMBALANCE( *rootp ) ) { /* need a right rotation */
if ( (*rootp) -> subtree[ LEFT ] -> bal == RIGHT_HEAVY )
rotate_twice( rootp, RIGHT ); /* double RL rotation needed */
else /* single RR rotation needed */
special_case = rotate_once( rootp, RIGHT );
}/* if */
else if ( RIGHT_IMBALANCE( *rootp ) ) { /* need a left rotation */
if ( (*rootp) -> subtree[ RIGHT ] -> bal == LEFT_HEAVY )
rotate_twice( rootp, LEFT ); /* double LR rotation needed */
else /* single LL rotation needed */
special_case = rotate_once( rootp, LEFT );
}/* elif */
else return HEIGHT_UNCHANGED; /* no rotation occurred */
return ( special_case ) ? HEIGHT_UNCHANGED : HEIGHT_CHANGED;
}/* balance */
This routine helps but now comes the hard part (IMHO at least), figuring
out when the height of the current subtree has changed.
CALCULATING WHEN THE HEIGHT OF THE CURRENT SUBTREE HAS CHANGED:
===============================================================
After we have inserted or deleted a node from the current subtree, we
need to determine if the total height of the current tree has changed
so that we may pass the information up the recursion stack to previous
instantiations of the insertion and deletion routines.
Let us first consider the case of an insertion. The simplest case is
at the point where the insertion occurred. Since we have created a node
where one did not previously exist, we have increased the height of the
inserted node from 0 to 1. Therefore we need to pass the value 1 (I will
use "1" for TRUE and "0" for FALSE) back to the previous level of
recursion to indicate the increase in the height of the current subtree.
| after insertion |
NULL ================> |
A
The remaining cases for an insertion are almost as simple. If a 0 (FALSE)
was the "height-change-indicator" passed back by inserting into a subtree
of the current level, then there is no height change at the current level.
It is true that the structure of one of the subtrees may have changed due
to an insertion and/or rotation, but since the height of the subtree did
not change, neither did the height of the current level.
| after insertion |
| ================> |
A A
/ \ / \
/ \ / \
b c b d
If the current level is balanced after inserting the node (but before
attempting any rotations) then we just made one subtree equal in height
to the other. Therefore the overall height of the current level remains
unchanged and a 0 is returned.
| after insertion |
| ================> |
A A
/ / \
/ / \
b b c
Before we go and write avl_insert, we will need some help from a few other
small routines, most of which are needed for deletion more than for insertion.
These routines will free/create a node, and tell the type of a node.
/* ckalloc( size ) -- allocate space; check for success */
void *ckalloc( size )
int size;
{
void *malloc(), *ptr;
if ( (ptr = (char *) malloc( (unsigned) size)) == NULL ) {
fprintf( stderr, "Unable to allocate storage." );
exit( 1 );
}/* if */
return ptr;
}/* ckalloc */
/*
* new_node() -- get space for a new node and its data;
* return the address of the new node
*/
AVLtree new_node( data, size )
void *data;
unsigned size;
{
AVLtree root;
root = (AVLtree) ckalloc( sizeof (AVLnode) );
root -> data = (void *) ckalloc( size );
memmove( root -> data, data, size );
root -> bal = BALANCED;
root -> subtree[ LEFT ] = root -> subtree[ RIGHT ] = NULL_TREE;
return root;
}/* new_node */
/*
* free_node() -- free space for a node and its data!
* reset the node pointer to NULL
*/
void free_node( rootp )
AVLtree *rootp;
{
free( (void *) *rootp );
*rootp = NULL_TREE;
}/* free_node */
/*
* node_type() -- determine the number of null pointers for a given
* node in an AVL tree, Returns a value of type NODE
* which is an enumeration type with the following values:
*
* IS_TREE -- both subtrees are non-empty
* IS_LBRANCH -- left subtree is non-empty; right is empty
* IS_RBRANCH -- right subtree is non-empty; left is empty
* IS_LEAF -- both subtrees are empty
* IS_NULL -- given tree is empty
*/
NODE node_type( tree )
AVLtree tree;
{
if ( tree == NULL_TREE )
return IS_NULL;
else if ( (tree -> subtree[ LEFT ] != NULL_TREE)
&& (tree -> subtree[ RIGHT ] != NULL_TREE) )
return IS_TREE;
else if ( tree -> subtree[ LEFT ] != NULL_TREE )
return IS_LBRANCH;
else if ( tree -> subtree[ RIGHT ] != NULL_TREE )
return IS_RBRANCH;
else
return IS_LEAF;
}/* node_type */
Now the last helper routines we need will be a dirty trick. We need a function
to compare items while we traverse the tree. Normally, we expect this compare
function to return a strcmp() type result (<0, =0, or >0 for <,=,> respectively)
We will be a little sneaky and write our own compare function which will take
the supplied compare function, and the node-type of the node being compared
against. This will help in deletion when we want to delete the minimal/maximal
element of a given tree. We will go about it in such a way that we can delete
or find a given item, or the min/max item in a tree. We do this as follows:
/*
* avl_min() -- compare function used to find the minimal element in a tree
*/
int avl_min( elt1, elt2, nd_typ )
void *elt1, *elt2;
NODE nd_typ;
{
if ( nd_typ == IS_RBRANCH || nd_typ == IS_LEAF )
return 0; /* left subtree is empty -- this is the minimum */
else return -1; /* keep going left */
}/* avl_min */
/*
* avl_max() -- compare function used to find the maximal element in a tree
*/
int avl_max( elt1, elt2, nd_typ )
void *elt1, *elt2;
NODE nd_typ;
{
if ( nd_typ == IS_RBRANCH || nd_typ == IS_LEAF )
return 0; /* right subtree is empty -- this is the maximum */
else return 1; /* keep going right */
}/* avl_max */
Now we can use avl_min/avl_max as compare functions. If the compare
function is other than one of these two, usually it will only use the first
two parameters (so it gets an extra one). This is not great for portability
so we should really do something like:
/*
* avl_compare() -- compare an item with a node-item in an avl tree
*/
int avl_compare( elt1, elt2, nd_typ, compar )
void *elt1, *elt2;
NODE nd_typ;
int (*compar)();
{
if ( compare == avl_min || compar == avl_max )
return (*compar)( elt1, elt2, nd_typ );
else
return (*compare)( elt1, elt2 );
}/* avl_compare */
If you wish to do this you may, but my code works without it, it just ignores
any extra parameters. If you have ANSI-C you may get some compiler complaints.
In any case, I shall proceed without using avl_compare(). In addition, avl_min
and avl_max should only be passed to avl_find (search without inserting),
and avl_delete (NOT avl_insert). We are now ready to write avl_insert:
/*
* avl_insert() -- insert an item into the given tree
*
* PARAMETERS:
* data -- a pointer to a pointer to the data to add;
* On exit, *data is NULL if insertion succeeded,
* otherwise address of the duplicate key
* rootp -- a pointer to an AVL tree
* compar -- name of the function to compare 2 data items
*/
short avl_insert( data, rootp, compar )
void **data;
AVLtree *rootp;
int (*compar)();
{
short increase;
int cmp;
if ( *rootp == NULL_TREE ) { /* insert new node here */
*rootp = new_node( *data, SIZE_OF_DATA )) );
*data = NULL; /* set return value in data */
return HEIGHT_CHANGED;
}/* if */
cmp = (*compar)( *data, (*rootp) -> data ); /* compare data items */
if ( cmp < 0 ) { /* insert into the left subtree */
increase = -avl_insert( data, &( (*rootp) -> subtree[ LEFT ] ), compar );
if ( *data != NULL ) return HEIGHT_UNCHANGED;
}/* elif */
else if ( cmp > 0 ) { /* insert into the right subtree */
increase = avl_insert( data, &( (*rootp) -> subtree[ RIGHT ] ), compar );
if ( *data != NULL ) return HEIGHT_UNCHANGED;
}/* elif */
else { /* data already exists */
*data = (*rootp) -> data; /* set return value in data */
return HEIGHT_UNCHANGED;
} /* else */
(*rootp) -> bal += increase; /* update balance factor */
/************************************************************************
* re-balance if needed -- height of current tree increases only if its
* subtree height increases and the current tree needs no rotation.
************************************************************************/
if ( increase && (*rootp) -> bal )
return ( 1 - balance( rootp ) );
else
return HEIGHT_UNCHANGED;
}/* avl_insert */
Deletion is more complicated than insertion. The height of the current level
may decrease for two reasons: either a rotation occurred to decrease the
height of a subtree (and hence the current level), or a subtree shortened
in height resulting in a now balanced current level (subtree was "trimmed
down" to the same size as the other). Just because a rotation has occurred
however, does not mean that the subtree height has decreased. There is a
special case where rotating preserves the current subtree height.
Suppose I have a tree as follows:
C
/ \
A E
/ \
D F
Deleting "A" results in the following (imbalanced) tree:
C
\
E
/ \
D F
This type of imbalance cannot occurr during insertion, only during
deletion. Notice that the root has a balance of 2 but its heavy subtree
has a balance of zero (the other case would be a -2 and a 0). Performing
a single left rotation to restore the balance results in:
E
/ \
C F
\
D
This tree has the same height as it did before it was rotated. Hence,
we may determine if deletion caused the subtree height to change by
seeing if one of the following occurred:
1) If the new balance (after deletion) is zero and NO rotation
took place.
2) If a rotation took place but was NOT one of the special rotations
mentioned above (a -2:0 or a 2:0 rotation).
For insertion, we only needed to check if a rotation occurred to see if the
subtree height had changed. But for deletion we need to ckeck all of the above.
So for deletion of a node we have:
/*
* avl_delete() -- delete an item from the given tree
*
* PARAMETERS:
* data -- a pointer to a pointer to the key to delete
* On exit, *data points to the deleted data item
* (or NULL if deletion failed).
* rootp -- a pointer to an AVL tree
* compar -- name of function to compare 2 data items
*/
PRIVATE short avl_delete( data, rootp, compar )
void **data;
AVLtree *rootp;
int (*compar)();
{
short decrease;
int cmp;
AVLtree old_root = *rootp;
NODE nd_typ = node_type( *rootp );
DIRECTION dir = (nd_typ == IS_LBRANCH) ? LEFT : RIGHT;
if ( *rootp == NULL_TREE ) { /* data not found */
*data = NULL; /* set return value in data */
return HEIGHT_UNCHANGED;
}/* if */
/* NOTE the extra parameter to compare this time */
cmp = compar( *data, (*rootp) -> data, nd_typ ); /* compare data items */
if ( cmp < 0 ) { /* delete from left subtree */
decrease = -avl_delete( data, &( (*rootp) -> subtree[ LEFT ] ), compar );
if ( *data == NULL ) return HEIGHT_UNCHANGED;
}/* elif */
else if ( cmp > 0 ) { /* delete from right subtree */
decrease = avl_delete( data, &( (*rootp) -> subtree[ RIGHT ] ), compar );
if ( *data == NULL ) return HEIGHT_UNCHANGED;
}/* elif */
/************************************************************************
* At this point we know that if "cmp" is zero then "*rootp" points to
* the node that we need to delete. There are three cases:
*
* 1) The node is a leaf. Remove it and return.
*
* 2) The node is a branch (has only 1 child). Make "*rootp"
* (the pointer to this node) point to the child.
*
* 3) The node has two children. We swap data with the successor of
* "*rootp" (the smallest item in its right subtree) and delete
* the successor from the right subtree of "*rootp". The
* identifier "decrease" should be reset if the subtree height
* decreased due to the deletion of the sucessor of "rootp".
************************************************************************/
else { /* cmp == 0 */
*data = (*rootp) -> data; /* set return value in data */
switch ( nd_typ ) { /* what kind of node are we removing? */
case IS_LEAF :
free_node( rootp ); /* free the leaf, its height */
return HEIGHT_CHANGED; /* changes from 1 to 0, return 1 */
case IS_RBRANCH : /* only child becomes new root */
case IS_LBRANCH :
*rootp = (*rootp) -> subtree[ dir ];
free_node( &old_root ); /* free the deleted node */
return HEIGHT_CHANGED; /* we just shortened the "dir" subtree */
case IS_TREE :
decrease = avl_delete( &( (*rootp) -> data ),
&( (*rootp) -> subtree[ RIGHT ] ),
avl_min );
} /* switch */
} /* else */
(*rootp) -> bal -= decrease; /* update balance factor */
/**********************************************************************
* Rebalance if necessary -- the height of current tree changes if one
* of two things happens: (1) a rotation was performed which changed
* the height of the subtree (2) the subtree height decreased and now
* matches the height of its other subtree (so the current tree now
* has a zero balance when it previously did not).
**********************************************************************/
if ( decrease && (*rootp) -> bal ) /* return 1 if height */
return balance( rootp ); /* changed due to rotation */
else if ( decrease && !(*rootp) -> bal ) /* or if balance is 0 from */
return HEIGHT_CHANGED; /* height decrease of subtree */
else
return HEIGHT_UNCHANGED;
}/* avl_delete */
NOTICE how in the case of nd_typ == IS_TREE, I only need one statement. This
is due to the way avl_delete sets its parameters. The data pointer passed on
entrance points to the deleted node's data on exit. So I just delete the
minimal element of the right subtree, and steal its data as my-own (returning
my former data item on exit).
And there we have it, the mantainenance of AVL tree manipulations, the brunt
of which is covered in 5 routines, none of which (except for delete which
is less than 1-1/2pages) is greater than 1 normal page in length, including
comments (and there are a lot). The main routines are:
rotate_once(), rotate_twice(), balance(), avl_insert(), avl_delete().
All other routines are very small and auxillary in nature.