C/C++ Interactive Guide

home *** CD-ROM | disk | FTP | other *** search

/ C/C++ Interactive Guide / c-cplusplus-interactive-guide.iso / c_ref / csource3 / 155_01 / readme < prev next >

Wrap

Text File | 1987-10-09 | 16KB | 396 lines

B-Tree Library Developed by Ray Swartz The programming routines that make up this b-tree library are hereby put into the public domain. Any use, either private or commercial, is allowed upon the condition that a notice appear within the program documentation or on the terminal screen (during execution of the program) that these b-tree routines were developed by Ray Swartz. Such notice must be clear and obvious, appearing at the beginning of written documentation or displayed on the terminal screen during the initial phases of the program. Use of these routines without proper attribution is not allowed. This violates the spirit within which these routines were developed and put into the public domain. The author reserves the right to charge royalties of 10% of the gross revenue of any computer program using these routines WITHOUT the required notice. These routines create and manage a balanced binary tree. The advantages of these trees are well-known and will not be described here. However, for an excellent discussion of this, and a number of related topics, see D. Knuth, The Art of Computer Programming - Volume 3 (Sorting and Searching) pages 422 - 471. This book is published by Addison-Wesley, ISBN: 0-201-03803-X. The keyfile created to do this has two distinct parts. The first part holds the header information on the keyfile itself. This takes up the first 19 (DATA_LENGTH) bytes of the file and are arranged as follows: 1) A Tilde (~) - used to denote a keyfile 2) The length of this file's key field (2-byte integer (ascii)) 3) Node number of next available key node (5-byte integer (ascii)) 4) Node number of first node in list (5-byte integer (ascii)) 5) Number of non-deleted nodes in the list (5-byte integer (ascii)) 6) A Tilde This information is stored in the structure KEYINFO which is declared in the file BTREE.H. The second part of a keyfile is the key nodes. Each node holds: 1) A delete flag (YES if deleted, NO if active) (2 bytes) 2) A balance factor (-1 if unbalanced to left, 0 if balanced, 1 if unbalanced to right). (2 bytes) 3) The node number just to the left of this one. (5 bytes) 4) The node number just to the right of this one. (5 bytes) 5) The record number of the data containing this key. (5 bytes) 6) The key this node references (length depends on keyfile initialization) This information is stored in the structure NODE which is declared in the file BTREE.H. Before information can be stored in a keyfile, the file must be initialized by the TREEINIT program included with this library. The user must declare the length of the keys this file will hold. This length is a maximum that must remain constant. Changing the size of the key requires rebuilding the tree stored in the keyfile. No provisions are made for managing the data records referenced by these keys. This is the responsibility of the programmer using these routines. The heart of this system is the INSERT routine which not only inserts new keys into the tree but maintains the tree's balance as well. This routine, in it entirety, was implemented by following the algorithm in Knuth, The Art of Computer Programming Volume 3, Pages 455 - 457. No attempt should be made to modify the INSERT function without a thorough understanding of this material. The b-tree library consists of 3 program files: BTREE1.C BTREE0.C TERMCTRL.C. The file BTREE1.C contains the INSERT, DELETE_KEY, GET_NEXT, GET_PREVIOUS, and FIND_KEY functions. BTREE0.C contains support functions called by BTREE1.C. TERMCTRL.C contains all the routines that use functions that directly manipulate the terminal. Generally, these terminal control functions are not required for proper usage, however the routines will not work without them. They are included for 2 reasons. First, when the routines were written, I integrated them into the program. Removing them would be a pain. Second, once the appropriate changes are made to the CURSOR routine (addressing the cursor) and the CLEAR_END_OF_PAGE macro (in BTREE.H), an elementary screen driver is enabled for whatever terminal is used. To create a linking library, these files should be organized so that BTREE1 can reference routines in BTREE0 and TERMCTRL and those in BTREE0 can reference routines in TERMCTRL. Inserting keys into a balanced binary tree is a straight forward process. Deleting keys from such a tree and maintaining the balance is another thing entirely. In fact, deleting tree nodes may require several adjustments to maintain tree balance. In an attempt to avoid all the complexities introduced by performing true node deletion, I opted for a conceptually simpler approach. Instead of actually deleting nodes from the tree, I mark them as deleted without removing them from the tree. In this way, the tree's structure never changes except during insertion. A "deleted" node remains in the tree except it can't be "found." Like all shortcuts, this one has a cost. As more and more nodes are deleted, the search to find an "active" node takes longer since an increase number of nodes are being read but "ignored." However, searching a binary tree is quite efficient. Even if 10% of a tree was deleted, the search would not take much longer in real time. The obvious solution is to re-build the tree when the performance slows unacceptability due to an overabundance of deleted nodes. This set of routines uses three tricks. The first one is relying on someone else's algorithm to design the insertion function. The second one is to mark deleted nodes in place of actually removing them from the tree. The last one is the use of a stack to find the next and previous nodes from the current one. I call this a trick because how it works is not obvious. The problem is this: Given that we have searched for and found a specific node, how do we get to the key that is just below (previous) or just above (next) this one? Consider the balanced binary tree shown below that has 15 nodes - all active: 8 / \ / \ 4 12 / \ / \ 2 6 10 14 / \ / \ / \ / \ 1 3 5 7 9 11 13 15 Suppose node 12 is the current node and we want to find the next larger key. That key is the one at the end of the left branch of the node one to the right of the current node - in this case, node 13. The exact opposite is true of the key that is the next smaller or previous node. It is the one at the end of the right branch of the node one to the left of the current node - in this case node 11. This is the easy possibility. What happens if we go the the next one - making node 13 the current node - and now want to know what the next node is? The problem is that node 13 is the end of a tree branch and we can't go further "down" the tree. Instead, we must go "up" a node to 14. How do we keep treck of which node is up one from here? We do this by building a stack as we look through the tree. In reality, we use two identical stacks - a right one and a left one. In the right stack we "push" values each time we move down and to the right in the tree. We do the same on the left stack when we move down and to the left. As an example, to search for node 13, we begin at the head of the list (node 8) and move to the right (since 13 > 8). After moving right, we push 8 onto the right stack. At 12 we again move right (13 > 12) to node 14. We then push 12 onto the right stack. At 14 we move left (13 < 14). This time we push 14 onto the left stack since we moved left here. The search now terminates since we are at node 13. The stacks look like: Left Right 14 8 12