File 3 (B-tree)

This file contains the B-tree records and multiple occurrence tables for support of keyed attributes. The header is shown in table .

**Table:** File 3 header
$\begin{table}\begin{record} 1& KDBHDR& Z& a string containing \lq\lq ''RIM DATABASE'... ...he remainder of the first page is filled with zeros.\\ \end{record} \end{table}$

It contains some identification information and pointers for the other pages. All actual B-tree data begins in other records.

The records on file 3, with the exception of the first record, are either B-tree node records or multiple occurrence table records. Each attribute in each relation which is a `key' attribute has its own set of B-tree node records. All key attributes share space on the multiple occurrence table records.

The tree structure used in RIM is similar to other B-tree structures but has several unique features which make it ideal for use in RIM. For each key attribute, the value of "ATTKEY" in the attribute table of file 1 has the record number of the top or starting B-tree node record on file 3. Each node record has pointers which point to either other node records or to relation rows on file 2. Each node record is actually a two dimensional array with 3 rows and 42 columns. The length of records on file 3 must be a value which can be divided evenly by 3.

A B-tree node as an array with 3 rows and 42 columns has information for a maximum of 42 values i.e., the order of B-tree is 42. The first row has the actual key values which are always stored in increasing order. This value is one word in length and contains the standard machine-dependent binary representation for integer and real attributes, and only the first word for double precision and text attributes. The second row has pointers which can represent one of three kinds of values. If this value is negative, its absolute value is the number of the next record as one traverses down a B-tree. If the value in this second row is positive, its meaning depends on whether the value in row three is zero or not. If the value in row 3 is zero, then row 2 has a packed number with the record number and word offset for the file 2 page with the row which corresponds with this key value. The value for the key attribute in this row of file 2 page matches the key value found in row 1 of the B-tree node. If the value in row 3 is not zero, then the value in row 2 has a packed number (with record number and word offset) which points to the end of the multiple occurrence table for this key value and row 3 has a packed number which points to the start of the multiple occurrence table for this key value.

The multiple occurrence table records are made up of linked lists of pointer pairs. Each pair has two pointers, the first of which points to the next pointer pair, and the second of which points to data on file 2. The pointers are packed values with record numbers and word offsets. The end of the link list is marked by zero in the "next" pointer field.

By having pointers to the start and end of the multiple occurrence list means that adding another occurrence of a key value does not involve the overhead of traversing the entire linked list. It is just as easy to add the fifteenth occurrence as it is the fifth. A record size of 126 allows for 63 pointer pairs on each multiple occurrence record.

When scanning a B-tree node the values are always in increasing order. This allows us to scan a node without looking `two ways' at each value. The scan stop when we get the the special flag for the end of the list which is ZTEND. The node records are either nodes with all columns having node pointers or all columns having multiple occurrence and file 2 pointers. Nodes with multiple occurrence and file 2 pointers are sometimes called leaf nodes.