Recitation notes for 11/5/97 Plan: finish up any leftovers, lingering questions, etc., from this section of the course. E.g., hashing, balanced search trees, recurrences =========================================================================== * Do examples of inserting some things into a 2-3 tree =========================================================================== An example from an old final exam: fill in code to determine if a binary tree has the AVL property. Let's have class help us write it. This code will return the height of the tree if it IS an AVL tree and will return -1 if it ISN'T an AVL tree. (Defining the output this way will make the code simpler) (You could also imagine writing this as a member function instead) // We return the height if tree has the AVL property and -1 if tree doesn't int avl(Tree *t) { if (t == NULL) return 0; int hleft = avl(t->left); int hright = avl(t->right); if (hleft == -1 || hright == -1) return -1; if (hright < hleft-1 || hleft < hright-1) return -1; return 1 + max(hright, hleft); } What's the running time? O(n) [this was not on the final, but is a good question] Suppose we got rid of the local variables hleft and hright and we instead just made explicit calls to "avl" in the if-statements and in the return. Write a recurrence for the running time if the tree turns out to be a complete binary tree of N nodes: T(N) = 4*T(N/2) + 4*T(N/2) + const = 8*T(N/2) + const. solves to O(8^log_2(N)) = O(N^3) (Actually, for AVL trees, we store a field in each node that tells us about how it is balanced....) ======================================================================== Another interesting piece of code to write is printInRange. Given two values X and Y, this prints out all the values in the tree, in order, that are between X and Y inclusive. Let's do it as a member function. Let's say values in the tree are integers. void Tree::printInRange(int x, int y) { if (x > value) { if (right) right->printInRange(x, y); return; } if (y < value) { if (left) left->printInRange(x, y); return; } if (left) left->printInRange(x, y); cout << value << endl; if (right) right->printInRange(x, y); } if tree has depth log n, and we print out k numbers, then the total time is O(k + log n). ========================== Two more general topics to point out (but no need to go into in depth) B-trees are trees much like 2-3 trees. Each non-leaf node has > m/2 children and no more than m children (where m is a parameter of the tree). All of the leaves are at the same level. If m is 3, this is a 2-3 tree. (Seth mentioned this in passing, but I don't think anyone understood what 2-3 trees were yet.) B-trees are a common enough data structure that the students should at least have an idea of what they are. Search trees are very common in databases, as the indices to support queries. Many queries are of the form find all employess with salary between 20000 and 30000. Most industrial databases use their own proprietary refinements on a b-tree for indexing. One of the interesting problems in designing algorithms for databases is that you need to minimize the number of nodes that are accessed and particularly the number that change. Doing a lot of rearrangement with a few nodes is much cheaper than a little rearrangement across many nodes. B-trees with a relatively large m (20-100) accomplish this. goal. D ========================== Deletion from 2-3 trees I don't expect to actually go over this, but since there was a question in class, we can put it in the notes and mention it the class. Overview: Easiest case: the value being deleted is in a leaf node and that node has two keys in it. Just remove the key. < 60 70 > / | \ < 55 58 > < 65> < 75 80 > removing the 58 is easy Next case: deleting the last key from a leaf node if the left (or right) neighbor has two keys, shift its largest (smallest) key up and bring the left (right) key down to this node < 60 70 > / | \ < 55 > < 65> < 75 80 > removing the 65 goes to < 60 75 > / | \ < 55 > < 70 > < 80 > otherwise if there are three children of the immediate parent reduce to two children and drop one of the parent keys into the appropriate child node < 75 > / \ < 60 70 > < 80 > After removing the 55 Otherwise (there are two children and both have only one key) you need to do major surgery to reduce the height of the tree by one. The other sibling of the child with the key is combined with its parent, making that a single level. You must then shift values from the other children of the root up or over. < 40 > / \ < 20 > < 75 > / \ / \ <10> <30> < 60 70 > < 80 > removing 30 gives < 40 75 > / | \ <10 20> <60 70> <80> If you remove a key from a non leaf node, find the next highest (or lowest) value in the tree, remove it from its node (it will be in a leaf) and replace the key you are removing with the next highest just removed. removing 40 gives < 60 75 > / | \ <10 20> <70> <80>