CS62 - Spring 2010 - Lecture 14

  • how's the course going so far?

  • midterm
       - average 53
       - high 68.5

  • look at new constructor in BinaryTree for empty tree
       - I've also changed the constructor for BinaryTree(E data)
       - why do we have empty trees?
          - put them in any place where we would normally have a null left or right child
          - what does this buy us?
             - look at height method in BinaryTree code
                - why return -1?
                 - height is 1 + the max of the children's heights; returning -1 for empty trees means the "empty" nodes aren't counted, so a leaf has height 0
                - by using an empty node, we can assume that all of the children of a non-empty node are non-null
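A minimal sketch of the empty-node idea (the class and method names here are illustrative, not the actual course BinaryTree code):

```java
class Tree {
    Tree left, right;
    boolean empty;

    Tree() { empty = true; }                      // the "empty" tree
    Tree(Tree l, Tree r) { left = l; right = r; } // a real node

    boolean isEmpty() { return empty; }

    // Empty trees have height -1, so a leaf node has height
    // 1 + max(-1, -1) = 0: the empty children are never counted.
    int height() {
        if (isEmpty()) return -1;
        return 1 + Math.max(left.height(), right.height());
    }
}
```

Because every real node has two non-null children (possibly empty trees), height() never has to null-check before recursing.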

  • searching in binary trees
       - look at search method in BinaryTree code
          - running time?
             - O(n)
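In an unordered binary tree, search may have to visit every node. A hedged sketch of why it's O(n) (BT and its fields are illustrative names, not the course code):

```java
class BT<E> {
    E data;
    BT<E> left, right;

    BT() { }                                                 // empty tree
    BT(E d, BT<E> l, BT<E> r) { data = d; left = l; right = r; }

    boolean isEmpty() { return data == null; }

    boolean search(E item) {
        if (isEmpty()) return false;
        if (data.equals(item)) return true;
        // no ordering to exploit, so we may have to try both
        // subtrees: O(n) in the worst case
        return left.search(item) || right.search(item);
    }
}
```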

  • binary search trees
       - we saw that search is O(n) time to find something in a tree. Can we do better?
       - binary search trees speed up the searching process by keeping the tree in a structured order. In particular:
          - left.data() < this.data() <= right.data()
       - what are the implications of this?
          - all items to the left are less than the value at the current node
          - all items to the right are greater than or equal to the value of the current node
       - how does this help us when it comes to searching?
          - at each node, we can compare the value we're looking for to the value at this node and we know which branch it's down!
          
           public boolean search(E item){
              if( isEmpty() ){
                 return false;
              } else if( data.equals(item) ){
                 return true;
              } else if( item.compareTo(data) < 0 ){
                 // smaller items can only be in the left subtree
                 return left.search(item);
              } else {
                 // larger (or equal) items can only be in the right subtree
                 return right.search(item);
              }
           }
       
       - notice that now rather than recursing on both the left and the right in our search method, we recurse on one or the other
       - has a similar feel to binary search
       - what is the running time?
          - O(h)
          - when is this a good running time and when is this a bad running time?
             - when the tree is full (or near full) then O(h) = O(log n)
             - when the tree is a twig (or near a twig) then O(h) = O(n) so we haven't gained anything
       - some more methods
          - How can we find the minimum value in the tree?
             - left-most value in the tree
             - running time? O(h)
          - Max?
             - right-most value in the tree
             - running time? O(h)
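Both of these just walk one spine of the tree, hence O(h). A sketch with illustrative names (children are null when absent, rather than empty nodes):

```java
class BSTNode {
    int data;
    BSTNode left, right;

    BSTNode(int d) { data = d; }

    // min: keep following left children -- the left-most value
    int min() { return left == null ? data : left.min(); }

    // max: keep following right children -- the right-most value
    int max() { return right == null ? data : right.max(); }
}
```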
          - traversal: what kind of tree traversal would make sense?
             - in-order
                - visit nodes to the left first
                - then visit this node
                - finally, visit nodes to the right
             - in-order traversal will print them in sorted order
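A sketch of in-order traversal collecting values into a list (names are illustrative; null stands in for an empty child):

```java
import java.util.ArrayList;
import java.util.List;

class InOrder {
    int data;
    InOrder left, right;

    InOrder(int d) { data = d; }

    void inOrder(List<Integer> out) {
        if (left != null) left.inOrder(out);    // everything smaller first
        out.add(data);                          // then this node
        if (right != null) right.inOrder(out);  // then everything larger
    }
}
```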
          - successor and predecessor
             - sometimes we may want to know where the predecessor or successor is in the tree, that is, the previous or next in the sequence of data
             - the simple case:
                - predecessor is the right-most node of the left subtree, i.e. the largest node of all of the elements that are less than a node
                - successor is the left-most node of the right sub-tree, i.e. the smallest node of all of the elements that are larger than a node
             - complications: what if a node doesn't have a left or right subtree?
                 - it could be the max or the min, in which case, it might not have a successor or predecessor
                - successor: what if there is no right subtree?
                   - let x be a node with no right subtree
                   - let y = successor(x)
                   - we know that predecessor(y) = x
                    - to find the predecessor of y, we'd look at the right-most node of y's left subtree, so x must be the right-most node of y's left subtree
                    - so, to find the successor of x
                       - keep going up as long as we're a right child
                       - the first time we're a left child, the parent is the successor of x
                - predecessor is similar
              - what are the running times of predecessor and successor? O(h)
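The two successor cases can be sketched with parent pointers (an assumption: the course BinaryTree may not keep a parent reference; all names here are illustrative, with null for absent children):

```java
class SNode {
    int data;
    SNode left, right, parent;

    SNode(int d) { data = d; }

    SNode minNode() {                    // left-most node of this subtree
        SNode n = this;
        while (n.left != null) n = n.left;
        return n;
    }

    SNode successor() {
        // the simple case: left-most node of the right subtree
        if (right != null) return right.minNode();
        // no right subtree: climb while we are a right child; the first
        // ancestor reached from its *left* child is the successor
        SNode x = this, p = parent;
        while (p != null && x == p.right) { x = p; p = p.parent; }
        return p;                        // null means x was the maximum
    }
}
```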
        - inserting into a binary search tree
           - assuming no duplicates, how can we insert into a binary search tree?
             - always add to a leaf node
             - traverse down to some leaf node similar to search
             - add a node to the left or right depending on whether it is larger than or smaller than the leaf
          - running time? O(h)
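The insert-at-a-leaf idea can be sketched as follows (illustrative names, null for absent children, and assuming no duplicates as above):

```java
class Ins {
    int data;
    Ins left, right;

    Ins(int d) { data = d; }

    // walk down as in search; a null child is the leaf position
    // where the new node belongs. Returns the subtree's root. O(h).
    static Ins insert(Ins t, int value) {
        if (t == null) return new Ins(value);
        if (value < t.data) t.left = insert(t.left, value);
        else                t.right = insert(t.right, value);
        return t;
    }
}
```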
        - deleting from a binary search tree
          - let's say you're given a node (i.e. a BinaryTree) and you want to delete it
          - 3 cases:
              - it is a leaf: just delete it and set that child of the parent to an empty node
             - it only has one child: splice it out
              - it has two children: replace its value with its successor's value in the tree!
                 - we know the successor has no left child (it's the left-most node of the right subtree), so it's easy to remove using one of the first two cases
                 - we know the successor is larger than anything in the left subtree (it lives in the right subtree, so it's larger than the deleted value)
                 - we know the successor is smaller than anything else in the right subtree (it's the smallest element of the right subtree)
          - running time? O(h)
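All three cases fit in one sketch. Here null children stand in for the course's empty nodes, and Del/delete are illustrative names:

```java
class Del {
    int data;
    Del left, right;

    Del(int d) { data = d; }

    static Del delete(Del t, int value) {
        if (t == null) return null;                       // not found
        if (value < t.data)      t.left = delete(t.left, value);
        else if (value > t.data) t.right = delete(t.right, value);
        else {
            // case 1 (leaf) and case 2 (one child): splice it out
            if (t.left == null)  return t.right;
            if (t.right == null) return t.left;
            // case 3 (two children): copy in the successor's value,
            // then delete the successor from the right subtree --
            // easy, because the successor has no left child
            Del s = t.right;
            while (s.left != null) s = s.left;
            t.data = s.data;
            t.right = delete(t.right, s.data);
        }
        return t;
    }
}
```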
       - we're starting to see a recurring trend that most algorithms are bounded by the height of the binary search tree
           - what is the worst-case height?
             - O(n) the twig
             - when does this happen?
                - insert elements in sorted or reverse sorted order
          - what is the best case height?
             - O(log_2 n)
             - when it's a full tree
           - Randomized BST: the expected height of a randomly built binary search tree is O(log n), i.e. a tree built by inserting the values in a random order
              - this is only useful if we know beforehand all of the data we'll be inserting
       - does this give you an idea for a sorting algorithm?
          - randomly insert the data into a binary search tree
          - in-order traversal of the tree
          - running time
             - best-case: O(n log n)
              - worst-case: O(n^2) - we could still get unlucky
             - average-case: O(n log n)
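A sketch of the sorting idea (tree sort): shuffle to make a twig unlikely, insert everything into a BST, and read it back with an in-order traversal. All names here are illustrative:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class TreeSort {
    int data;
    TreeSort left, right;

    TreeSort(int d) { data = d; }

    static TreeSort insert(TreeSort t, int v) {
        if (t == null) return new TreeSort(v);
        if (v < t.data) t.left = insert(t.left, v);
        else            t.right = insert(t.right, v);   // duplicates go right
        return t;
    }

    static void inOrder(TreeSort t, List<Integer> out) {
        if (t == null) return;
        inOrder(t.left, out);
        out.add(t.data);
        inOrder(t.right, out);
    }

    static List<Integer> sort(List<Integer> items) {
        List<Integer> shuffled = new ArrayList<>(items);
        Collections.shuffle(shuffled);     // random order: expected O(log n) height
        TreeSort root = null;
        for (int v : shuffled) root = insert(root, v);
        List<Integer> out = new ArrayList<>();
        inOrder(root, out);
        return out;
    }
}
```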

  • balanced trees
        - even randomized trees only give us an expected O(log n) height, not a guaranteed one
       - however, there are approaches that can guarantee this by making sure the tree doesn't become too "unbalanced"
          - AVL trees
           - red-black trees
          - B-trees (used in databases and for "on-disk" trees)

  • red-black trees
       - a binary search tree with additional constraints
          - a binary search tree
          - each node is also labeled with a color, red or black
          - all empty nodes are black (i.e. children of leaves)
          - the root is always black (this isn't technically required, but doesn't hurt us and makes our life easier)
          - all red nodes have two black children
          - for a given node, the number of black nodes on any path from that node to any leaf is the same
       - how does this guarantee us our height?
          - what is the shortest possible path from the root to any leaf?
             - all black nodes
          - what is the longest possible path from the root to any leaf?
             - alternating red and black nodes (since a red node has to have two black children)
          - what is the biggest difference between the longest and shortest path?
             - since all paths must have the same number of black nodes, the longest path can be at most twice as long
           - the tree can be no more than a factor of 2 imbalanced, which still guarantees O(log n) height, since 2 is just a constant multiplier
       - insertion into a red-black tree
          - we insert as normal into the binary tree at a leaf
          - we color the node inserted as red
          - then we need to fix up the tree to make it maintain the constraints
             - like delete for normal BSTs, there are a number of cases, with some more complicated than others
             - beyond the scope of this class, but they utilize "rotations" of the tree to alter the structure
          - rotations:
             - basic idea is to rotate the child up into the parent position and then give the child on the side of the rotation to the old parent
             - left-rotation
                - x with left subtree alpha and right subtree y with left subtree beta and right subtree gamma
                - becomes: y with right subtree gamma and left subtree x with left subtree alpha and right subtree beta
             - right rotation is in the opposite direction
             - how might this help us?
                - insert: 1, 2, 3 into the tree
                   - inserting 1 and 2 is fine
                   - after inserting 3, we have a twig
                   - if we rotate left, it looks more like a balanced tree
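The left rotation described above (x with subtrees alpha and y; y with subtrees beta and gamma) can be sketched as follows, with illustrative names:

```java
class Rot {
    int data;
    Rot left, right;

    Rot(int d) { data = d; }

    // left rotation: x's right child y moves up into x's position,
    // y's old left subtree (beta) becomes x's new right subtree
    static Rot rotateLeft(Rot x) {
        Rot y = x.right;
        x.right = y.left;   // beta changes parents
        y.left = x;         // x drops down to be y's left child
        return y;           // y is the new root of this subtree
    }
}
```

On the 1, 2, 3 example: the twig 1 -> 2 -> 3 (all right children) rotates into 2 with children 1 and 3, a full tree.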
       - look at demo: http://www.ece.uc.edu/~franco/C321/html/RedBlack/rb.orig.html

  • n-ary trees