**Binary Trees**
*Oct 20*
# Administrivia
- On-disk sort due today
- Midterm 2 review next Tuesday
- Midterm 2 next Wednesday (same format)
# The Story So Far
We first discussed basic data structures:
- Arrays
- Resizing arrays (ArrayLists)
- LinkedLists
- Queues
- Stacks
We looked at the running time and memory usage for each data structure.
We then discussed sorting:
- Selection sort
- Insertion sort
- Mergesort
- Quicksort
We looked at the number of comparisons and swaps, stability, and memory usage (in-place or not) for each algorithm.
Along the way we also discussed comparisons (the `Comparable` and `Comparator` interfaces) and iteration (the `Iterable` interface).
# Binary Trees
Store elements hierarchically rather than linearly. For example, consider your computer's file system--all files are stored in directories and sub-directories, with your user directory being at the root of your user files.
A **tree** is a set of nodes that store elements based on a parent-child relationship. You can think of it as a linked list where each node has 0 or more `next` nodes.
Some helpful definitions:
Root
: node at the base (top) of the tree (similar to the head node of a linked list)
Parent
: of a specific node is the node directly above it
Child
: of a specific node is a node directly below it
Leaf
: node without any children
Height
: of a tree is the length of the longest path from the root to a leaf
Level
: the root is at level 0; every other node is at level $L_{parent} + 1$ (also called depth)
Full
: a tree where every node has 0 or the maximum number of possible children
k-Ary
: a tree in which each node has a maximum of k children
Complete
: a k-ary tree in which all leaves have the same depth and all internal nodes have degree k.
A **binary tree** is a tree where every node can have 0, 1, or **2** children.
![Binary Trees](images/2020-10-20-BinaryTrees.jpg)
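The height and leaf definitions above translate almost directly into recursive code. The following is a minimal sketch, assuming integer items and a hypothetical `TreeHeightSketch` class (none of these names come from the course code). It counts height in edges, so an empty tree is given height -1 by convention.

```java
// Hypothetical demo class: names are illustrative, not taken from the course repository.
class TreeHeightSketch {
    static class Node {
        int item;
        Node left, right;

        Node(int item, Node left, Node right) {
            this.item = item;
            this.left = left;
            this.right = right;
        }
    }

    // Height = number of edges on the longest root-to-leaf path.
    // Convention assumed here: an empty tree has height -1, a single leaf has height 0.
    static int height(Node node) {
        if (node == null) return -1;
        return 1 + Math.max(height(node.left), height(node.right));
    }

    // A leaf is a node without any children.
    static boolean isLeaf(Node node) {
        return node != null && node.left == null && node.right == null;
    }

    static int countLeaves(Node node) {
        if (node == null) return 0;
        if (isLeaf(node)) return 1;
        return countLeaves(node.left) + countLeaves(node.right);
    }

    public static void main(String[] args) {
        // Root 1 with children 2 and 3; node 2 has a single left child 4.
        Node tree = new Node(1,
                new Node(2, new Node(4, null, null), null),
                new Node(3, null, null));
        System.out.println(height(tree));      // 2 (path 1 -> 2 -> 4)
        System.out.println(countLeaves(tree)); // 2 (nodes 4 and 3)
    }
}
```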
# Binary Tree Code
The code below is also found [here on GitHub](https://github.com/pomonacs622020fa/LectureCode/blob/master/BinaryTrees/BST.java).
~~~java linenumbers
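public class BinaryTree<Item> {
    // A node holds one item and links to its (possibly null) left and right children.
    private class Node {
        private Item item;
        private Node left;
        private Node right;

        public Node(Item item, Node left, Node right) {
            this.item = item;
            this.left = left;
            this.right = right;
        }

        // Convenience constructor for a node without children (a leaf).
        public Node(Item item) {
            this(item, null, null);
        }
    }

    // Root of the tree; null when the tree is empty.
    private Node root;
}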
~~~
# Tree Traversals
- Pre-order
- In-order
- Post-order
![Binary Tree Traversals](images/2020-10-20-Traversals.jpg)
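The three traversals differ only in when the current node is visited relative to its subtrees: before both (pre-order), between them (in-order), or after both (post-order). Here is a minimal sketch, assuming integer items, a hypothetical `TraversalSketch` class, and a small five-node example tree that is not necessarily the one in the figure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: names are illustrative, not taken from the course repository.
class TraversalSketch {
    static class Node {
        int item;
        Node left, right;

        Node(int item, Node left, Node right) {
            this.item = item;
            this.left = left;
            this.right = right;
        }
    }

    // Pre-order: node, then left subtree, then right subtree.
    static void preOrder(Node node, List<Integer> out) {
        if (node == null) return;
        out.add(node.item);
        preOrder(node.left, out);
        preOrder(node.right, out);
    }

    // In-order: left subtree, then node, then right subtree.
    static void inOrder(Node node, List<Integer> out) {
        if (node == null) return;
        inOrder(node.left, out);
        out.add(node.item);
        inOrder(node.right, out);
    }

    // Post-order: left subtree, then right subtree, then node.
    static void postOrder(Node node, List<Integer> out) {
        if (node == null) return;
        postOrder(node.left, out);
        postOrder(node.right, out);
        out.add(node.item);
    }

    // Example tree (an assumption): root 2, left child 1, right child 4 with children 3 and 5.
    static Node exampleTree() {
        return new Node(2,
                new Node(1, null, null),
                new Node(4, new Node(3, null, null), new Node(5, null, null)));
    }

    public static void main(String[] args) {
        List<Integer> pre = new ArrayList<>();
        List<Integer> in = new ArrayList<>();
        List<Integer> post = new ArrayList<>();
        preOrder(exampleTree(), pre);
        inOrder(exampleTree(), in);
        postOrder(exampleTree(), post);
        System.out.println(pre);  // [2, 1, 4, 3, 5]
        System.out.println(in);   // [1, 2, 3, 4, 5]
        System.out.println(post); // [1, 3, 5, 4, 2]
    }
}
```

Note that the in-order traversal of this particular tree yields the items in sorted order, because the example tree happens to be arranged as a binary search tree.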
# Binary Search Trees (BSTs)
[Code for BST](https://github.com/pomonacs622020fa/LectureCode/blob/master/BinaryTrees/BST.java).
A binary tree where items appear in a particular "order".
Binary search tree property:
1. Each node has a `Comparable` **key** and a **value** with no restrictions
2. All keys smaller than a node's key appear in its left subtree
3. All keys larger than (or equal to) a node's key appear in its right subtree
![Binary Search Trees](images/2020-10-20-BinarySearchTrees.jpg)
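To make the BST property concrete, here is a minimal sketch of `get` and `put`. It is not a copy of the linked `BST.java`: the class name `BstSketch` is hypothetical, and the choice to overwrite the value when a key is already present is an assumption of this sketch.

```java
// Minimal sketch of a BST-backed symbol table.
// Names and the overwrite-on-equal-key behavior are assumptions, not the course's BST.java.
class BstSketch<Key extends Comparable<Key>, Value> {
    private class Node {
        Key key;
        Value value;
        Node left, right;

        Node(Key key, Value value) {
            this.key = key;
            this.value = value;
        }
    }

    private Node root;

    // Returns the value stored under key, or null if the key is absent.
    public Value get(Key key) {
        Node current = root;
        while (current != null) {
            int cmp = key.compareTo(current.key);
            if (cmp < 0) current = current.left;        // smaller keys live in the left subtree
            else if (cmp > 0) current = current.right;  // larger keys live in the right subtree
            else return current.value;                  // found the key
        }
        return null;
    }

    // Inserts the key, or overwrites its value if the key is already present.
    public void put(Key key, Value value) {
        root = put(root, key, value);
    }

    private Node put(Node node, Key key, Value value) {
        if (node == null) return new Node(key, value);  // empty spot: attach a new leaf here
        int cmp = key.compareTo(node.key);
        if (cmp < 0) node.left = put(node.left, key, value);
        else if (cmp > 0) node.right = put(node.right, key, value);
        else node.value = value;
        return node;
    }

    public static void main(String[] args) {
        BstSketch<Integer, String> bst = new BstSketch<>();
        bst.put(2, "two");
        bst.put(1, "one");
        bst.put(4, "four");
        System.out.println(bst.get(4)); // four
        System.out.println(bst.get(3)); // null
    }
}
```

Both methods follow a single path from the root downward, so their cost is proportional to the height of the tree rather than to the total number of nodes.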
Take a careful look at the BST `put` and `delete` methods. What is the running time? To help, consider the case when you insert these nodes in order: `1, 2, 3, 4, 5` (assume that the keys and values are the same).
**********************
*   .-.
*  | 1 |
*   +-+
*      \
*      .+.
*     | 2 |
*      +-+
*         \
*         .+.
*        | 3 |
*         +-+
*            \
*            .+.
*           | 4 |
*            +-+
*               \
*               .+.
*              | 5 |
*               '-'
**********************
The number of nodes in the tree is typically denoted with $n$.
What is the running time of searching for the `5`?
**Answer**: $O(n)$
*For an unbalanced tree--a tree where the left and right subtrees have drastically different heights--the running time of nearly all operations becomes **linear***.
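The linear behavior can be checked by counting how many nodes a search touches after inserting keys in sorted order. This is a minimal sketch, assuming integer keys and a hypothetical `DegenerateBstSketch` class with illustrative helper names (`insert`, `nodesVisited`, `buildSorted`).

```java
// Hypothetical demo class: names are illustrative, not taken from the course repository.
class DegenerateBstSketch {
    static class Node {
        int key;
        Node left, right;

        Node(int key) {
            this.key = key;
        }
    }

    // Plain BST insert without any rebalancing; larger-or-equal keys go right.
    static Node insert(Node node, int key) {
        if (node == null) return new Node(key);
        if (key < node.key) node.left = insert(node.left, key);
        else node.right = insert(node.right, key);
        return node;
    }

    // Counts the nodes touched while searching for key (including the node where it is found).
    static int nodesVisited(Node root, int key) {
        int visited = 0;
        Node current = root;
        while (current != null) {
            visited++;
            if (key == current.key) break;
            current = key < current.key ? current.left : current.right;
        }
        return visited;
    }

    // Inserts 1, 2, ..., n in increasing order, which produces a right-leaning chain.
    static Node buildSorted(int n) {
        Node root = null;
        for (int key = 1; key <= n; key++) {
            root = insert(root, key);
        }
        return root;
    }

    public static void main(String[] args) {
        Node root = buildSorted(5);
        System.out.println(nodesVisited(root, 5)); // 5: every node is on the search path
    }
}
```

With sorted input, each new key becomes the right child of the previous one, so searching for the largest key visits all $n$ nodes.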
Now consider this example:
*****************
*      .-.
*     | 2 |
*      +-+
*     /   \
*   .+.   .+.
*  | 1 | | 4 |
*   '-'   +-+
*        /   \
*      .+.   .+.
*     | 3 | | 5 |
*      '-'   '-'
*****************
What is the running time of finding the `5`? We no longer need to look at all nodes. We
1. Start at the root (`2`) and go right
2. Go right from the `4`
3. Find the `5`
We are eliminating roughly **half** of the remaining tree at each step. Thus the running time for searching a balanced tree is $O(\log_2(n))$.
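As a rough check on the halving argument, assume each comparison discards about half of the remaining nodes (this holds only for a well-balanced tree). After $k$ comparisons about $n / 2^k$ candidate nodes remain, and the search ends once at most one remains:

$$ \frac{n}{2^k} \le 1 \quad\Longleftrightarrow\quad k \ge \log_2 n $$

More precisely, a binary tree of minimum height with $n$ nodes has height $\lfloor \log_2 n \rfloor$, so a search visits at most $\lfloor \log_2 n \rfloor + 1$ nodes. For the five-node tree above, $\lfloor \log_2 5 \rfloor + 1 = 3$, which matches the three nodes visited (`2`, `4`, `5`).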
# BST Code Walk-Through
![Walk-Through](images/2020-10-20-WalkThrough.jpg)
# Balanced Trees
A number of methods exist for keeping the tree *balanced*. Balanced means that the height of the tree is $O(\log_2(n))$, and thus most operations take $O(\log_2(n))$ time.
**You are not required to know how balancing works, just that it is possible and why it is useful.**
- [Red-black trees](https://en.wikipedia.org/wiki/Red%E2%80%93black_tree)
- [AVL trees](https://en.wikipedia.org/wiki/AVL_tree)
- [2-3 trees](https://en.wikipedia.org/wiki/2%E2%80%933_tree) and [slides](lecture24-balanced_search_trees.pdf)
- [B-trees](https://en.wikipedia.org/wiki/B-tree)
- [Splay trees](https://en.wikipedia.org/wiki/Splay_tree)
- [etc.](https://en.wikipedia.org/wiki/Self-balancing_binary_search_tree)
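In practice you rarely implement these structures yourself. Java's standard `java.util.TreeMap`, for example, is documented as a red-black tree implementation with guaranteed $O(\log n)$ time for `get`, `put`, `containsKey`, and `remove`. The sketch below (the class name `TreeMapSketch` and helper `buildSortedMap` are hypothetical) inserts keys in sorted order, the same input that degenerates an unbalanced BST into a chain.

```java
import java.util.TreeMap;

// Hypothetical demo class: only java.util.TreeMap itself is a real library class.
class TreeMapSketch {
    // Inserts keys 1..n in increasing order, mapping each key to its square.
    static TreeMap<Integer, Integer> buildSortedMap(int n) {
        TreeMap<Integer, Integer> map = new TreeMap<>();
        for (int key = 1; key <= n; key++) {
            map.put(key, key * key);
        }
        return map;
    }

    public static void main(String[] args) {
        TreeMap<Integer, Integer> map = buildSortedMap(5);
        System.out.println(map.get(5));      // 25
        System.out.println(map.firstKey());  // 1
        System.out.println(map.lastKey());   // 5
        System.out.println(map.keySet());    // [1, 2, 3, 4, 5] (keys iterate in sorted order)
    }
}
```

The balancing happens inside `put`, so the caller never sees it. The only visible effect is that lookups stay fast regardless of insertion order.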