The load factor of a table is defined as a = (number of elements in the table) / (size of the table).
a = 1 means the table is full; a = 0 means it is empty.
Larger values of a lead to more collisions.
(Note that with external chaining, it is possible to have a > 1).
The following table summarizes the performance of our collision resolution techniques in searching for an element. The value in each slot represents the average number of compares necessary for the search. The first column represents the number of compares if the search is ultimately unsuccessful, while the second represents the case when the item is found:
Strategy | Unsuccessful | Successful |
---|---|---|
Linear rehashing | $\frac{1}{2}\left(1 + \frac{1}{(1-a)^2}\right)$ | $\frac{1}{2}\left(1 + \frac{1}{1-a}\right)$ |
Double hashing | $\frac{1}{1-a}$ | $-\frac{1}{a}\ln(1-a)$ |
External chaining | $a + e^{-a}$ | $1 + \frac{1}{2}a$ |
The cost of linear rehashing grows very rapidly as a approaches 1. Double hashing is similar, but not so bad, whereas the cost of external chaining grows only slowly (roughly linearly in a).
In particular, if a = .9, we get
Strategy | Unsuccessful | Successful |
---|---|---|
Linear rehashing | 50.5 | 5.5 |
Double hashing | 10 | ≈ 2.56 |
External chaining | ≈ 1.31 | 1.45 |
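The formulas are easy to evaluate directly, which helps when comparing strategies at other load factors. The following is a minimal sketch; the class name `SearchCost` and its method names are made up for illustration, and it assumes that the logarithm in the double hashing formula is the natural log and that the exponential term for external chaining is e^(-a).

```java
/** Expected number of compares per search, as a function of the load factor a. */
final class SearchCost {
    // Linear rehashing; both formulas require 0 <= a < 1.
    static double linearUnsuccessful(double a) { return 0.5 * (1 + 1 / ((1 - a) * (1 - a))); }
    static double linearSuccessful(double a)   { return 0.5 * (1 + 1 / (1 - a)); }

    // Double hashing; the successful case requires 0 < a < 1.
    static double doubleUnsuccessful(double a) { return 1 / (1 - a); }
    static double doubleSuccessful(double a)   { return -Math.log(1 - a) / a; } // Math.log is ln

    // External chaining; a may exceed 1 here.
    static double chainUnsuccessful(double a)  { return a + Math.exp(-a); }
    static double chainSuccessful(double a)    { return 1 + a / 2; }

    public static void main(String[] args) {
        // One line per load factor: unsuccessful/successful cost for each strategy.
        for (double a : new double[] {0.5, 0.75, 0.9}) {
            System.out.printf("a=%.2f linear=%.2f/%.2f double=%.2f/%.2f chaining=%.2f/%.2f%n",
                a, linearUnsuccessful(a), linearSuccessful(a),
                doubleUnsuccessful(a), doubleSuccessful(a),
                chainUnsuccessful(a), chainSuccessful(a));
        }
    }
}
```

Running `main` prints one line per load factor; the unsuccessful cost for linear rehashing is the one that grows fastest as a approaches 1.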
The space requirements (in words of memory) are roughly the same for both techniques.
General rule of thumb: with small elements and a small load factor, use open addressing.
With large elements, external chaining gives good performance at a small cost in space.

    public interface Dictionary<K,V> {
        /** Returns the number of entries in the dictionary. */
        public int size();

        /** Returns whether the dictionary is empty. */
        public boolean isEmpty();

        /** Returns an entry containing the given key, or null if
         * no such entry exists. */
        public Entry<K,V> find(K key) throws InvalidKeyException;

        /** Returns an iterator containing all the entries containing the
         * given key, or an empty iterator if no such entries exist. */
        public Iterable<Entry<K,V>> findAll(K key) throws InvalidKeyException;

        /** Inserts an item into the dictionary. Returns the newly created
         * entry. */
        public Entry<K,V> insert(K key, V value) throws InvalidKeyException;

        /** Removes and returns the given entry from the dictionary. */
        public Entry<K,V> remove(Entry<K,V> e) throws InvalidEntryException;

        /** Returns an iterator containing all the entries in the dictionary. */
        public Iterable<Entry<K,V>> entries();
    }

Hash tables make all of these operations simple except entries(), since you need to search through the whole table to find the non-null entries. That makes entries() O(N), where N is the size of the table. All the others should be O(1) on average, but can be as bad as O(n), where n is the number of entries, if you have a bad hash code.
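To make the cost of entries() concrete, here is a minimal sketch of a table that uses external chaining. It does not implement the Dictionary interface itself, since the Entry and exception types are not shown here; the class name `ChainedTable` is hypothetical, and `java.util.Map.Entry` stands in for Entry as an assumption.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

/** Hash table with external chaining: each slot is null or a list of entries. */
class ChainedTable<K, V> {
    private final List<LinkedList<Map.Entry<K, V>>> buckets;
    private int size = 0;

    ChainedTable(int capacity) {
        buckets = new ArrayList<>(capacity);
        for (int i = 0; i < capacity; i++) buckets.add(null); // every slot starts empty
    }

    // Maps a key to a slot; floorMod keeps the index non-negative.
    private int index(K key) {
        return Math.floorMod(key.hashCode(), buckets.size());
    }

    // Appends to the chain in the key's slot (duplicate keys are allowed).
    Map.Entry<K, V> insert(K key, V value) {
        int i = index(key);
        if (buckets.get(i) == null) buckets.set(i, new LinkedList<>());
        Map.Entry<K, V> e = new AbstractMap.SimpleEntry<>(key, value);
        buckets.get(i).add(e);
        size++;
        return e;
    }

    // Looks only at one chain, so the cost depends on the chain length.
    Map.Entry<K, V> find(K key) {
        LinkedList<Map.Entry<K, V>> chain = buckets.get(index(key));
        if (chain == null) return null;
        for (Map.Entry<K, V> e : chain) {
            if (e.getKey().equals(key)) return e;
        }
        return null;
    }

    // Must visit every slot, including the empty ones.
    List<Map.Entry<K, V>> entries() {
        List<Map.Entry<K, V>> all = new ArrayList<>();
        for (LinkedList<Map.Entry<K, V>> chain : buckets) {
            if (chain != null) all.addAll(chain);
        }
        return all;
    }

    int size() { return size; }

    double loadFactor() { return (double) size / buckets.size(); }
}
```

Inserting six entries into a table created with `new ChainedTable<String, Integer>(4)` gives a load factor of 1.5, which illustrates how external chaining allows a > 1.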
Definition: A binary tree is a binary search tree iff it is empty or if the value of every node is both greater than or equal to every value in its left subtree and less than or equal to every value in its right subtree.
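The definition translates into a recursive check. The sketch below is an illustration only: `BstCheck` and its `Node` class are hypothetical names, values are assumed to be ints, and an empty tree is represented here simply by a null reference. Each recursive call carries the inclusive range that all values in the subtree must fall in.

```java
/** Checks the binary search tree property on a plain int tree. */
final class BstCheck {
    static final class Node {
        final int value;
        final Node left, right;
        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isBst(Node t) {
        return isBst(t, Long.MIN_VALUE, Long.MAX_VALUE);
    }

    // True iff every value in t lies in [lo, hi] and t satisfies the definition.
    private static boolean isBst(Node t, long lo, long hi) {
        if (t == null) return true;                       // the empty tree qualifies
        if (t.value < lo || t.value > hi) return false;
        // Left values must be <= t.value, right values must be >= t.value.
        return isBst(t.left, lo, t.value) && isBst(t.right, t.value, hi);
    }
}
```

Because both comparisons are inclusive, a tree with equal values on either side of a node still counts as a binary search tree, matching the definition.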
Interestingly, it simplifies a number of algorithms if we represent binary search trees so that all external nodes hold null. Thus an empty binary search tree has a root that holds null.
Searching a binary search tree is straightforward. See the code in BinarySearchTree, in particular the protected method treeSearch, which is used to find, insert, and remove elements of the binary search tree. Notice that find and insert have complexity proportional to the height of the tree, which may be as good as log n or as bad as n, where n is the number of elements in the tree.
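The idea behind treeSearch can be sketched as follows. This is not the code from BinarySearchTree; `IntBst` and its `Node` class are hypothetical, keys are assumed to be distinct ints, and an external node is marked by a null key. The search walks one path down from the root, which is why the cost is proportional to the height.

```java
/** Binary search tree over ints in which every external node holds null. */
final class IntBst {
    static final class Node {
        Integer key;               // null marks an external node
        Node left, right, parent;
        Node(Node parent) { this.parent = parent; }
        boolean isExternal() { return key == null; }
    }

    Node root = new Node(null);    // empty tree: a single external node

    // Returns the internal node holding key, or the external node where the search ends.
    Node treeSearch(int key, Node v) {
        while (!v.isExternal() && key != v.key) {
            v = (key < v.key) ? v.left : v.right;
        }
        return v;
    }

    boolean find(int key) {
        return !treeSearch(key, root).isExternal();
    }

    // Turns the external node found by the search into an internal node.
    void insert(int key) {
        Node v = treeSearch(key, root);
        if (!v.isExternal()) return;   // key already present; this sketch keeps keys distinct
        v.key = key;
        v.left = new Node(v);
        v.right = new Node(v);
    }
}
```

Since an unsuccessful search always ends at an external node, insert needs no special case for the empty tree: the root itself is the external node that gets expanded.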
Removing an element from a tree is a bit tricky. The removal algorithm depends on the algorithm removeExternal(v), where v is an external node (with a null value). That algorithm removes v and its parent and replaces the parent by v's sibling (which might also hold null).
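A minimal sketch of removeExternal, under the same assumption that external nodes are marked by a null key, might look like this. The class `ExtTree` and the helpers `external` and `internal` are hypothetical names used only to build small trees; the sketch also assumes v is not the root, so that it has a parent.

```java
/** Tree with explicit external nodes, showing only removeExternal. */
final class ExtTree {
    static final class Node {
        Integer key;               // null marks an external node
        Node left, right, parent;
    }

    Node root;

    static Node external() { return new Node(); }

    static Node internal(int key, Node left, Node right) {
        Node n = new Node();
        n.key = key;
        n.left = left;
        n.right = right;
        left.parent = n;
        right.parent = n;
        return n;
    }

    // Removes external node v and its parent; v's sibling takes the parent's place.
    void removeExternal(Node v) {
        Node p = v.parent;                           // assumed non-null: v is not the root
        Node s = (p.left == v) ? p.right : p.left;   // sibling of v
        Node g = p.parent;                           // grandparent, or null if p is the root
        s.parent = g;
        if (g == null) root = s;
        else if (g.left == p) g.left = s;
        else g.right = s;
    }
}
```

Note that the sibling may itself be external, in which case the parent's position ends up holding an external node.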
Removing an element in node w is done by cases: