CSCI 256
Only turn in problems from the second section.
Problem 34.4-1 on page 875 of the text. Compute the pi function for ababbabbababbababbabb.
i | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
P[i] | a | b | a | b | b | a | b | b | a | b | a | b | b | a | b | a | b | b | a | b | b |
pi[i] | 0 | 0 | 1 | 2 | 0 | 1 | 2 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 3 | 4 | 5 | 6 | 7 | 8 |
Notice that when the 16th character fails to match the character at position pi[15]+1 = 8, the computation does not fall all the way back to zero. Instead it tries the next shorter candidate, pi[pi[15]] = pi[7] = 2, and finds that the 3rd character matches the 16th, so pi[16] = 3.
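The table can be checked mechanically. The following is a minimal sketch of the prefix-function computation, assuming 1-indexed positions as in the table; the function name compute_prefix and the unused placeholder at index 0 are illustrative choices, not taken from the text.

```python
def compute_prefix(pattern):
    """Return a list pi where pi[i] is the prefix-function value for the
    1-indexed position i. pi[0] is an unused placeholder so that the
    indices line up with the table."""
    m = len(pattern)
    pi = [0] * (m + 1)
    k = 0  # length of the longest proper prefix matched so far
    for i in range(2, m + 1):
        # On a mismatch, fall back to the next shorter candidate pi[k]
        # rather than going straight to zero.
        while k > 0 and pattern[k] != pattern[i - 1]:
            k = pi[k]
        if pattern[k] == pattern[i - 1]:
            k += 1
        pi[i] = k
    return pi


pi = compute_prefix("ababbabbababbababbabb")
print(pi[1:])
# [0, 0, 1, 2, 0, 1, 2, 0, 1, 2, 3, 4, 5, 6, 7, 3, 4, 5, 6, 7, 8]
```

At i = 16 the while loop runs once, moving k from 7 to pi[7] = 2, which is the fallback described in the note.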
Problem 34.4-5 on page 876 of the text. Give a linear time algorithm to determine if a text T is a cyclic rotation of another string T'.
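Solution: First make sure T and T' have the same length n; if they do not, T is not a cyclic rotation of T'. Otherwise concatenate T with itself to form the string T2 of length 2n, and use KMP to determine whether the pattern T' occurs in T2. If so, then T is a cyclic rotation of T' (and vice versa). Building T2 and running KMP each take O(n) time, so the whole test is linear.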
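As an illustration, here is a short sketch of that test. The helper names (prefix_function, kmp_contains, is_cyclic_rotation) are hypothetical, and the prefix function is 0-indexed here for convenience.

```python
def prefix_function(p):
    """0-indexed prefix function: pi[i] is the length of the longest
    proper prefix of p[:i+1] that is also a suffix of p[:i+1]."""
    pi = [0] * len(p)
    k = 0
    for i in range(1, len(p)):
        while k > 0 and p[k] != p[i]:
            k = pi[k - 1]
        if p[k] == p[i]:
            k += 1
        pi[i] = k
    return pi


def kmp_contains(text, pattern):
    """Return True if pattern occurs somewhere in text (KMP scan)."""
    if not pattern:
        return True
    pi = prefix_function(pattern)
    q = 0  # number of pattern characters currently matched
    for c in text:
        while q > 0 and pattern[q] != c:
            q = pi[q - 1]
        if pattern[q] == c:
            q += 1
        if q == len(pattern):
            return True
    return False


def is_cyclic_rotation(t, t_prime):
    """True if t is a cyclic rotation of t_prime."""
    if len(t) != len(t_prime):
        return False
    # Every rotation of t appears as a length-n substring of t + t.
    return kmp_contains(t + t, t_prime)


print(is_cyclic_rotation("arc", "car"))  # True
print(is_cyclic_rotation("abc", "acb"))  # False
```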
On-line string matching algorithm: Suppose that the pattern is input one character at a time at a relatively slow pace, but the text is already given. We would like to proceed with the matching as far as we can, without waiting until the whole pattern is known. In other words, just before the kth character is input, we would like to be at the first place in the text that matches the first k-1 characters of the pattern. Modify the KMP algorithm to achieve that goal. Note that this is what the emacs incremental search (control-S) function does.
Solution: Construct the pi table for the part of the pattern that is known, then extend it each time a new character is typed. This just involves replacing the for loop in ComputePrefix with a while loop that waits for input (or, more likely, moving the computation of pi into the KMP algorithm itself). After each new character, matching resumes from the current position in the text rather than restarting from the beginning.
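One way this could look in code is sketched below. The class name IncrementalMatcher and its methods are hypothetical; the sketch assumes the text is a string held in memory and that each typed character is passed to add_char as it arrives.

```python
class IncrementalMatcher:
    """Sketch of on-line KMP: the pattern grows one character at a time."""

    def __init__(self, text):
        self.text = text
        self.pattern = []  # pattern characters typed so far
        self.pi = []       # 0-indexed prefix function of self.pattern
        self.j = 0         # number of text characters consumed
        self.q = 0         # number of pattern characters currently matched

    def add_char(self, c):
        """Append c to the pattern, extend pi by one entry, and resume the
        scan. Returns the start of the first match, or None if none exists."""
        self.pattern.append(c)
        if len(self.pattern) == 1:
            self.pi.append(0)
        else:
            k = self.pi[-1]
            while k > 0 and self.pattern[k] != c:
                k = self.pi[k - 1]
            if self.pattern[k] == c:
                k += 1
            self.pi.append(k)
        # Continue the ordinary KMP scan from the saved state (j, q).
        m = len(self.pattern)
        while self.q < m and self.j < len(self.text):
            t = self.text[self.j]
            while self.q > 0 and self.pattern[self.q] != t:
                self.q = self.pi[self.q - 1]
            if self.pattern[self.q] == t:
                self.q += 1
            self.j += 1
        return self.position()

    def position(self):
        """Start index of the first match of the pattern typed so far."""
        if self.q == len(self.pattern):
            return self.j - self.q
        return None


matcher = IncrementalMatcher("abababc")
print([matcher.add_char(c) for c in "abcd"])  # [0, 0, 4, None]
```

Resuming from the saved state is safe because any occurrence of the longer pattern starts with an occurrence of the shorter one, so the first match can only stay where it is or move to the right.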
Implement Kruskal's algorithm. Use the efficient implementation of Union-Find from Assignment 3. The input for the graph will be named graph.dat and have exactly the same format as for Assignment 3. Rules for turning in your program are exactly the same as for Assignment 3. Be sure to include a "run" file if your program runs under UNIX. You may use existing libraries to represent graphs.
Problem 34.4-3 on page 876 of the text.
Solution: Compute the pi function for the concatenation PT. If P has length m, look for all i >= 2m such that m is in pi*[i] (recall that pi*[i] contains every k < i such that the length-k prefix is a suffix of the length-i prefix). These are the places where a suffix of (PT)_i, the prefix of PT of length i, exactly matches the prefix of PT of length m. Because P has length m, that prefix is P itself, so these are the places where P is a suffix of (PT)_i. The restriction i >= 2m ensures that the matched copy of P lies entirely inside T rather than overlapping the P at the front. Report i-2m as the corresponding shift at which P occurs in T (subtract one m to reach the character before the match and another m to discard the P at the beginning of PT).
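A possible sketch of this procedure follows. The names compute_prefix and find_shifts are hypothetical, and the reaches_m array is an added device (not part of the solution text) for testing membership of m in pi*[i] in linear total time instead of walking each chain separately. The check i >= 2m keeps only matches that lie entirely inside T. A non-empty pattern is assumed.

```python
def compute_prefix(s):
    """1-indexed prefix function; pi[0] is an unused placeholder."""
    n = len(s)
    pi = [0] * (n + 1)
    k = 0
    for i in range(2, n + 1):
        while k > 0 and s[k] != s[i - 1]:
            k = pi[k]
        if s[k] == s[i - 1]:
            k += 1
        pi[i] = k
    return pi


def find_shifts(p, t):
    """Return all shifts at which p occurs in t, read off pi for p + t."""
    m = len(p)
    if m == 0:
        raise ValueError("pattern must be non-empty")  # assumption of this sketch
    s = p + t
    pi = compute_prefix(s)
    # reaches_m[i] is True when m equals i or m is in pi*[i], that is,
    # when following pi repeatedly from i eventually lands on m.
    reaches_m = [False] * (len(s) + 1)
    reaches_m[m] = True
    shifts = []
    for i in range(m + 1, len(s) + 1):
        reaches_m[i] = reaches_m[pi[i]]  # pi[i] < i, so already computed
        # Require i >= 2m so the match does not overlap the leading copy of p.
        if reaches_m[i] and i >= 2 * m:
            shifts.append(i - 2 * m)
    return shifts


print(find_shifts("ab", "abcab"))  # [0, 3]
print(find_shifts("aa", "ab"))     # []
```

The second call shows why the bound on i matters: in "aaab" the value m = 2 is in pi*[3], but that match straddles the boundary between P and T.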
kim@cs.williams.edu