CSCI 256
This week, your assignment is to write a program that uses the efficient union-find algorithm discussed in class to compute the connected components of a graph. You may program in C, C++, Java, Pascal, or ML and your program may run on one of the CS lab Macs or Dells. Other languages are possible, but check with me first. Any language used must support arrays, since that is a necessary part of the efficient union-find algorithm.
The input to the program will be from a text file named graph.dat. The first line contains the number, v, of vertices in the graph. (The vertices themselves will be numbered from 0 to v-1.) The next line contains the number, e, of edges in the graph. After this come 2e lines containing one vertex per line. Of these 2e lines, the first and second contain the endpoints of the first edge, the third and fourth the endpoints of the second edge, etc.
Here is some sample input:
6
5
0
1
0
5
5
3
4
2
3
0
The file has 6 vertices (numbered 0 to 5) and 5 edges. The output for this file should indicate that vertices 0, 1, 3, and 5 are in one connected component, and that 2 and 4 are in the only other one.
Begin by setting up a singleton set for each vertex in the graph. If you are using a language that requires a static bound on the size of the array of vertices, you may assume there will be no more than 50 vertices in any graph input. Otherwise create an array of the proper size after reading the first line of the input. Each time an edge is read in, take the union of the corresponding sets. At the end of the program, print out the connected components in a readable fashion.
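As a rough illustration of the steps just described (singleton sets, one union per edge, then printing the components), here is a minimal Java sketch run on the sample edges. It assumes that the efficient version means union by size combined with path compression; check your class notes and the text for the exact variant expected. The class name ComponentsSketch and its method names are made up for this example, and the edges are hard-coded rather than read from graph.dat.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: the names below are illustrative, not required by the assignment.
class ComponentsSketch {
    private final int[] parent; // parent[i] == i means i is the root of its set
    private final int[] size;   // size[r] = number of elements in the set rooted at r

    ComponentsSketch(int v) {
        parent = new int[v];
        size = new int[v];
        for (int i = 0; i < v; i++) { // one singleton set per vertex
            parent[i] = i;
            size[i] = 1;
        }
    }

    // Find the root of x's set, then point every node on the path at that root.
    int find(int x) {
        int root = x;
        while (parent[root] != root) root = parent[root];
        while (parent[x] != root) {
            int next = parent[x];
            parent[x] = root;
            x = next;
        }
        return root;
    }

    // Merge the sets containing a and b, hanging the smaller tree under the larger.
    void union(int a, int b) {
        int ra = find(a), rb = find(b);
        if (ra == rb) return; // already in the same set
        if (size[ra] < size[rb]) { int t = ra; ra = rb; rb = t; }
        parent[rb] = ra;
        size[ra] += size[rb];
    }

    // Group vertices 0..v-1 by root; components come out ordered by smallest vertex.
    static List<List<Integer>> components(int v, int[][] edges) {
        ComponentsSketch sets = new ComponentsSketch(v);
        for (int[] e : edges) sets.union(e[0], e[1]);
        Map<Integer, List<Integer>> byRoot = new LinkedHashMap<>();
        for (int i = 0; i < v; i++) {
            byRoot.computeIfAbsent(sets.find(i), r -> new ArrayList<>()).add(i);
        }
        return new ArrayList<>(byRoot.values());
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {0, 5}, {5, 3}, {4, 2}, {3, 0}}; // edges of the sample input
        for (List<Integer> c : components(6, edges)) System.out.println(c);
    }
}
```

On the sample edges this prints [0, 1, 3, 5] followed by [2, 4], matching the two components described above.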
Important: I will be using my own (secret) data file to test your program so be sure to follow the input conventions above very carefully!
Be sure to use the most efficient version of union-find discussed in class and the text. Debug it carefully, as you will be using union-find later in the semester on other graph algorithms.
If you use a Mac, turn in a project as well as the code to the CS256 drop off folder. Make sure the file name graph.dat is hard-coded in your program or pop up a dialog box so the user can choose it.
If you use a Dell computer, you should also create an executable file called "run" which, when executed, will set up any path names (e.g., CLASSPATH) necessary and then execute the program. Those of you who used UNIX in cs136 might want to use source cs136 to set up path names appropriately. Be sure to test your program to make sure that it will work from any folder! As with the Macs, the data file name graph.dat should either be hard-coded into your program or be choosable from a dialog box. The data file will be found in the same folder as your program. The turnin program is most easily invoked by writing
/usr/cs-local/bin/turnin progdir.tar
(you can leave off the "/usr/cs-local/bin" if it is in your path) where progdir.tar is a tar file created from the directory with your program (and "run" file). You can create a tar file by typing
tar cvf progdir.tar directory-name
where directory-name is the name of the directory, and progdir.tar is the name you would like to save it with (use something helpful, such as including the assignment number). The general syntax of the turnin command is:
turnin [-c course-number] [-v] [-d directory] files
    -c  hand files in under course-number;
        set environment variable TURNIN_COURSE for default.
    -d  use directory as source file directory.
    -v  hand in files with verbose output.
If you are using Java, I suggest you use a BufferedReader, its readLine() method, and Integer.parseInt(...) to read in data. Output the answer (in a nice format) using System.out.println(...).
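One possible way to put those calls together is sketched below. It reads v, then e, then the 2e endpoint lines into an array of edges. The class name GraphInputSketch and the helper nextInt are invented for this example, and the demonstration reads the sample input from a string so that it runs without a file; for the assignment the reader would presumably wrap new FileReader("graph.dat") instead.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Sketch only: names are illustrative, and error handling is kept minimal.
class GraphInputSketch {
    final int v;         // number of vertices
    final int[][] edges; // edges[i] holds the two endpoints of edge i

    GraphInputSketch(int v, int[][] edges) {
        this.v = v;
        this.edges = edges;
    }

    // Read one line and parse it as an integer (one number per line, as in graph.dat).
    static int nextInt(BufferedReader in) throws IOException {
        String line = in.readLine();
        if (line == null) throw new IOException("unexpected end of input");
        return Integer.parseInt(line.trim());
    }

    // Read v, then e, then 2e lines giving the endpoints of each edge in turn.
    static GraphInputSketch read(Reader source) throws IOException {
        BufferedReader in = new BufferedReader(source);
        int v = nextInt(in);
        int e = nextInt(in);
        int[][] edges = new int[e][2];
        for (int i = 0; i < e; i++) {
            edges[i][0] = nextInt(in);
            edges[i][1] = nextInt(in);
        }
        return new GraphInputSketch(v, edges);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: for the real program, replace the StringReader with
        // new java.io.FileReader("graph.dat").
        String sample = "6\n5\n0\n1\n0\n5\n5\n3\n4\n2\n3\n0\n";
        GraphInputSketch g = read(new StringReader(sample));
        System.out.println(g.v + " vertices, " + g.edges.length + " edges");
    }
}
```

Sizing the edge array only after e has been read avoids any fixed bound on the number of edges.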
However you set things up, please make it as easy as possible to run. There are roughly 25 of you, and if it takes as much as 5 minutes to run your program it becomes a job of over 2 hours just to run the programs without even looking at the code.
kim@cs.williams.edu