**Assignment 11: File Systems** # Learning Goals - Learn about inodes and (semi)-modern Unix file systems. - Implement file system functionality. # Grading Walk-Throughs This assignment will be graded as "Nailed It" / "Not Yet" by a TA. To pass ("Nailed It") the assignment, you must 1. Complete the assignment and submit your work to gradescope. + You should start this assignment in class on the day shown on the calendar. + *Complete the assignment as early as possible*. 1. Schedule a time to meet with a TA. You can meet with them after the submission deadline. + You must book a time to meet with a TA + Sign-up on the Google Sheet *with at least 36 hours of notice*. + Contact your TA on Slack after signing up. + All partners must meet with the TA. If you cannot all make it at the same time, then each of you needs to schedule a time to meet with the TAs. 1. Walk the TA through your solutions prior to the deadline. + Walk-throughs should take no more than 20 minutes. + You should be well prepared to walk a TA through your answers. + You may not make any significant corrections during the walk-through. You should plan on making corrections afterward and scheduling a new walk-through time. Mistakes are expected---nobody is perfect. + You must be prepared to explain your answers and justify your assumptions. TAs do not need to lead you to the correct answer during a walk-through---this is best left to a mentor session. 1. The TA will then either + mark your assignment as "Nailed It" on gradescope, or + mark your assignment as "Not Yet" and inform you that you have some corrections to make. 1. If corrections are needed, then you will need to complete them and then schedule a new time to meet with the TA. + You will ideally complete any needed revisions by the end of the day the following Monday. If you have concerns about the grading walk-through, you can meet with me after you have first met with a TA. # Overview For this assignment, your goal is to port (a fancy CS term for "convert" or "transfer") files from a faked file system to the native file system provided by the OS---you're given a bunch of data files that were generated by the faked file system, and your goal is to reconstruct from them the original files (two text files and an image file). As usual, you should complete this assignment with a partner; you may choose your partner for this assignment. Let me know if you want a random partner. Grab the starter code with: ~~~bash tar xvf /data/A11-FileSystems.tar ~~~ # Background The faked file system from which you will be porting files uses an indexed allocation with the following details: - The disk comprises 1024 256-byte blocks. - Block 0 is the "superblock", and it contains the file system parameters described here. - Block 1 is a bitmap that stores information about which inodes are currently in use. - Block 2 is a bitmap that stores information about which data blocks are currently in use. - Blocks 3 - 34 are inode blocks; each inode block stores an array of inodes. - Blocks 35 - 1024 are data blocks; each allocated data block stores the contents of some file (either a normal file or a directory file). inodes in this file system have the following properties: - A 16-byte filename. - An 8-byte file size (i.e., a `long`). - 8 direct pointers, each of which is a 4-byte value (i.e., an `int`) that stores a block number or 0. - 1 indirect pointer, an `int` representing a block number or 0. - 1 doubly-indirect pointer, an `int` representing a block number or 0. All indirect (and doubly-indirect) blocks simply store an array of `int` block numbers. Directory entries in this file system are 32-byte values composed of a 16-byte filename followed by a 4-byte inode number plus some padding. Directories are simply files where the contents of the file are an array of directory entries. If you are a bit shaky on this terminology, you can take a look at the [the file systems slides starting at page 20](https://cs.pomona.edu/classes/cs105/lectures/11-SystemIO/FileSystems.pdf). At this stage, you should answer the first few questions on gradescope. # Starter Code The starter code implements a generic block interface (via the `read_block` and `write_block` functions). It also provides some helper functions for copying inodes to/from memory and for parsing filepaths. # Your Tasks Your assignment is to port some files that are currently stored in the faked file system to the native file system provided by the operating system. This will be done as a series of five tasks. ## Task 1: File Path to inode Your first task is to compute an inode number given an absolute file path. For now, we'll make two simplifying assumptions: 1. you will only ever need to look for files that are stored in the root directory (all file paths will start with `/`, and 2. all files are small enough that they can be stored using only the direct pointers (i.e., both the indirect pointer and the doubly indirect pointer will always be 0). The main function includes a series of test cases, and your program will be able to pass the first test case after implementing this functionality. You will complete this task in two steps: ### Task 1a: implement `search_dir_data_block` This function takes three arguments: - a block number (`data_block_num`), - a filename (`filename`, and - an integer for the maximum number of entries to examine `max_entries`. This function should assume that the block specified by `data_block_num` is part of a directory and thus contains an array of directory entries. It should search the first `max_entries` directory entries in that block for one that matches the specified file name; if it finds the directory entry for that filename it should return the corresponding inode number for that file, otherwise it should return -1. Hint: you'll need to call `read_block`, which takes a pointer to a location in which to store a `BLOCK_SIZE` amount of data. You'll then need to loop over what you get back from `read_block`. ### Task 1b: implement `search_dir` This function takes the inode number associated with some directory and a filename. It should search the contents of that directory (through all datablocks in that directory) to try to find a data entry that matches the specified filename. If it finds such a directory entry it should return the corresponding inode number for that file, otherwise it should return -1. Remember that, for now, you can assume that all files (including this directory) use only direct pointers! The `search_dir` function relies on your `search_dir_data_block` function to look through each possible (direct) directory entry. Your starter code includes an initial implementation of a function `search_path` that parses a filepath (assuming it has the form `/filename.ext`) and calls your `search_dir` function with inode number 2 (the number of the inode for the root directory) and filename `filename.ext`. So as long as we make the assumption that you will only ever need to look for files that are stored in the root directory, `search_path` should return the inode number of the file with the specified path. If you try running the first test case, your code should now return the correct inode number for the file `/test.txt`. ## Task 2: Port Simple Files Your next task is to implement a simple version of the function `port_fake_to_real` that continues to make the two simplifying assumptions from Task 1 (files are in the root, and they are small enough to fit in one data block). This function will read the faked file system and save a copy to the native file system. This function takes two arguments: - `fake_filepath` is an absolute path in the faked file system - `real_filename` is a filename in which you will store the loaded data (this will create a "real" file on the server) Your implementation should: 1. Open a real file (`fopen`) 2. Grab an inode number from `search_path` for the fake file 3. Read the inode (`read_inode`) 4. Loop through the direct pointers, call `read_block` and store the data in the new file with `fputc` 5. Close the file Your program should now generate a real file named `test1.txt`. You should open this file and see if its contents make sense. ## Task 3: Handle Subdirectories We will now start eliminating our simplifying assumptions. For your third task, you will no longer assume that files are stored in the root directory; you now want to support files stored in subdirectories. To complete this task, you will need to modify the implementation of `search_path`. Your updated implementation will call `search_dir` on each subdirectory found in the path (all `num_dirs` of them), or until `search_dir` returns a `-1` indicating that the filename was not found. Your code should now return the correct inode number for the file `/example_dir/test2.txt` in the faked file system. This is checked by the third test. ## Task 4: Large Directories Tasks 4 and 5 up the difficulty quite a bit. We will discuss these a bit in class (or using some other means). But please give them a try before we do so. You will now *start* eliminating the second assumption. To support large directories, implement the functions `search_dir_indirect` and `search_dir_doubly_indirect`, and modify your implementation of `search_dir`. `search_dir_indirect` will use `search_dir_data_block` to examine each indirect entry. `search_dir_doubly_indirect` will use `search_dir_indirect` to examine each doubly indirect entry. `search_dir` will now need to call `search_dir_indirect` (and subsequently `search_dir_doubly_indirect`) when it does not receive a valid inode number during the direct case (or the subsequent indirect case). ## Task 5: Large Files Finally, you need to add support for large files. Modify your implementation of `port_fake_to_real` to support large files. Your updated implementation should 1. Search through indirect inodes (if needed) 2. Search through double indirect inodes (if needed) You should now be able to successfully port the file `/example_dir/image.jpg` from the faked file system and open it (you must download it to your local machine to open the image). # Submitting Your Assignment You will submit your code and/or responses on gradescope. **Only one partner should submit.** The submitter will add the other partner through the gradescope interface. To pass the autograder (if one exists for this assignment), your output must exactly match the expected output. Your program output should be similar to the example execution above, and the autograder on gradescope will show you the correct output if yours is not formatted properly. You can use [text-compare](https://text-compare.com/) to compare your output to the expected output and that should give you an idea if you have a misspelled word or extra space (or if I do). Additional details for using gradescope can be found here: - [Submitting an Assignment](https://help.gradescope.com/article/ccbpppziu9-student-submit-work) - [Adding Group Members](https://help.gradescope.com/article/m5qz2xsnjy-student-add-group-members) - [gradescope Student Help Center](https://help.gradescope.com/category/cyk4ij2dwi-student-workflow)