**Assignment 09: Virtual Memory** # Learning Goals - Understand how virtual memory is implement by an operating system. - Implement the translation process from a virtual to a physical memory address # Grading Walk-Throughs This assignment will be graded as "Nailed It" / "Not Yet" by a TA. To pass ("Nailed It") the assignment, you must 1. Complete the assignment and submit your work to gradescope. + You should start this assignment in class on the day shown on the calendar. + *Complete the assignment as early as possible*. 1. Schedule a time to meet with a TA. You can meet with them after the submission deadline. + You must book a time to meet with a TA + Sign-up on the Google Sheet *with at least 36 hours of notice*. + Contact your TA on Slack after signing up. + All partners must meet with the TA. If you cannot all make it at the same time, then each of you needs to schedule a time to meet with the TAs. 1. Walk the TA through your solutions prior to the deadline. + Walk-throughs should take no more than 20 minutes. + You should be well prepared to walk a TA through your answers. + You may not make any significant corrections during the walk-through. You should plan on making corrections afterward and scheduling a new walk-through time. Mistakes are expected---nobody is perfect. + You must be prepared to explain your answers and justify your assumptions. TAs do not need to lead you to the correct answer during a walk-through---this is best left to a mentor session. 1. The TA will then either + mark your assignment as "Nailed It" on gradescope, or + mark your assignment as "Not Yet" and inform you that you have some corrections to make. 1. If corrections are needed, then you will need to complete them and then schedule a new time to meet with the TA. + You will ideally complete any needed revisions by the end of the day the following Monday. If you have concerns about the grading walk-through, you can meet with me after you have first met with a TA. # Overview For this assignment, you will **emulate** virtual memory in software at the user level (as opposed to what the kernel does when it interacts with hardware). As usual, you should complete this assignment with a partner, and you may choose your partner. Starter code for this assignment is available on the course server. ~~~bash cd ~/cs105/assignments/ tar xvf /data/A09-VirtualMemory.tar ~~~ # Background In virtual memory, a virtual address space is divided into pages of a set size. Some pages are stored in physical memory frames and others are waiting in secondary storage. A [page fault](https://en.wikipedia.org/wiki/Page_fault) occurs when a process attempts to access data that is not yet in physical memory. The operating system kernel handles page faults by 1. identifying the missing page, 2. (if necessary) evicting some other page from physical memory, 3. reading the missing page from secondary storage and storing it in the newly available physical memory frame, 4. updating the process's page table, and 5. completing the memory access that resulted in the page fault. # Starter Code Remember that this is an emulated virtual memory system. All of your code will run in user-space using the stack, heap, and data segments of your `virtmem` program. The starter code contains most of what is needed to emulate virtual memory. Physical memory is implemented as an array of bytes; the address of this array is stored in the global variable `byte *physical_memory`. The *physical address* of a value is the index of (the start of) that value in the array `physical_memory`. The page table (`page_table_entry_t *page_table`) is stored on the heap as an array of `page_table_entry_t` values (page table entries). The type `page_table_entry_t` is defined at the top of the file; it contains a frame number and a valid bit (one of your tasks is to add a member variable for storing a usage marker). For simplicity, there is no permission (read, write, or execute) information for each page. If the valid bit is set, then that virtual page is stored in physical memory (i.e., in the array `physical_memory`). If the valid bit is not set, then that virtual page is currently paged out to "secondary storage"; in our emulation that means the virtual page is stored in a file located in the directory `files` with a filename given by the page number. You can safely assume that all pages are initialized. You can also assume the only thing stored in virtual memory is an array of random integers (we will ignore that real virtual memory also includes code, global variables, stack frames, etc.). This random array of integers is generated by the function `gen_array` and stored in our emulated virtual memory system by the function `initialize`. The function `handle_page_fault` emulates the kernel code for handling page faults. It 1. identifies the missing page, 2. evicts some other page from memory, 3. reads the missing page from secondary storage, and 4. updates the emulated page table. The wrapper functions `load` and `store` emulate memory accesses that call the page fault handler code when necessary. These two functions are analogous to accessing a piece of array data using brackets (i.e., `x = array[45]` is similar to `load(virtual_address)`). The function `print_simulation_state` prints the current state of the page table and of virtual memory. This might be helpful for debugging purposes. However, I strongly recommend that you rely on `gdb` instead. For fun, the starter code includes implementations of heap sort and quicksort, which are used to evaluate your implementation of virtual memory. The function `print_array` prints the current state of the array (that you are in the process of sorting). **Hint:** you might find this function useful for testing your solution, as most bugs will result in an incorrectly sorted array, or a segfault. One more thing! You'll notice this code at the top of your file: ~~~c // Global constants defining VM characteristics // I recommend testing with // const int PAGE_SIZE = 16; // const int PHYSICAL_MEMORY_SIZE = 64; // const int VIRTUAL_MEMORY_SIZE = 128; const int PAGE_SIZE = 4096; // 4 KB = 2^12 bytes const int PHYSICAL_MEMORY_SIZE = 32768; // 32 KB = 2^15 bytes const int VIRTUAL_MEMORY_SIZE = 1048576; // 1 MB = 2^20 bytes ~~~ Running the final version of your program will take some time (I timed it at 11m 21s seconds on the server). You should reduce the evaluation time by following the advice in the comment. This will make your code-test-evaluate feedback loop much faster. Here is an (abbreviated) output for the testing values: ~~~text | ----------- Heap Sort ----------- | ----------- Quick Sort ------------- | --- Random ---- | ---- LRU ------ | ---- Random ---- | ----- LRU ------- N | M | A | R | M | A | R | M | A | R | M | A | R ---|----|-----|------|----|-----|------|----|------|------|----|------|------ 2 | 0 | 10 | 0.00 | 0 | 10 | 0.00 | 0 | 11 | 0.00 | 0 | 11 | 0.00 4 | 0 | 44 | 0.00 | 0 | 44 | 0.00 | 0 | 42 | 0.00 | 0 | 42 | 0.00 8 | 0 | 142 | 0.00 | 0 | 142 | 0.00 | 0 | 139 | 0.00 | 0 | 139 | 0.00 16 | 0 | 456 | 0.00 | 0 | 456 | 0.00 | 0 | 492 | 0.00 | 0 | 492 | 0.00 32 | 58 | 866 | 0.07 | 50 | 890 | 0.06 | 25 | 1293 | 0.02 | 23 | 1563 | 0.01 ~~~ It takes less than one second to produce this output. # Drawing Your first gradescope prompt is to draw a picture. Rather than diving straight into typing away at the keyboard, I want you to draw a diagram for the emulation. Your diagram should include the following: - the `physical_memory` variable - the `page_table` variable - the `files` in the directory in which you run your program - and how these different entities relate to one another At this point, you should answer the first four prompts on gradescope (two drawings and two questions about the output table). # Task 1: Address Translation Your first coding task is to implement the function `translate_addr`. This function should take a virtual address and return the corresponding physical address (if the page is in physical memory) or `-1` if the page is not in physical memory. Note that in this emulation, the virtual address is the offset from the beginning of the integer array (so the virtual address of `array[i]` is `i * 4` since the array contains four-byte integers and physical memory is an array of bytes) and the physical address is the offset from the beginning of the byte array `physical_memory`. **Hint:** If you test your code now, things will break badly when you try out the `LRU` strategy. You can comment it out in `evaluate` to just test `RANDOM`... but don't forget to uncomment it later! # Task 2: LRU Eviction The starter code chooses a **random** frame to evict from physical memory; as you might imagine, this is not always an efficient eviction algorithm. Your second task is to implement a **least recently used (LRU)** eviction policy, in which the page that has been accessed least recently will be evicted when a page fault occurs. Here is what you'll need to do: 1. Add an `lru_marker` to the `page_table_entry` struct 2. Add a global variable (`use_marker`) for keeping track of the current usage marker 3. Make sure that `lru_marker` is initialized inside the `initialize` function 4. Add code to your `translate_addr` function that updates a page's `lru_marker` when it is accessed (this should also update `use_marker`) 5. Initialize `use_marker` as `0` in `main` 6. Implement the LRU page eviction strategy in `handle_page_fault` # Evaluation Once you have implemented and tested your address translation and page eviction code, modify the main function to call the function evaluate (if you haven't made any changes to the main function, it should do this already). Double check that the constants at the top are set correctly before running this code! Copy and paste your output into gradescope. **Hint:** You might also want to try running the evaluate function with various other possible values for the global constants at the top, but make sure the ones you include in your submission are the right ones. # Submitting Your Assignment You will submit your code and/or responses on gradescope. **Only one partner should submit.** The submitter will add the other partner through the gradescope interface. To pass the autograder (if one exists for this assignment), your output must exactly match the expected output. Your program output should be similar to the example execution above, and the autograder on gradescope will show you the correct output if yours is not formatted properly. You can use [text-compare](https://text-compare.com/) to compare your output to the expected output and that should give you an idea if you have a misspelled word or extra space (or if I do). Additional details for using gradescope can be found here: - [Submitting an Assignment](https://help.gradescope.com/article/ccbpppziu9-student-submit-work) - [Adding Group Members](https://help.gradescope.com/article/m5qz2xsnjy-student-add-group-members) - [gradescope Student Help Center](https://help.gradescope.com/category/cyk4ij2dwi-student-workflow)