**Assignment 08: Virtual Memory** # Learning Goals - Understand how virtual memory is implement by an operating system. # Grading Walk-Throughs This assignment will be graded as pass/needs-revisions by a TA. To pass the assignment, you must 1. Complete the assignment and submit your work to gradescope. - You should start this assignment in class on the day shown on the calendar. - **Complete the assignment as early as possible**. 2. Schedule a time to meet with a TA prior to the deadline. - You must book a time to meet with a TA - Sign-up on the Google Sheet **with at least 36 hours of notice**. - Contact your TA on Slack after signing-up. - All partners must meet with the TA. If you can't all make it at the same time, then each of you needs to schedule a time to meet with the TAs. 3. Walk the TA through your solutions prior to the deadline. - Walk-throughs should take no more than 20 minutes. - You should be well prepared to walk a TA through your answers. - You may not make any significant corrections during the walk-through. You should plan on making corrections afterward and scheduling a new walk-through time. Mistakes are expected--nobody is perfect. - You must be prepared to explain your answers and justify your assumptions. TAs do not need to lead you to the correct answer during a walk-through--this is best left to a mentor session. 4. The TA will then either - mark your assignment as "pass" on gradescope, or - mark your assignment as "needs-revisions" and inform you that you have some corrections to make. 5. If corrections are needed, then you will need to complete them and then schedule a new time to meet with the TA. - You will ideally complete any needed revisions by the end of the day the following Monday If you have concerns about the grading walk-through, you can meet with me after you've first met with a TA. # Overview For this assignment, you will emulate virtual memory at the user level. As usual, you should complete this assignment with a partner; you may choose your partner for this assignment. The starter code for this assignment is available on the course VM, and you should complete the assignment on the course VM. You can copy and unpack the tar file with: ~~~bash tar xvf /data/A08-VirtualMemory.tar ~~~ Running this command will create (in your current directory) a subdirectory named `A08-VirtualMemory` containing a `Makefile` and the starter code `vm.c`. # Background Recall that virtual memory allows a process to use more memory than is available on the physical machine. Virtual memory is divided into pages, and some pages are stored in frames in physical memory while others are written out to the disk. A page fault occurs when a process attempts to access a value in "memory" that is not actually in physical memory. The operating system kernel handles page faults by 1. identifying the missing page, 2. evicting some other page from physical memory, 3. reading the missing page from disk and storing it in the newly available physical memory frame, 4. updating the process's page table, and 5. completing the instruction responsible for the page fault. # Starter Code The starter code contains most of what is needed to simulate virtual memory. Physical memory is implemented as an array of bytes; the address of this array is stored in the global variable `byte *physical_memory`. The *physical address* of a value is the index of (the start of) that value in the array `physical_memory`. In the simulation, the page table (`page_table_entry_t *page_table`) is stored on the heap as an array of `page_table_entry_t` values (page table entries). The type `page_table_entry_t` is defined at the top of the file; it contains a frame number and a valid bit (one of your tasks is to add a member variable for storing a usage marker). For simplicity, there are no access control bits. If the valid bit is set, then that virtual page is stored in physical memory (i.e., in the array `physical_memory`). If the valid bit is not set, then that virtual page is currently paged out to disk; in this case that virtual page is stored in a file located in the directory `files` whose filename is the page number. For simplicity, you may assume that there are no uninitialized pages in this simulation. We will assume that the only thing stored in virtual memory is an array of random integers (we will ignore real virtual memory also includes code, global variables, stack frames, etc.). This random array of integers is generated by the function `gen_array` and stored in our simulated virtual memory system by the function `initialize`. The function `handle_page_fault` simulates the kernel code for handling page faults. It 1. identifies the missing page, 2. evicts some other page from memory, 3. reads the missing page from disk, and 4. updates the simulated page table. The wrapper functions `load` and `store` simulate memory accesses that call the page fault handler code when necessary. These two functions are analogous to accessing a piece of array data using brackets (i.e., `array[45]` is similar to `load(virtual_address)`). The function `print_simulation_state` prints the current state of the page table and of virtual memory. This might be helpful for debugging purposes. However, I strongly recommend that you solve this assignment on a system where you can run gdb. The starter code includes functions that implement heap sort and quicksort; these functions are used to evaluate the simulation. The function `print_array` prints the current state of the array (that you are in the process of sorting). **Hint:** you might find this function useful for testing your solution, as most bugs will result in an incorrectly sorted array, or a segfault. One more thing! You'll notice this code at the top of your file: ~~~c // Global constants defining VM characteristics // I recommend testing with // const int PAGE_SIZE = 16; // const int PHYSICAL_MEMORY_SIZE = 64; // const int VIRTUAL_MEMORY_SIZE = 128; const int PAGE_SIZE = 4096; // 4 KB = 2^12 bytes const int PHYSICAL_MEMORY_SIZE = 32768; // 32 KB = 2^15 bytes const int VIRTUAL_MEMORY_SIZE = 1048576; // 1 MB = 2^20 bytes ~~~ Running the final version of your program will take some time (I timed it at 11m 21s seconds on the server). You should reduce the scale of the simulation by following the advice in the comment. This will make your feedback loop much faster. Here is an (abbreviated) output for the example values: ~~~text | ----------- Heap Sort ----------- | ----------- Quick Sort ------------- | --- Random ---- | ---- LRU ------ | ---- Random ---- | ----- LRU ------- N | M | A | R | M | A | R | M | A | R | M | A | R ---|----|-----|------|----|-----|------|----|------|------|----|------|------ 2 | 0 | 10 | 0.00 | 0 | 10 | 0.00 | 0 | 11 | 0.00 | 0 | 11 | 0.00 4 | 0 | 44 | 0.00 | 0 | 44 | 0.00 | 0 | 42 | 0.00 | 0 | 42 | 0.00 8 | 0 | 142 | 0.00 | 0 | 142 | 0.00 | 0 | 139 | 0.00 | 0 | 139 | 0.00 16 | 0 | 456 | 0.00 | 0 | 456 | 0.00 | 0 | 492 | 0.00 | 0 | 492 | 0.00 32 | 58 | 866 | 0.07 | 50 | 890 | 0.06 | 25 | 1293 | 0.02 | 23 | 1563 | 0.01 ~~~ It takes less than one second to produce this output. # Drawing Your first task is to draw a picture. Rather than diving straight into typing away at the keyboard, I want you to draw a diagram for the simulation. Your diagram should include the following: - the `physical_memory` variable - the `page_table` variable - the `files` in the directory in which you run your program - and how these different entities relate to one another At this point, you should answer the first three questions on gradescope (two drawings and one question about the output table). # Address Translation Your first task is to implement the function `translate_addr`. This function should take a virtual address and return the corresponding physical address (if the page is in physical memory) or `-1` if the page is not in physical memory. Note that in this simulation, the virtual address is the offset from the beginning of the integer array (so the virtual address of `array[i]` is `i * 4`) and the physical address is the offset from the beginning of the byte array `physical_memory`. **Hint:** If you test your code now, things will break badly when you try out the `LRU` strategy. You can comment it out in `evaluate` to just test out `RANDOM`... but don't forget to uncomment it later! # Eviction The starter code chooses a **random** frame to evict from physical memory; as you might imagine, this is not always an efficient eviction algorithm. Your second task is to implement a **least recently used (LRU)** eviction policy, in which the page that has been accessed least recently will be evicted when a page fault occurs. Here is what you'll need to do: 1. Add an `lru_marker` to the `page_table_entry` struct 2. Add a global variable (`use_marker`) for keeping track of the current usage marker 3. Make sure that `lru_marker` is initialized inside the `initialize` function 4. Initialize `use_marker` as `0` in `main` 5. Implement the LRU page eviction strategy in `handle_page_fault` # Evaluation Once you have implemented and tested your address translation and page eviction code, modify the main function to call the function evaluate (if you haven't made any changes to the main function, it should do this already). Double check that the constants at the top are set correctly before running this code! Copy and paste your output into gradescope. **Hint:** You might also want to try running the evaluate function with various other possible values for the global constants at the top, but make sure the ones you include in your submission are the right ones. # Submitting Your Assignment You will submit your code and/or responses on gradescope. **Only one partner should submit.** The submitter will add the other partner through the gradescope interface. To pass the autograder (if one exists for this assignment), your output must exactly match the expected output. Your program output should be similar to the example execution above, and the autograder on gradescope will show you the correct output if yours is not formatted properly. You can use [text-compare](https://text-compare.com/) to compare your output to the expected output and that should give you an idea if you have a misspelled word or extra space (or if I do). Additional details for using gradescope can be found here: - [Submitting an Assignment](https://help.gradescope.com/article/ccbpppziu9-student-submit-work) - [Adding Group Members](https://help.gradescope.com/article/m5qz2xsnjy-student-add-group-members) - [gradescope Student Help Center](https://help.gradescope.com/category/cyk4ij2dwi-student-workflow)