cs357 Lecture Notes Spring 2000 Week 12, Tuesday For today you should have read Chapters 8 and 9 with emphasis on the parts that are relevant to the current assignment. You should be working on the current assignment with your partner. For next time you should finish the current assignment. Also if you are working on the optional quiz, it is due on Thursday. Memory management ----------------- Physical memory systems old PC OS, some current Crays 1) Contiguous allocation 2) Loading and linking 3) Swapping Virtual memory systems just about everything else 1) Frame allocation (page replacement policy) 2) Loading becomes trivial 3) Swapping is subsumed by page replacement 4) Address translation Virtual memory systems ---------------------- No need for contiguous allocation, so external fragmentation goes away, but... fixed size pages => internal fragmentation Loading becomes trivial, since all (virtual) addresses can be resolved at compile time. We know at compile time exactly where in VM the program will reside, and we don't care where in PM. Address translation (review) ------------------- CPU generates virtual (logical) addresses. MMU translates to physical addresses and presents translated address to the memory system. VA broken into a page number and offset within page. (Example: 8-bit virtual address space) Page size --------- Page size determines how many bits are in the offset. 1K = 10 bits 4K = 12 bits How big is the page table for a 32-bit address space? 1K = 22 bits of page number = 2^22 pages = 4 million 4K = 20 bits of page number = 2^20 pages = 1 million Bigger pages -> smaller page tables Bigger pages -> more internal fragmentation (on average 1/2 page per segment per process) Page table representation ------------------------- Even 1 million entries is too much. Assuming each entry is 4 bytes, that's 4 MB just for the page table! Fortunately, most processes don't use most of memory most of the time. Solution: two level page table Break VAS into 2^10 books. Each book contains 2^10 pages. Each page contains 2^12 bytes. Level one: find the right book 2^10 4-byte entries = 4K Level two: find the right page 2^10 4-byte entries = 4K The book table fits on a page! One part of the page table fits on a page! Convenient to put page tables in physical memory. (or maybe even in virtual memory) Most books are completely empty. Probably just two books for most processes; total of 3 pages for the page table! (How big does the stack have to be to need two books?) Access time ----------- Every memory access (cache miss) requires: 1) read an entry from the book table (in memory) 2) read an entry from the page table (in memory) 3) read the value from memory We just tripled the memory access time! Aaarg! Fortunately, we have our old friend locality. Cache address translations!!! Address translation cache = translation lookaside buffer (TLB) TLB entry tag = 20 bit book number/page number data = physical page number TLB usually fully associative, to minimize contention, maximize hit rate, because hit rate is going to be very important for performance. (See the problem from the book). TLB behavior ------------ How many entries will a typical process have? 1 MB of allocated memory = 256 4K pages How many bits to name a physical page? 128 MB = 15 bits of physical page name How big is a TLB entry? 20 bits + 15 bits + flags = roughly 40 bits = 5 bytes How much TLB space per process? 5 bytes * 256 pages = 1-2 KB What happens when we switch processes? Maybe have to flush the TLB! Can we avoid that? Paging ------ Do all of a process's pages have to be in memory for the process to run? Nope. Page table entry has two possible results. Either: 1) a physical address, in which case address translation proceeds and the process continues running OR 2) a location on disk, in which case a) the process blocks b) the OS finds a free frame c) the OS issues a disk read d) when the read completes, the OS updates the page table entry e) the process resumes and reissues the instruction This process is called a page fault. Demand paging ------------- Like lazy swapping. Instead of moving all of a process's pages back into memory, we leave them on disk until the first time they are needed. Con: page faults take a long time (8 ms); screw up interactive response Pro: many pages never loaded! e.g. rare error-handling code (Implications for code generation?) Page replacement ---------------- What if a page fault occurs and there are no physical pages available? We have to kick someone out! b1) OS selects a frame b2) OS issues a disk write b3) when write completes, OS resumes page fault handler Optimizations 1) keep at least one free page so you can overlap writing the old page and reading the new one 2) if the victim page has not been modified (and is already on disk), you don't have to write it Page replacement algorithms ---------------------------