Computer Science Questions: Process Synchronization-Scheduling/Threads
Virtual Memory
Virtual Memory
• Decouples logical (process) memory from physical memory: – Enables execution of processes that are not completely in
memory
– Allows programs to be larger than physical memory • Processes see their own large contiguous virtual address space • On a 32-bit system each process has 4GB of virtual address space, even if
physical RAM is much smaller
– Frees the programmer from concerns of memory-storage limitations
– Memory (pages) are sharable with other processes – Makes process creation more efficient – Very challenging to implement efficiently
• Implemented using demand paging
Example: Virtual Memory > Physical Memory
Processes Sharing Libraries
Demand Paging • Problem: When should pages stored in secondary
storage be loaded into memory?
Demand Paging • Solution: Demand paging loads pages into physical memory
as they are needed • When a page is referenced: – If not in memory, then load the page into memory from the
backing store (e.g. disk) • Benefits: – Reduces physical memory consumption – Faster response times – Can support more users
• Uses the concept of a lazy pager – Never swaps a page into memory unless page will be needed
Lazy Paging vs. Lazy Swapper • Note: a lazy swapper is different from a lazy pager: – Lazy pager swaps individual pages of a process – Lazy swapper swaps all pages of a processes
Valid-Invalid Bit • Recall that each page table entry has a valid-invalid bit • Address translation: – Valid-invalid bit is v: the frame is present in main memory. – Valid-invalid bit is i: the frame is not in main memory ⇒
load it from the disk. This is called page fault.
4 v0 i1
6 v2 i3 i4
9 v5 i6 i7
Page table
frame valid-invalid bit
Virtual pages 0, 2, and 5, resolve to frames 4, 6, and 9 respectively, all of which are resident in main memory (i.e. their valid- invalid bit is v)
All other virtual pages will trigger a page fault (because their valid-invalid bit is i).
Example: Page Table with Pages Stored to Disk
Demand Paging: Page Faults If referenced page is not in memory, then generate a page fault 1. OS validates the reference: – If invalid (i.e. page was never mapped to a physical frame),
then abort. – If valid, but not in memory, then continue to step 2.
2. Allocate a frame 3. Load the page from the disk into the frame 4. Update the page table to indicate that the page is now in
memory 5. Set valid-invalid bit of the page to v. 6. Restart the instruction that caused the page fault.
Example: Page Faults
Optimizing Process Creation • Virtual memory can be used to optimize
process creation: – Copy-on-write – Memory mapped files
Copy-on-Write (COW) • Enables parent and child processes to initially share the same
pages • If either process modifies a page: – Target page is copied – Copy assigned to the process that attempted the modification.
• Benefit: faster process creation since only modified pages are copied
• Example: fork() system call does not automatically copy the entire memory space of the parent; only copies pages as they change.
COWS Example: Page C Before Modification by Process1
COWS Example: Page C After Modification by Process1
Page Replacement • Problem: Main memory is a finite resource. What
happens when all frames are in use?
Page Replacement • Solution: If no free frames are available, evict another
in-memory page by saving it to disk • Challenge: which page should we evict?
Page Replacement Algorithms • Page replacement algorithm is trigged during page fault
to decides what pages should be evicted from memory to make room for the incoming page
• Goal: minimize the number of page faults – Avoid evicting an often used page that will need to be brought
back into memory soon
Page Replacement Algorithms • Optimal Page Replacement: – Replace page needed at the farthest point in future – Optimal but unrealizable (Requires future knowledge!) – Example: 3 frames; all pages are originally not in memory
(i.e. need to be brought in from the disk).
9 page faults
Page Replacement Algorithms: First In First Out
• Linked list of pages (oldest at head) • Remove pages from the head • Example: system with 3 frames; all pages are originally not in
memory (i.e. need to be brought in from the disk).
– 15 page faults
• Simple to implement, but suffers from Belady’s anomaly: – When increasing the number of frames allocated to the process
actually increases the number of page faults
More Page Replacement Algorithms • Least Recently Used (LRU):
– Associate time stamp with each page access – Replace page that was not recently used
• Second Chance: – Like FIFO but includes a reference bit that is set on page access – If referenced, unset bit, move to the head, and continue search – Not efficient
• Clock: – Like Second Chance, but uses a circular list – Don’t move pages just advance the hand
• Least Frequently Used (LFU): – Replace the page that was least recently used
• Most Frequently Used (NFU): – Replace the page that is not used frequently
• Aging, Working Set, WSClock…many more!
Local vs Global Replacement • Local page replacement: on page fault, evict a
page of the faulting process. • Global replacement: when choosing a page to
evict, consider all pages in memory.
Local vs Global Replacement • Example: consider an LRU algorithm and process A, B, and C (the lesser
the age the least recently the corresponding page was used). Process A causes a page fault.
Original configuration
Local page replacement
Global page replacement
Cleaning Policy • Problem: When in memory pages are modified,
additional work is needed to swap them back out to disk. How can this be handled efficiently?
Cleaning Policy • Unchanged pages in memory are easily evicted
– Assumes copy of page already exists on disk – Simply load up new copy from disk as needed
• Dirty pages need to be updated on disk before being overwritten – A dirty page is a page that has changed since it was brought into
memory • Cleaning Policy determines when dirty pages are updated on
disk so they can be evicted from memory
Demand Cleaning • Dirty page is saved to the disk only when it’s frame has
been selected for eviction • However, a process that triggers a page fault may have to
wait for 2 page transfers: – 1st to write out a dirty page – 2nd to load the new page
Pre-cleaning • Dirty pages saved before their frames are selected for eviction
– OS periodically saves out dirty pages in batches – More efficient than saving each page individually – Inefficient when many saved pages modified again before they are
selected for eviction
Thrashing • Process is stalled because pages constantly swapping in & out • Occurs when the process does not have enough frames • Consequences: – Low CPU utilization – Since processes spend the majority of their time paging, no
useful work is getting done – May trick the OS into trying to improve the CPU utilization by
adding more processes to the system - making the problem worse!
Thrashing • Degree of multiprogramming vs CPU utilization: – After multiprogramming is increased beyond a certain point,
thrashing sets in.
Controlling Thrashing • We can control thrashing by using local replacement
algorithm: – If one process starts thrashing, it cannot steal frames from
another process thus causing it to thrash. – Limitation: thrashing process will be in the paging device queue
most of the time: • Increases the size of the queue and hence service time of non-
thrashing processes using the device.
• Preventing thrashing: give the process all the frames it needs. – Estimate the # of such frames using the working set model
algorithm
Working Set Model Algorithm • Key idea: we must allocate enough frames to a
process in order to accommodate current locality: – Process will fault for pages in the current locality and will
not fault again until the locality changes. – Otherwise the process will thrash!
page reference table . . . 2 6 1 5 7 7 7 7 5 1 6 2 3 4 1 2 3 4 4 4 3 4 3 4 4 4 1 3 2 3 4 4 4 3 4 4 4 . . .
∆
t1 WS(t1) = {1,2,5,6,7}
∆
t2 WS(t2) = {3,4}
Working Set Page Fault Cycle
1
0 time
working set
page fault rate
Page Fault Frequency (PFF) • Key idea: define an acceptable page fault rate: – Too high: process needs more frames – Too low: processes can afford to loose a frame • Example:
Paging Locality of Reference • Locality of reference: frequently accessing the same
memory locations – Spatial locality: when a memory location is accessed,
neighboring locations are likely to be accessed – Temporal locality: when memory location X is
accessed, it will be accessed again soon
Paging Locality of Reference (Cont.) • Programming practice can affect paging performance • Use of pointers and object oriented programming
tends to minimize locality of reference • Example: assume 128 word pages. – array large_record_type data [128,128]; – Each row is stored in one page (i.e. using row major) – Each page is not resident in memory (i.e. must be faulted in)
//Program 1 for (j = 0; j <128; j++)
for (i = 0; i < 128; i++) data[i,j] = init_record;
//128 * 128 = 16,384 page faults
//Program 2 for (i = 0; i < 128; i++)
for (j = 0; j < 128; j++) data[i,j] = init_record;
//128 page faults
Memory-Mapped Files • Allows virtual memory to be associated with a disk file
– A file is initially read in using demand paging – Subsequent reads/writes to/from the file are treated as ordinary
memory accesses • Linux provides mmap() system call for mapping a file into
memory
Memory-Mapped Files Advantages • Significantly improves performance of file I/O operations • Easier than accessing file using traditional system calls
– open(), read(), and write() • Multiple processes may map the same file
– Allows pages in memory to be shared – Used for sharing the code contained in dynamically linked
libraries
Memory-Mapped Files
Sharing Memory-mapped File Pages
Memory-Mapped I/O • Enables efficient access I/O devices: – Reserves certain memory addresses for use by I/O devices – Communicating with the device is a simple as
reading/writing to the reserved memory region • Necessary for devices with fast response times – Example: video hardware
• Alternative is to use special CPU instructions to move data between system memory and I/O devices – Very slow
Kernel Memory • Different from the user memory. • Often allocated from a memory pool • May have to be physically contiguous • Approaches: • \
– Buddy system – Slab allocation
Credits Some ideas are borrowed from Dr. Kartik Gopalan of Binghamton University and Dr. Mariko Molodowitch.