Computer Science Questions: Process Synchronization-Scheduling/Threads

profileMalikbinn
VirtualMemory.pdf

Virtual Memory

Virtual Memory

• Decouples logical (process) memory from physical memory: – Enables execution of processes that are not completely in

memory

– Allows programs to be larger than physical memory • Processes see their own large contiguous virtual address space • On a 32-bit system each process has 4GB of virtual address space, even if

physical RAM is much smaller

– Frees the programmer from concerns of memory-storage limitations

– Memory (pages) are sharable with other processes – Makes process creation more efficient – Very challenging to implement efficiently

• Implemented using demand paging

Example: Virtual Memory > Physical Memory

Processes Sharing Libraries

Demand Paging • Problem: When should pages stored in secondary

storage be loaded into memory?

Demand Paging • Solution: Demand paging loads pages into physical memory

as they are needed • When a page is referenced: – If not in memory, then load the page into memory from the

backing store (e.g. disk) • Benefits: – Reduces physical memory consumption – Faster response times – Can support more users

• Uses the concept of a lazy pager – Never swaps a page into memory unless page will be needed

Lazy Paging vs. Lazy Swapper • Note: a lazy swapper is different from a lazy pager: – Lazy pager swaps individual pages of a process – Lazy swapper swaps all pages of a processes

Valid-Invalid Bit • Recall that each page table entry has a valid-invalid bit • Address translation: – Valid-invalid bit is v: the frame is present in main memory. – Valid-invalid bit is i: the frame is not in main memory ⇒

load it from the disk. This is called page fault.

4 v0 i1

6 v2 i3 i4

9 v5 i6 i7

Page table

frame valid-invalid bit

Virtual pages 0, 2, and 5, resolve to frames 4, 6, and 9 respectively, all of which are resident in main memory (i.e. their valid- invalid bit is v)

All other virtual pages will trigger a page fault (because their valid-invalid bit is i).

Example: Page Table with Pages Stored to Disk

Demand Paging: Page Faults If referenced page is not in memory, then generate a page fault 1. OS validates the reference: – If invalid (i.e. page was never mapped to a physical frame),

then abort. – If valid, but not in memory, then continue to step 2.

2. Allocate a frame 3. Load the page from the disk into the frame 4. Update the page table to indicate that the page is now in

memory 5. Set valid-invalid bit of the page to v. 6. Restart the instruction that caused the page fault.

Example: Page Faults

Optimizing Process Creation • Virtual memory can be used to optimize

process creation: – Copy-on-write – Memory mapped files

Copy-on-Write (COW) • Enables parent and child processes to initially share the same

pages • If either process modifies a page: – Target page is copied – Copy assigned to the process that attempted the modification.

• Benefit: faster process creation since only modified pages are copied

• Example: fork() system call does not automatically copy the entire memory space of the parent; only copies pages as they change.

COWS Example: Page C Before Modification by Process1

COWS Example: Page C After Modification by Process1

Page Replacement • Problem: Main memory is a finite resource. What

happens when all frames are in use?

Page Replacement • Solution: If no free frames are available, evict another

in-memory page by saving it to disk • Challenge: which page should we evict?

Page Replacement Algorithms • Page replacement algorithm is trigged during page fault

to decides what pages should be evicted from memory to make room for the incoming page

• Goal: minimize the number of page faults – Avoid evicting an often used page that will need to be brought

back into memory soon

Page Replacement Algorithms • Optimal Page Replacement: – Replace page needed at the farthest point in future – Optimal but unrealizable (Requires future knowledge!) – Example: 3 frames; all pages are originally not in memory

(i.e. need to be brought in from the disk).

9 page faults

Page Replacement Algorithms: First In First Out

• Linked list of pages (oldest at head) • Remove pages from the head • Example: system with 3 frames; all pages are originally not in

memory (i.e. need to be brought in from the disk).

– 15 page faults

• Simple to implement, but suffers from Belady’s anomaly: – When increasing the number of frames allocated to the process

actually increases the number of page faults

More Page Replacement Algorithms • Least Recently Used (LRU):

– Associate time stamp with each page access – Replace page that was not recently used

• Second Chance: – Like FIFO but includes a reference bit that is set on page access – If referenced, unset bit, move to the head, and continue search – Not efficient

• Clock: – Like Second Chance, but uses a circular list – Don’t move pages just advance the hand

• Least Frequently Used (LFU): – Replace the page that was least recently used

• Most Frequently Used (NFU): – Replace the page that is not used frequently

• Aging, Working Set, WSClock…many more!

Local vs Global Replacement • Local page replacement: on page fault, evict a

page of the faulting process. • Global replacement: when choosing a page to

evict, consider all pages in memory.

Local vs Global Replacement • Example: consider an LRU algorithm and process A, B, and C (the lesser

the age the least recently the corresponding page was used). Process A causes a page fault.

Original configuration

Local page replacement

Global page replacement

Cleaning Policy • Problem: When in memory pages are modified,

additional work is needed to swap them back out to disk. How can this be handled efficiently?

Cleaning Policy • Unchanged pages in memory are easily evicted

– Assumes copy of page already exists on disk – Simply load up new copy from disk as needed

• Dirty pages need to be updated on disk before being overwritten – A dirty page is a page that has changed since it was brought into

memory • Cleaning Policy determines when dirty pages are updated on

disk so they can be evicted from memory

Demand Cleaning • Dirty page is saved to the disk only when it’s frame has

been selected for eviction • However, a process that triggers a page fault may have to

wait for 2 page transfers: – 1st to write out a dirty page – 2nd to load the new page

Pre-cleaning • Dirty pages saved before their frames are selected for eviction

– OS periodically saves out dirty pages in batches – More efficient than saving each page individually – Inefficient when many saved pages modified again before they are

selected for eviction

Thrashing • Process is stalled because pages constantly swapping in & out • Occurs when the process does not have enough frames • Consequences: – Low CPU utilization – Since processes spend the majority of their time paging, no

useful work is getting done – May trick the OS into trying to improve the CPU utilization by

adding more processes to the system - making the problem worse!

Thrashing • Degree of multiprogramming vs CPU utilization: – After multiprogramming is increased beyond a certain point,

thrashing sets in.

Controlling Thrashing • We can control thrashing by using local replacement

algorithm: – If one process starts thrashing, it cannot steal frames from

another process thus causing it to thrash. – Limitation: thrashing process will be in the paging device queue

most of the time: • Increases the size of the queue and hence service time of non-

thrashing processes using the device.

• Preventing thrashing: give the process all the frames it needs. – Estimate the # of such frames using the working set model

algorithm

Working Set Model Algorithm • Key idea: we must allocate enough frames to a

process in order to accommodate current locality: – Process will fault for pages in the current locality and will

not fault again until the locality changes. – Otherwise the process will thrash!

page reference table . . . 2 6 1 5 7 7 7 7 5 1 6 2 3 4 1 2 3 4 4 4 3 4 3 4 4 4 1 3 2 3 4 4 4 3 4 4 4 . . .

t1 WS(t1) = {1,2,5,6,7}

t2 WS(t2) = {3,4}

Working Set Page Fault Cycle

1

0 time

working set

page fault rate

Page Fault Frequency (PFF) • Key idea: define an acceptable page fault rate: – Too high: process needs more frames – Too low: processes can afford to loose a frame • Example:

Paging Locality of Reference • Locality of reference: frequently accessing the same

memory locations – Spatial locality: when a memory location is accessed,

neighboring locations are likely to be accessed – Temporal locality: when memory location X is

accessed, it will be accessed again soon

Paging Locality of Reference (Cont.) • Programming practice can affect paging performance • Use of pointers and object oriented programming

tends to minimize locality of reference • Example: assume 128 word pages. – array large_record_type data [128,128]; – Each row is stored in one page (i.e. using row major) – Each page is not resident in memory (i.e. must be faulted in)

//Program 1 for (j = 0; j <128; j++)

for (i = 0; i < 128; i++) data[i,j] = init_record;

//128 * 128 = 16,384 page faults

//Program 2 for (i = 0; i < 128; i++)

for (j = 0; j < 128; j++) data[i,j] = init_record;

//128 page faults

Memory-Mapped Files • Allows virtual memory to be associated with a disk file

– A file is initially read in using demand paging – Subsequent reads/writes to/from the file are treated as ordinary

memory accesses • Linux provides mmap() system call for mapping a file into

memory

Memory-Mapped Files Advantages • Significantly improves performance of file I/O operations • Easier than accessing file using traditional system calls

– open(), read(), and write() • Multiple processes may map the same file

– Allows pages in memory to be shared – Used for sharing the code contained in dynamically linked

libraries

Memory-Mapped Files

Sharing Memory-mapped File Pages

Memory-Mapped I/O • Enables efficient access I/O devices: – Reserves certain memory addresses for use by I/O devices – Communicating with the device is a simple as

reading/writing to the reserved memory region • Necessary for devices with fast response times – Example: video hardware

• Alternative is to use special CPU instructions to move data between system memory and I/O devices – Very slow

Kernel Memory • Different from the user memory. • Often allocated from a memory pool • May have to be physically contiguous • Approaches: • \

– Buddy system – Slab allocation

Credits Some ideas are borrowed from Dr. Kartik Gopalan of Binghamton University and Dr. Mariko Molodowitch.