Page Replacement

If "free frames" use the first (where do they come from?) If no free frames, must select victim, then replace it. Not if we select a victim which has not been changed, then do not need to write it out.

Hmmmm. What about changing the hardware to add 1 bit to each frame that is originally 0 (when the page is first loaded) and is set (by hardware) on any write to that frame?
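
A minimal C sketch of that idea (the struct layout and the write_page_to_disk helper are hypothetical names, not from any particular system):

    /* A minimal sketch, assuming a per-frame dirty bit maintained by
       the hardware.  The struct fields and write_page_to_disk() are
       hypothetical, for illustration only. */
    void write_page_to_disk(int page);   /* hypothetical helper */

    struct frame {
        int      page;    /* virtual page currently held in this frame  */
        unsigned dirty;   /* 0 when the page is first loaded; set by the
                             hardware on any write to the frame         */
    };

    void evict(struct frame *victim)
    {
        if (victim->dirty)
            write_page_to_disk(victim->page);
        /* otherwise the copy on disk is still current: no write needed */
    }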

How to select victim?

FIFO (i.e., replace oldest page)
LRU (i.e., replace least recently used page)
Optimal (i.e., replace page that will not be used for longest period of time)
How to implement these is covered later.

To analyze memory reference behavior (for design and understanding), take a trace of a "typical" executing program to get its address references. Translate these to page numbers. Generate the sequence of page-number changes (since two consecutive references to the same page will not generate a page fault).
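
For example, a small C sketch of turning a raw address trace into the page-change sequence (the 4 KB page size and the sample addresses are just assumptions):

    /* Sketch: collapse an address trace into a sequence of page-number
       changes.  PAGE_SHIFT of 12 (4 KB pages) is only an assumption. */
    #include <stdio.h>

    #define PAGE_SHIFT 12

    int main(void)
    {
        unsigned long trace[] = { 0x1000, 0x1004, 0x2FF8, 0x2FFC, 0x1010 };
        int n = sizeof trace / sizeof trace[0];
        long last_page = -1;

        for (int i = 0; i < n; i++) {
            long page = trace[i] >> PAGE_SHIFT;
            if (page != last_page) {      /* only page changes matter */
                printf("%ld ", page);
                last_page = page;
            }
        }
        printf("\n");                     /* prints: 1 2 1 */
        return 0;
    }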


      Consider the sequence:
        1 2 3 4 1 2 5 1 2 3 4 5

      For FIFO
        if 1 frame then 12 faults.

      2 frames
        * * * * * * * * * * * *         still 12 faults
        1 1 2 3 4 1 2 5 1 2 3 4
          2 3 4 1 2 5 1 2 3 4 5

      3 frames
        * * * * * * *     * *           9 faults
        1 1 1 2 3 4 1 1 1 2 5 5
          2 2 3 4 1 2 2 2 5 3 3
            3 4 1 2 5 5 5 3 4 4

      4 frames
        * * * *     * * * * * *         10 faults
        1 1 1 1 1 1 2 3 4 5 1 2
          2 2 2 2 2 3 4 5 1 2 3
            3 3 3 3 4 5 1 2 3 4
              4 4 4 5 1 2 3 4 5
(Note that 4 frames give more faults than 3 frames here: this is Belady's anomaly. It would be good to redo the above example for LRU.)
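
A small C sketch of a FIFO fault counter that reproduces the counts above when run with 1 through 4 frames:

    /* Sketch: count FIFO page faults for the reference string above. */
    #include <stdio.h>

    int fifo_faults(const int *ref, int n, int nframes)
    {
        int frames[16];                  /* assumes nframes <= 16 */
        int used = 0, next = 0, faults = 0;

        for (int i = 0; i < n; i++) {
            int hit = 0;
            for (int f = 0; f < used; f++)
                if (frames[f] == ref[i]) { hit = 1; break; }
            if (!hit) {
                faults++;
                if (used < nframes) {
                    frames[used++] = ref[i];
                } else {                 /* replace the oldest page */
                    frames[next] = ref[i];
                    next = (next + 1) % nframes;
                }
            }
        }
        return faults;
    }

    int main(void)
    {
        int ref[] = { 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 };
        int n = sizeof ref / sizeof ref[0];
        for (int k = 1; k <= 4; k++)
            printf("%d frames: %d faults\n", k, fifo_faults(ref, n, k));
        return 0;                        /* prints 12, 12, 9, and 10 faults */
    }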

How to implement:

Approximations for LRU -- A reference bit per page, set by HW with each reference. Periodically (say every 100 millisecs), shift each page's reference bit onto, say, an 8-bit string kept per page, then clear the reference bit. Thus:

              11111111   means this page ref at least once each of last 8 periods.
              10000000   means?
              00101011   means?
If these bit strings are treated as integers, then smallest are (approx.) LRU.
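
A C sketch of that periodic shift (often called the "aging" approximation; NPAGES and the arrays are illustrative):

    /* Sketch of the aging approximation: every period, shift each
       page's reference bit into the top of an 8-bit history byte,
       then clear the reference bit.  NPAGES is illustrative. */
    #define NPAGES 1024

    unsigned char ref_bit[NPAGES];   /* set to 1 by hardware on reference */
    unsigned char history[NPAGES];   /* 8 periods of history per page     */

    void age_pages(void)             /* called, say, every 100 ms */
    {
        for (int p = 0; p < NPAGES; p++) {
            history[p] = (unsigned char)((history[p] >> 1) | (ref_bit[p] << 7));
            ref_bit[p] = 0;
        }
    }

    /* Treating history[] as an integer, the page with the smallest
       value is (approximately) the least recently used. */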

Second Chance Algorithm -- Memory as circular list.

Search (circularly) for the first frame whose reference bit is not set, clearing reference bits as the search is made. Worst case: the search is a complete cycle and chooses the original frame.
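
A C sketch of the circular search (NFRAMES and the ref_bit[] array are assumptions; ref_bit[f] is the hardware-set reference bit for frame f):

    /* Sketch of the second-chance (clock) search. */
    #define NFRAMES 256

    unsigned char ref_bit[NFRAMES];
    static int hand = 0;                 /* next frame to examine */

    int choose_victim(void)
    {
        for (;;) {
            if (ref_bit[hand] == 0) {    /* no second chance left: victim */
                int victim = hand;
                hand = (hand + 1) % NFRAMES;
                return victim;
            }
            ref_bit[hand] = 0;           /* clear: give a second chance */
            hand = (hand + 1) % NFRAMES;
        }
    }

If every reference bit is set, the hand clears them all the way around and ends up taking the frame it started at, which is the worst case noted above.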

Bits: if we have both a reference and a dirty bit (r,d), there is a natural replacement priority (best victim first):

(0,0) not used, clean
(0,1) not used, dirty
(1,0) used, clean
(1,1) used, dirty

Other approaches:

Keep a pool of free frames; on a fault, allocate immediately from the pool, then move the victim to the pool and copy the victim out later. Thus we seldom have to wait for both a read and a write.

Also keep the page IDs of the frames in the pool; if the needed page is still sitting in a pooled frame, it may not need to be reread.
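
A C sketch of such a pool, tagged with the page each frame last held (the structure, sizes, and the background write-out are simplified assumptions):

    /* Sketch of a free-frame pool whose entries remember the page they
       last held.  Names and sizes are illustrative; write-back of dirty
       victims is assumed to happen in the background. */
    #define POOL_SIZE 32

    struct pooled { int frame_no; int page; } pool[POOL_SIZE];
    int pool_count = 0;

    /* On a fault for 'page': take a frame from the pool immediately.
       If some pooled frame still holds this very page, reuse it and
       skip the disk read. */
    int take_frame(int page, int *need_read)
    {
        for (int i = 0; i < pool_count; i++)
            if (pool[i].page == page) {
                int f = pool[i].frame_no;
                pool[i] = pool[--pool_count];   /* remove from pool */
                *need_read = 0;                 /* contents still valid */
                return f;
            }
        *need_read = 1;
        return pool[--pool_count].frame_no;     /* assumes pool not empty */
    }

    /* The victim chosen by the replacement algorithm goes back into the
       pool; its contents are copied out later, so the faulting process
       rarely waits for both a read and a write. */
    void return_victim(int frame_no, int page)
    {
        pool[pool_count].frame_no = frame_no;
        pool[pool_count].page = page;
        pool_count++;
    }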

Minimum number of frames -- If one instruction can reference three pages (for example), what would happen if the process is allocated only 2 frames? (On OS/360, MVC, a 6-byte instruction, can require 6 pages: the instruction can straddle a page boundary, and each of its two operand addresses may also.)

How to Allocate?

  1. Demand paging: never bring page in until referenced.
  2. Global or local allocation:
Which is better? Fairer?
CPU utilization better with global. Process speed more repeatable with local. With local, what should be the size of frame allocation? (#avail frames/#processes?)

Working Sets -- (Not a precise definition) The working set of a process is the minimal set of pages needed to keep its page faults low. (Also defined as the set of different pages referenced in the last delta-T time.)

If WSS(i) is the size of the working set for process i, then the total demand is D = WSS(1) + WSS(2) + ... + WSS(n), where n is the number of active processes.
If D << real memory, start a new process.
If D >> real memory, thrashing will occur. Swap a process out, to release its frames.
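
A small C sketch of that check (wss[] would come from some working-set estimate; all names are illustrative):

    /* Sketch: total demand D = WSS(1) + WSS(2) + ... + WSS(n).
       wss[i] is some estimate of process i's working-set size. */
    int total_demand(const int *wss, int nprocs)
    {
        int D = 0;
        for (int i = 0; i < nprocs; i++)
            D += wss[i];
        return D;
    }

    /* Usage (illustrative): if D is well below the number of real
       frames, there is room to start another process; if D exceeds it,
       swap a process out to release its frames and avoid thrashing. */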

Prepaging: if the WS is known, bring it all into main memory. (How could the OS determine the WS?)

Misc Virtual Memory topics:

Array reference order

Consider:

             VAR a[ 1..10000, 1..10000 ] OF char;
             FOR i = 1 TO 10000 LOOP
               FOR j = 1 TO 10000 LOOP
                 a[i,j] = ' ';
                   vs
                 a[j,i] = ' ';
What's the difference? (Row major vs. column major order.)
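
The same contrast in C, which stores arrays in row-major order (N is reduced here from the example above so the static array stays modest):

    /* Sketch: a[i][j] in the inner loop walks memory sequentially,
       touching each page many times before moving on; a[j][i] jumps
       N bytes per reference, so with few frames it can fault on
       nearly every access. */
    #define N 4096

    static char a[N][N];

    void row_major_fill(void)        /* good locality */
    {
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                a[i][j] = ' ';
    }

    void column_major_fill(void)     /* poor locality */
    {
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                a[j][i] = ' ';
    }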

Best page size

Small page
  larger page tables (wasted memory)
  more page faults (why?)
  less internal fragmentation

Large page
  smaller page tables
  fewer page faults (why?)
  more internal fragmentation
  less page fault overhead (due to long latency & fast transfers)

Trend: bigger pages.

Real Systems

All UNIX systems (that I know about) use virtual memory.

Locked Out Pages

Inverted page tables

Note that page tables are big if many processes are active. If the size of virtual memory >> size of physical memory, we could save a lot of space by having only one page table for the entire system (with one entry per frame of real memory). Then a memory reference works as follows:

                                                      +---+
        +---+ log.                                    |   |
        |CPU| ----> (pid, p, d ) --------> (i, d)---->|   |
        +---+ addr    |                     ^         |   |
                      |   +------+ \        |         |   |
                      |   |      |  |       |         |   |
                       -->|      |  > i ----          |   |
            (search       |      |  |                 |   |
             page tbl     |pid,p | /                  |   |
             for          |      |                    |   |
             <pid,p>)     |      |                    |   |
                          +------+                    |   |
                         page table                   |   |
                                                      +---+
If the page is not in memory, we must then search the process's real page table to find out where (on disk) it is. To speed things up, the (pid, p) lookup is done in associative registers (no real search of the page table).
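
A C sketch of the lookup against the single system-wide table (a linear scan here for clarity; as noted above, a real system uses associative registers, or some hashed organization, rather than a real search; all names and sizes are illustrative):

    /* Sketch: one table for the whole system, one entry per physical
       frame; the frame number is the index of the matching entry. */
    #define NFRAMES 4096

    struct ipt_entry { int pid; int page; int valid; } ipt[NFRAMES];

    /* Returns the frame i holding (pid, p), or -1 for a page fault. */
    int ipt_lookup(int pid, int p)
    {
        for (int i = 0; i < NFRAMES; i++)
            if (ipt[i].valid && ipt[i].pid == pid && ipt[i].page == p)
                return i;
        return -1;   /* not in memory: consult the process's own page table */
    }

    /* The physical address is then i * page_size + d, for displacement d. */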


