CS 471 -- Lecture 11

Memory Management

How to manage memory. Historically, memory has been the key resource (second only to the CPU). An effective solution requires both hardware and software support. Key issues: speed and flexibility.

Much of the history of memory management consists of attempts to cope with the problem of too little memory. Traditionally (i.e., a long time ago) a process could not start execution until its complete binary image was available in main memory.

Alternatives:

Dynamic load: don't load routine until it is called (many never are)
Dynamic linking: single copy of shared routines. When a user process needs to execute a routine, check to see if one is already in memory. If not, bring one in. If so, link to it.
Overlays: If you know for certain some routines will never be called after a certain point, and that other routines will not be called before that same point, load the new routines into space (memory) occupied by old routines.
Swapping: Move some processes to backing store so that their memory can be reassigned to other processes. At some point bring the swapped process back into memory.

These ideas are old. I thought they had gone away forever -- except on old military/government machines. Many came back with the advent of PCs.

Memory Allocation Schemes

Many of these are no longer used, but the ideas build on one another. Where we are now is easier to understand if you see how the techniques evolved, and most of what follows shows an evolution from one idea to the next.

Single Partitions:

        +-------------+ 0
        |             |
        |   OS        |
        |             |     OS usually loaded in low memory.
        +-------------+
        |             |
        |  Static     |
        |             |
        |  User       |
        |             |
        |  Space      |
        |             |
        +-------------+
        |  Dynamic    |
        |  User       |
        |  Space      |
        +-------------+
        |/////////////|
        |/////////////|
        |/////////////|
        |/////////////|
        |/////////////|
        |/////////////|
        |/////////////|
        |/////////////|
        +-------------+ Max Mem.

   
        +-------------+ 0
        |             |
        |   Static    |
        |   OS        |     OS usually loaded in low memory.
        |   Space     |
        +-------------+
        |   Dynamic   |
        |   OS Space  |  |
        +-------------+  |  Can grow as needed
        |/////////////|  V
        |/////////////|
        |/////////////|
        |/////////////|
        +-------------+
        |  Dynamic    |
        |  User       |  ^
        |  Space      |  |  Can grow as needed
        +-------------+  |
        |  Static     |
        |  User       |
        |  Space      |
        +-------------+ Max Mem.
With the invention of a Relocation Register:
                       +----------+             +---+
                       |       xxx| Rel. Reg.   |   |
                       +----------+             |   | physical
        +---+ logical       |   physical        |   |
        |CPU|-------------> + ----------------->|   | memory
        +---+ address           address         |   |
                                                |   |
                                                |   |
                                                +---+
     or base & limit registers:

             limit+------+     +----------+              +---+
              reg |      |     |       xxx| Rel. Reg.    |   |
                  +------+     +----------+              |   | physical
        +---+ logical | yes          |   physical        |   |
        |CPU|------>  < -----------> + ----------------->|   | memory
        +---+ address |              address             |   |
                      | no                               |   |
                      V                                  |   | 
                    error                                +---+

we have more flexibility.  This simplifies system software by delaying the BINDING TIME of logical addresses to physical addresses until run time.
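As a rough illustration of the base & limit diagram above, here is a minimal Python sketch. The names relocate and AddressError and the sample register values are assumptions made for illustration; real hardware does this check on every memory reference.

```python
class AddressError(Exception):
    """Raised when a logical address falls outside the process's limit."""


def relocate(logical: int, base: int, limit: int) -> int:
    """Map a logical address to a physical one using base and limit registers."""
    # Limit check first: legal logical addresses are 0 .. limit-1.
    if not 0 <= logical < limit:
        raise AddressError(f"logical address {logical} outside limit {limit}")
    # The relocation (base) register is added to every legal address.
    return base + logical


if __name__ == "__main__":
    print(relocate(100, base=14000, limit=3000))  # 14100
```

Because the program only ever sees logical addresses, the OS can move the process simply by changing the base register.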

Multiple Partitions
 
        +-------------+ 0
        |   OS        |
        |   Space     |
        +-------------+        Suppose P1 completes.  Then what?
        |   P1        |
        |             |          Scan job queue for next process to
        |             |          initiate.
        |             |            First fit
        +-------------+            Best fit
        |   P2        |            Worst fit
        |             |
        |             |         Reassignment of memory (after process
        +-------------+         completes) usually results in EXTERNAL
        |   P3        |         fragmentation.
        +-------------+
        |/////////////|
        +-------------+ Max. mem
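The three placement policies named beside the diagram can be sketched as follows. This is only an illustration: the representation of a hole as a (start, size) pair and the function names are assumptions, and ties are broken in favor of the earliest hole.

```python
from typing import List, Optional, Tuple

Hole = Tuple[int, int]  # (start address, size) of a free region


def first_fit(holes: List[Hole], size: int) -> Optional[int]:
    """Index of the first hole big enough for the request, or None."""
    for i, (_, hole_size) in enumerate(holes):
        if hole_size >= size:
            return i
    return None


def best_fit(holes: List[Hole], size: int) -> Optional[int]:
    """Index of the smallest hole that is still big enough, or None."""
    fits = [i for i, (_, hole_size) in enumerate(holes) if hole_size >= size]
    return min(fits, key=lambda i: holes[i][1]) if fits else None


def worst_fit(holes: List[Hole], size: int) -> Optional[int]:
    """Index of the largest hole, provided it is big enough, or None."""
    fits = [i for i, (_, hole_size) in enumerate(holes) if hole_size >= size]
    return max(fits, key=lambda i: holes[i][1]) if fits else None


if __name__ == "__main__":
    holes = [(100, 500), (700, 200), (1000, 300), (1500, 600)]
    print(first_fit(holes, 212), best_fit(holes, 212), worst_fit(holes, 212))
```

First fit stops at the first usable hole, while best fit and worst fit must examine every hole (or keep the list sorted by size).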

Fragmentation problems can be addressed with garbage compaction/collection (if it's worth the overhead).  Moving a process in memory causes problems for I/O and DMA transfers still in progress to its old location.
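A minimal sketch of compaction, assuming each partition is described by a hypothetical (name, start, size) triple: every partition is slid toward low memory so that the scattered holes merge into one. The function name and sample sizes are illustrative only, and the cost of actually copying memory is ignored.

```python
from typing import List, Tuple

Partition = Tuple[str, int, int]  # (process name, start address, size)


def compact(partitions: List[Partition], os_size: int,
            mem_size: int) -> Tuple[List[Partition], Tuple[int, int]]:
    """Slide partitions down to just above the OS; return them plus the single hole."""
    next_free = os_size
    moved = []
    # Process partitions in address order so nothing is overwritten while sliding.
    for name, _, size in sorted(partitions, key=lambda p: p[1]):
        moved.append((name, next_free, size))
        next_free += size
    # Everything above the last partition is now one contiguous hole: (start, size).
    return moved, (next_free, mem_size - next_free)


if __name__ == "__main__":
    print(compact([("P2", 500, 200), ("P3", 900, 100)], os_size=100, mem_size=1200))
```

Every moved process needs its relocation register updated, which is why compaction is only practical when addresses are bound at run time.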

Paging
                                     address       +----+
                 logical   ----------------        |    |
                 address  /                V       |    |
       +---+      +---+---+          +---+---+     |    |  p:  offset into
       |CPU|------| p | d |          | f | d |---->|    |      page table
       +---+      +---+---+          +---+---+     |    |  d:  offset into
                    |                  ^           |    |      page
                    |       +---+      |           |    |
                    |     / |   |      |           |    |
                    |  p <  |   |      |           |    |
                    |     \ |   |      |           |    |
                     -----> | f | -----            +----+
                            |   |                  frames
                            |   |
                            +---+
                         page table

Advantage:  No external fragmentation.  The compiler need have no knowledge that what it treats as, say, a 32-bit address is split by the hardware into a page table offset and a page offset.  Paging does not change the code the compiler generates.

Disadvantage:  Size of page tables.  INTERNAL fragmentation
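The translation in the paging diagram, and the internal fragmentation it causes, can be sketched as below. The 4096-byte page size, the list-based page table, and the function names are assumptions chosen for illustration.

```python
from typing import List

PAGE_SIZE = 4096  # assumed page size in bytes


def page_translate(logical: int, page_table: List[int],
                   page_size: int = PAGE_SIZE) -> int:
    """Split a logical address into (p, d) and rebuild it as (f, d)."""
    p, d = divmod(logical, page_size)   # p: page number, d: offset into the page
    f = page_table[p]                   # page table maps page p to frame f
    return f * page_size + d            # the offset is carried over unchanged


def internal_fragmentation(process_size: int, page_size: int = PAGE_SIZE) -> int:
    """Bytes wasted in the last page when a process does not fill it."""
    remainder = process_size % page_size
    return 0 if remainder == 0 else page_size - remainder


if __name__ == "__main__":
    table = [5, 6, 1, 2]                           # page i lives in frame table[i]
    print(page_translate(1 * 4096 + 20, table))    # 24596 (frame 6, offset 20)
    print(internal_fragmentation(10000))           # 2288
```

Any free frame can hold any page, so no external fragmentation arises; the waste is confined to the unused tail of each process's last page.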

HW support (TLB, etc.):  To avoid two memory references per access, one for the page table entry and one for the data (remember, relative to CPU speeds, accesses to memory are slow), use a small associative memory (also called associative registers or a translation look-aside buffer, TLB).

Typically, mapped references take about 10% longer than unmapped memory references.

Too many pages to keep entire page table in associative memory.  Only keep those actively referenced.
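A small sketch of the idea that only actively referenced entries are kept. The notes do not say which entry is discarded when the associative memory is full; least-recently-used replacement is assumed here purely for illustration, and the class and attribute names are hypothetical.

```python
from collections import OrderedDict
from typing import Dict


class TLB:
    """Tiny software model of a translation look-aside buffer (LRU assumed)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: "OrderedDict[int, int]" = OrderedDict()  # page -> frame
        self.hits = 0
        self.misses = 0

    def lookup(self, page: int, page_table: Dict[int, int]) -> int:
        if page in self.entries:
            self.hits += 1
            self.entries.move_to_end(page)      # mark as most recently used
            return self.entries[page]
        self.misses += 1
        frame = page_table[page]                # the extra memory reference
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)    # drop the least recently used entry
        self.entries[page] = frame
        return frame


if __name__ == "__main__":
    tlb = TLB(capacity=2)
    table = {0: 7, 1: 3, 2: 9}
    for page in [0, 1, 0, 2, 1]:
        tlb.lookup(page, table)
    print(tlb.hits, tlb.misses)  # 1 4
```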

Example:

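Suppose a memory access takes 100 nanoseconds and it takes 10 nanoseconds to search the associative memory.  A hit then costs 10 + 100 = 110 nanoseconds, and a miss costs 10 + 100 + 100 = 210 nanoseconds (search, page table reference, then the actual reference).  If the hit rate is 0.9, then
         effective access time = 0.9 * 110 + 0.1 * 210
                               = 120 nanoseconds

(If the hit rate were 0.98, the result would be 112 nanoseconds, not very different from the 110 nanoseconds of a guaranteed hit.)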
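The same arithmetic as a short Python sketch, so other hit rates can be tried. The function name and default parameter values are assumptions taken from the example's figures.

```python
def effective_access_time(hit_rate: float, mem_ns: float = 100.0,
                          search_ns: float = 10.0) -> float:
    """Average cost of one memory reference, in nanoseconds."""
    hit_cost = search_ns + mem_ns          # search, then the actual reference
    miss_cost = search_ns + 2 * mem_ns     # search, page table, then the reference
    return hit_rate * hit_cost + (1 - hit_rate) * miss_cost


if __name__ == "__main__":
    print(round(effective_access_time(0.9), 6))   # 120.0
    print(round(effective_access_time(0.98), 6))  # 112.0
```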

The old Motorola 68030 had a 22-entry Translation Look-aside Buffer.

The old Intel 80486 had a 32-entry TLB and claimed a hit rate of 0.98.

The more modern schemes use cache memory with as many as 4096 TLB entries.

This scheme supports some memory protection and shared memory.


Primary use of shared memory is to keep only one copy of executable memory resident; several different processes can all run the same copy.  Each has its own program counter.  Also requires a compiler which generates reentrant code: each process has its own memory location for all local variables.

Segmentation

Programs consist of meaningful, related units of memory; pages (as above) are entirely artificial.  For example: ftn1, ftn2, memblock1, memblock2, ..., main.  (Memblocks may be arrays or the set of local variables associated with a particular procedure.)

Each address can be of form ( segid, offset ) = ( s, d )
                             / +-----+-----+
                            |  |     |     |
                -------> s <   |     |     |
               |            |  |     |     |
     +---+     |             \ |limit| base|   +----+
     |CPU|--> (s,d)            |     |     |   |    |
     +---+       |             |     |     |   |    |
                 |             +-----+-----+   |    |
                 |               /      \      |    |
                 |              / yes    \     |    |
                  -----------> <  ------> + -->|    |
                               |               |    |
                               |no             |    |
                               v               |    |
                             trap              +----+

This can easily support memory sharing.

Advantage:  If routine never called, never loaded.

Problem:  Fragmentation (since segments are of different sizes)
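The segment table lookup in the diagram above can be sketched as follows. The table contents and the names segment_translate and SegmentationTrap are assumptions for illustration.

```python
from typing import List, Tuple


class SegmentationTrap(Exception):
    """Stands in for the hardware trap taken on an illegal (s, d) address."""


def segment_translate(s: int, d: int,
                      segment_table: List[Tuple[int, int]]) -> int:
    """Translate (segid, offset) using a table of (limit, base) entries."""
    if not 0 <= s < len(segment_table):
        raise SegmentationTrap(f"no such segment {s}")
    limit, base = segment_table[s]
    # The offset must lie inside the segment: 0 .. limit-1.
    if not 0 <= d < limit:
        raise SegmentationTrap(f"offset {d} outside segment {s} (limit {limit})")
    return base + d


if __name__ == "__main__":
    table = [(1000, 1400), (400, 6300), (400, 4300)]  # (limit, base) per segment
    print(segment_translate(2, 53, table))  # 4353
```

Sharing falls out naturally: two processes share a segment when their tables hold the same base for it.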

Paged Segmentation

Can combine segmentation and paging by taking the d of above and changing it to (p,d) as in paging.
     +---+   +---+---+---+
     |CPU|-->| s | p | d |
     +---+   +---+---+---+
               |   |
               |    ------------------------+----
               |                            V    |                    +---+
               |             ------------> >=    |                    |   |
               |            |               |    |                    |   |
               |        +------+----+       V    |                    |   |
               |       /|      |    |     trap   |                    |   |
               V s+r  | |      |    |            |    / +---+         |   |
               + ---- < |      |    |            |    | |   |         |   |
               ^      | |      |    |            |    | |   |         |   |
               |       \|length|ptbr|-----   t   V p+t< |   |         |   |
               | r      |      |    |     -----> +    | |   |  +-+-+  |   |
               |        |      |    |                 \ | f |- |f|d|->|   |
             STBR       +------+----+                   |   |  +-+-+  |   |
                        Segment table                   +---+         |   |
                                                                      |   |
                                                                      |   |
                                                                      +---+
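A rough sketch of the combined scheme in the diagram above. It assumes each segment table entry holds the segment length in bytes plus that segment's own page table, and that the length check compares the offset within the segment (p * page size + d) against the length. The 1024-byte page size and all names are illustrative assumptions.

```python
from typing import List, Tuple

PAGE_SIZE = 1024  # assumed page size in bytes


class SegmentTrap(Exception):
    """Stands in for the trap taken when the address lies outside the segment."""


def paged_segment_translate(s: int, p: int, d: int,
                            segment_table: List[Tuple[int, List[int]]],
                            page_size: int = PAGE_SIZE) -> int:
    """Translate (s, p, d): segment table -> page table -> frame."""
    if not 0 <= s < len(segment_table):
        raise SegmentTrap(f"no such segment {s}")
    length, page_table = segment_table[s]     # entry found at STBR + s
    offset = p * page_size + d                # offset within the segment
    if p < 0 or not 0 <= d < page_size or offset >= length:
        raise SegmentTrap(f"offset {offset} outside segment {s}")
    f = page_table[p]                         # entry found at ptbr + p
    return f * page_size + d


if __name__ == "__main__":
    table = [(2500, [8, 3, 5])]               # one segment, three pages
    print(paged_segment_translate(0, 1, 100, table))  # 3172
```

Since each segment is now a whole number of pages, external fragmentation disappears, at the cost of one page table per segment.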


(Finally) Virtual Memory

Motivations:

More, more, more!
Much memory is never referenced on a single execution of a program
Locality:  programs have strong tendency to repeatedly reference a collection of "close" memory locations, at least over the short run.


Virtual memory gives programs the appearance of access to a much larger physical memory than really exists.  Performance depends on:

locality
fast access to backing store
special HW support


Pages
        +---+             +---+-+           +---+
      0 | A |           0 | 4 |x|         0 |   |
      1 | B |           1 |   | |         1 |   |
      2 | C |           2 | 6 |x|         2 |   |           -----------
      3 | D |           3 |   | |         3 |   |         /             \
      4 | E |           4 |   | |         4 | A |       |\              /|
      5 | . |           5 | 9 |x|         5 |   |       |  ------------  |
      6 | . |           6 |   | |         6 | C |       |  A      E      |
                                                        |                |
      .   .             .   .             .   .         |    BC          |
                                                        |                |
      n | . |           n|    | |         n |   |        \         D    /
        +---+            +----+-+           +---+          ------------
     Logical view      page    ^            main           backing store
     of memory         table   |            memory            (disk)
     (program view)           in mem        \                           /
                              flag            ------------v-------------
                                                  Physical Memory

How do memory references work?

Handle address resolution just like paging, but
If the required page is not in main memory, treat the reference as an I/O interrupt (this event is called a page fault):

select a frame in main memory (from the free frame list)
schedule an I/O transfer of the page from backing store to the selected frame
place the PCB in the blocked queue
on I/O completion, move the PCB to the ready queue

(What if there are no free frames?  Then select a victim, move the victim's page to backing store to free its frame, then fetch the process's page from disk.  More later.)
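The page fault path described above can be sketched as a small simulation. The notes defer victim selection until later, so first-in-first-out is assumed here only to make the sketch complete; disk transfers and queue moves are reduced to comments, and all names are hypothetical.

```python
from collections import deque
from typing import Deque, Dict


class DemandPager:
    """Toy model of demand paging with a free frame list (FIFO victim assumed)."""

    def __init__(self, num_frames: int) -> None:
        self.free_frames: Deque[int] = deque(range(num_frames))
        self.page_table: Dict[int, int] = {}   # resident pages only: page -> frame
        self.load_order: Deque[int] = deque()  # resident pages, oldest first
        self.faults = 0

    def access(self, page: int) -> int:
        if page in self.page_table:            # "in memory" flag is set
            return self.page_table[page]
        self.faults += 1                       # page fault
        if self.free_frames:
            frame = self.free_frames.popleft()
        else:
            victim = self.load_order.popleft()
            # A real OS would write the victim page to backing store here.
            frame = self.page_table.pop(victim)
        # A real OS would schedule the disk read and block the process here,
        # moving its PCB back to the ready queue when the I/O completes.
        self.page_table[page] = frame
        self.load_order.append(page)
        return frame


if __name__ == "__main__":
    pager = DemandPager(num_frames=3)
    for page in [7, 0, 1, 2, 0, 3, 0, 4]:
        pager.access(page)
    print(pager.faults)  # 7
```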


As always, this isn't necessarily easy.  Depends in part on architecture of machine. 

Overhead of Demand Paging

When a page fault occurs:

1. Trap to the OS.
2. Save registers and process state.
3. Determine the interrupt cause (assume a page fault here).
4. Check that the address is legal.
5. Determine the location of the page on disk.
6. Select a "victim" if there are no free frames.
7. Determine the location of the victim on disk.
8. Issue a write request for the victim:
   a. wait on the queue for the device
   b. wait for seek and latency time
   c. transfer

   (This assumes a local disk.  If the disk is remote, the OS must build a TCP/IP packet and wait for its turn to get on the Ethernet; the packet is decoded at the server (when the server gets around to it; it may be busy with something else), which then does a, b, c and sends a message back when complete.)

9. Allocate the CPU to another process (do a context switch).
10. Interrupt from the disk controller (or Ethernet controller).
11. Issue a read request for the missing page (a, b, c just like step 8; if the disk is remote, this again goes through TCP/IP, Ethernet, and a server).
12. Allocate the CPU to another process (another context switch).
13. Interrupt from the disk or Ethernet controller.
14. Update the page table.
15. Reschedule the process (place it on the Ready Queue).
16. Restore registers, etc.







Copyright ©2021 G. Hill Price

Send comments to G. Hill Price