Main Memory
Chapter Objectives
- To provide a detailed description of various ways of organising memory hardware
- To discuss various memory-management techniques, including paging and segmentation
- To provide a detailed description of the Intel Pentium, which supports both pure segmentation and segmentation with paging
Background
Memory consists of a large array of words or bytes, each with its own address. The CPU fetches instructions from memory according to the value of the program counter. These instructions may cause additional loading from and storing to specific memory addresses.
Basic Hardware
Main memory and the registers are the only storage that the CPU can access directly. This reality has two implications:
- Any instructions in execution, and any data being used by the instructions, must be in one of these direct-access storage devices.
- If the data are not in memory, they must be moved there before the CPU can operate on them.
Differences between registers and memory:
- Registers are built into the CPU and can generally be accessed within one cycle of the CPU clock. Most CPUs can decode instructions and perform simple operations on register contents at the rate of one or more operations per clock tick.
- Memory is accessed via a transaction on the memory bus. Completing a memory access may take many cycles of the CPU clock.
Since a memory access is slow, the processor may need to stall while the required data are fetched. This delay is intolerable; the usual remedy is to add a fast memory buffer between the CPU and main memory, called a cache.
Besides access speed, the hardware must also protect memory against erroneous accesses. Here is one possible mechanism:
- Each process has a separate memory space:
- base register: the smallest legal physical memory address
- limit register: the size of the range
- CPU compares every address generated in the user mode with the registers. Any attempt by a program executing in user mode to access OS memory or other users' memory results in a trap to the OS, which treats the attempt as a fatal error.
The base and limit registers can be loaded only by the OS, in kernel mode, via privileged instructions.
The OS, executing in kernel mode, has unrestricted access to both OS memory and users' memory. This allows it to dump out programs in case of errors, to access and modify parameters of system calls, and so on.
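As a minimal sketch, the legality check the hardware applies to every user-mode address can be written as follows; the function name and the sample values are illustrative, not a real hardware interface:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* base  = smallest legal physical address for this process
 * limit = size of the legal range                           */
static bool address_is_legal(uint64_t addr, uint64_t base, uint64_t limit)
{
    /* Legal iff base <= addr < base + limit; anything else traps to the OS. */
    return addr >= base && addr < base + limit;
}

int main(void)
{
    uint64_t base = 300040, limit = 120900;                 /* illustrative values   */
    printf("%d\n", address_is_legal(300040, base, limit));  /* 1: first legal byte   */
    printf("%d\n", address_is_legal(420940, base, limit));  /* 0: one past the end   */
    return 0;
}
```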
Address Binding
Processes on the disk that are waiting to be brought into memory for execution form the input queue.
The binding of instructions and data to memory addresses can be done at any of the following stages:
- Compile time: if you know at compile time where the process will reside in memory, then absolute code can be generated.
- Load time: if it is not known at compile time where the process will reside in memory, then the compiler must generate relocatable code. In this case, final binding is delayed until load time.
- Execution time: if the process can be moved during its execution from one memory segment to another, then binding must be delayed until run time. This requires special hardware. Most general-purpose operating systems use this method.
Logical versus Physical Address Space
- Logical address: addresses generated by the CPU
- Physical address: addresses seen by the memory unit - that is, the one loaded into the memory-address register of the memory
The compile-time and load-time address-binding methods generate identical logical and physical addresses.
The execution-time address-binding scheme results in differing logical and physical addresses. In this case, we usually refer to the logical address as a virtual address.
- Logical (virtual) address space: the set of all logical addresses generated by a program
- Physical address space: the set of all physical addresses corresponding to the logical addresses
In the execution-time address-binding scheme, the logical and physical address spaces differ. The run-time mapping from virtual to physical addresses is done by a hardware device called the memory-management unit (MMU).
The user program never sees the real physical addresses; it deals only with logical addresses. The MMU maps these logical addresses to physical addresses.
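A minimal sketch of the simplest MMU scheme, in which the value of a single relocation (base) register is added to every address a user process generates; the names and values here are illustrative:

```c
#include <stdint.h>
#include <stdio.h>

/* Simplest MMU: every logical address the CPU emits is relocated on the fly. */
static uint64_t mmu_translate(uint64_t logical, uint64_t relocation_reg)
{
    return logical + relocation_reg;  /* physical address sent on the memory bus */
}

int main(void)
{
    uint64_t reloc = 14000;           /* illustrative relocation-register value */
    printf("%llu\n", (unsigned long long)mmu_translate(346, reloc));  /* 14346 */
    return 0;
}
```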
Dynamic Loading
Procedure:
- a routine is not loaded until it is called. All routines are kept on disk in a relocatable load format
- main program is loaded into memory and executed
- When a routine needs to call another routine, the calling routine first checks to see whether the other routine has been loaded
- If not, the relocatable linking loader is called to load the desired routine into memory and to update the program's address tables to reflect this change
- Then control is passed to the newly loaded routine
Advantages:
- an unused routine is never loaded
- thus, although the total program size may be large, the portion actually used (and hence loaded) may be much smaller
Dynamic loading is left to the programmer's own implementation and does not require special OS support, although the OS may help by providing library routines that implement dynamic loading.
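On POSIX systems, one concrete realization of this idea is the dlopen/dlsym interface, which maps a library and resolves a routine only when the program asks for it. A small sketch (the library name is Linux-specific; compile with `-ldl`):

```c
#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    /* Nothing is loaded until this call; the routine arrives on demand. */
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (!handle) { fprintf(stderr, "%s\n", dlerror()); return 1; }

    /* Look up the routine's address in the newly loaded library. */
    double (*cosine)(double) = (double (*)(double))dlsym(handle, "cos");
    if (cosine)
        printf("cos(0) = %f\n", cosine(0.0));

    dlclose(handle);
    return 0;
}
```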
Dynamic Linking and Shared Libraries
- Static linking: system language libraries are treated like any other object module and are combined by the loader into the binary program image
- Dynamic linking: similar to dynamic loading, but here linking, rather than loading, is postponed until execution time; a library routine is linked in only when it is needed
Usage:
- Dynamic linking allows all processes that use a language library to execute only one copy of the library code.
- A library may be replaced by a new version, and all programs that reference the library will automatically use the new version.
- To guard against accidentally running an incompatible version of a library, both the program and the library carry a version number, and a program loads only a library with a matching version number. Consequently, multiple versions of a library may be loaded into memory.
This system is also known as shared libraries.
Dynamic linking usually requires help from the OS.
Swapping
A process must be in memory to be executed. A process, however, can be swapped temporarily out of memory to a backing store and then brought back into memory for continued execution.
Swapping in RR scheduling:
- when a quantum expires, the memory manager starts to swap out the process that has just finished its quantum
- swap in another process into the memory space that has been freed
- CPU scheduler will allocate a time slice to some other process in memory
- when the CPU scheduler wants to reschedule the CPU, there should be some processes in memory, ready to execute; it's guaranteed by:
- the memory manager swaps processes fast enough
- the quantum is large enough to allow reasonable amounts of computing to be done between swaps
Roll out, roll in (a swapping variant used with priority-based scheduling):
- When a higher-priority process arrives, the memory manager can swap out the lower-priority process and then load and execute the higher-priority process.
- When the higher-priority process finishes, the lower-priority process can be swapped back in and continued.
Normally, a process that is swapped out will be swapped back into the same memory space it occupied previously. This restriction is dictated by the method of address binding.
- If binding is done at assembly or load time, then the process cannot be easily moved to a different location.
- If execution-time binding is being used, however, then a process can be swapped into a different memory space, because the physical addresses are computed during execution time.
The process of swapping:
- OS maintains a ready queue consisting of all processes whose memory images are on the backing store or in memory and are ready to run.
- Dispatcher checks whether next process is in memory
- If not and if there is no free memory, swap out a process in memory
- Swap in the desired process
- Load and continue that process
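A toy model of this dispatcher logic might look like the following; every type and helper here is a stand-in for real OS machinery, not an actual kernel interface:

```c
#include <stdbool.h>
#include <stdio.h>

typedef struct { int pid; bool in_memory; } proc;

static bool memory_has_free_space = false;            /* assumed global state */

static void swap_out(proc *p) { p->in_memory = false; memory_has_free_space = true;  }
static void swap_in (proc *p) { p->in_memory = true;  memory_has_free_space = false; }

static void dispatch(proc *next, proc *victim)
{
    if (!next->in_memory) {                 /* is the next process resident?        */
        if (!memory_has_free_space)         /* no free memory: evict some process   */
            swap_out(victim);
        swap_in(next);                      /* bring in the desired process         */
    }
    printf("running pid %d\n", next->pid);  /* load registers, transfer control     */
}

int main(void)
{
    proc a = {1, true}, b = {2, false};
    dispatch(&b, &a);   /* evicts pid 1, swaps in pid 2 */
    return 0;
}
```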
Contiguous Memory Allocation
In contiguous memory allocation, each process is contained in a single contiguous section of memory.
Memory Mapping and Protection
The hardware checks every address generated by the CPU, protecting both the OS and the other users' programs and data from being modified.
Memory Allocation
Multiple-partition method: divide memory into several fixed-size partitions; each partition may contain exactly one process.
Variable-partition: the operating system keeps a table indicating which parts of memory are available and which are occupied. Initially, all memory is available for user processes and is considered one large block of available memory, a hole.
- First fit: allocate the first hole that is big enough
- Best fit: Allocate the smallest hole that is big enough
- Worst fit: Allocate the largest hole
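A minimal first-fit sketch over a toy hole table; all names and sizes below are illustrative:

```c
#include <stddef.h>
#include <stdio.h>

/* Toy hole table for variable-partition allocation (addresses/sizes in KB). */
typedef struct { size_t start, size; } hole;

static hole holes[] = { {100, 50}, {200, 300}, {600, 120} };
static const size_t NHOLES = sizeof holes / sizeof holes[0];

/* First fit: carve the request out of the first hole that is big enough;
 * returns the start address, or (size_t)-1 if no hole can satisfy it.    */
static size_t first_fit(size_t request)
{
    for (size_t i = 0; i < NHOLES; i++) {
        if (holes[i].size >= request) {
            size_t addr = holes[i].start;
            holes[i].start += request;   /* the leftover becomes a smaller hole */
            holes[i].size  -= request;
            return addr;
        }
    }
    return (size_t)-1;
}

int main(void)
{
    printf("allocated at %zu\n", first_fit(100));  /* skips the 50 KB hole, takes 200 */
    return 0;
}
```

Best fit would instead scan the entire table (unless it is ordered by size) for the smallest adequate hole, and worst fit for the largest.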
Fragmentation
- Internal fragmentation: unused memory that is internal to a partition
- External fragmentation: when there is enough total memory space to satisfy a request but the available spaces are not contiguous:
- Compaction: shuffle the memory contents so as to place all free memory together in one large block. This is possible only if relocation is dynamic and done at execution time.
- Permit the logical address space of the processes to be noncontiguous.
Paging
- avoids external fragmentation and the need for compaction
- solves the problem of fitting memory chunks of varying sizes onto the backing store:
- disk also suffers the same fragmentation problem as memory
- disk access is much slower, making compaction of the backing store impossible
Basic Method
A logical address is divided into a page number and a page offset. The page number indexes the process's page table, which contains the base address of the corresponding frame in physical memory; that base address is combined with the page offset to form the physical address.
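A minimal sketch of this translation, assuming 4 KB pages and a toy one-level page table (all values illustrative):

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_BITS 12                       /* 4 KB pages */
#define PAGE_SIZE (1u << PAGE_BITS)

/* Tiny page table: index = page number, value = frame number. */
static uint32_t page_table[] = { 5, 6, 1, 2 };

static uint32_t translate(uint32_t logical)
{
    uint32_t page   = logical >> PAGE_BITS;        /* high bits: page number */
    uint32_t offset = logical & (PAGE_SIZE - 1);   /* low bits: page offset  */
    return (page_table[page] << PAGE_BITS) | offset;
}

int main(void)
{
    /* logical 0x2013 -> page 2, offset 0x13 -> frame 1 -> physical 0x1013 */
    printf("0x%x\n", translate(0x2013));
    return 0;
}
```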
Hardware Support
- page-table base register (PTBR): points to the page table in memory
- translation look-aside buffer (TLB): a small, fast associative cache of recently used page-table entries
- address-space identifiers (ASIDs): stored in each TLB entry to identify the owning process, so that entries for several processes can coexist in the TLB
- page-table length register (PTLR): indicates the size of the page table
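As a sketch of how these pieces cooperate on a translation, the following toy model consults the TLB first and falls back to a page-table walk on a miss. The types, the naive replacement policy, and the stubbed walk are all assumptions for illustration:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TLB_ENTRIES 16

/* One TLB entry: (ASID, page) -> frame, plus a valid bit. */
typedef struct { bool valid; uint16_t asid; uint32_t page, frame; } tlb_entry;
static tlb_entry tlb[TLB_ENTRIES];

static uint32_t page_table_walk(uint32_t page);   /* would use the PTBR; stub below */

static uint32_t lookup_frame(uint16_t asid, uint32_t page)
{
    /* TLB hit: an entry matches both the page number and the ASID. */
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (tlb[i].valid && tlb[i].asid == asid && tlb[i].page == page)
            return tlb[i].frame;

    /* TLB miss: walk the in-memory page table, then cache the result. */
    uint32_t frame = page_table_walk(page);
    tlb[0] = (tlb_entry){ true, asid, page, frame };  /* naive replacement */
    return frame;
}

/* Stub standing in for a real memory access through the PTBR. */
static uint32_t page_table_walk(uint32_t page) { return page + 100; }

int main(void)
{
    printf("frame %u\n", lookup_frame(1, 7));  /* miss: walks the page table */
    printf("frame %u\n", lookup_frame(1, 7));  /* hit: served from the TLB   */
    return 0;
}
```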
Protection
Memory protection in a paged environment is accomplished by protection bits kept with each entry in the page table: for example, a read-write/read-only bit, plus a valid-invalid bit indicating whether the page is in the process's logical address space.
Shared Pages
If the code is reentrant code(or pure code), it can be shared. Reentrant code is non-self-modifying code: it never changes during execution. Thus, two or more processes can execute the same code at the same time.
Structure of the Page Table
Hierarchical Paging
Useful when the page table is large: instead of keeping one large chunk of contiguous memory to store the page table, divide the page table into smaller pieces.
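For example, a 32-bit logical address with 4 KB pages can be split into a 10-bit outer index, a 10-bit inner index, and a 12-bit offset. The sketch below shows only the split; the choice of field widths is an assumption for illustration:

```c
#include <stdint.h>
#include <stdio.h>

/* Two-level split: 10-bit outer index | 10-bit inner index | 12-bit offset. */
#define OFFSET_BITS 12
#define INNER_BITS  10

static void split(uint32_t logical)
{
    uint32_t offset = logical & ((1u << OFFSET_BITS) - 1);
    uint32_t inner  = (logical >> OFFSET_BITS) & ((1u << INNER_BITS) - 1);
    uint32_t outer  =  logical >> (OFFSET_BITS + INNER_BITS);
    /* The outer index selects a second-level table; only the pieces a
     * process actually uses need contiguous memory (one page each).   */
    printf("outer=%u inner=%u offset=%u\n", outer, inner, offset);
}

int main(void)
{
    split(0x00403013u);  /* outer=1, inner=3, offset=19 */
    return 0;
}
```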
Hashed Page Tables
A common approach for address spaces larger than 32 bits: the virtual page number is hashed into a hash table, and each table entry holds a chain of elements that hash to the same location.
Inverted Page Tables
An inverted page table has one entry for each physical frame of memory, recording the virtual address of the page stored in that frame together with information about the owning process. Only one page table is then needed in the system, at the cost of a slower search on each memory reference.
Segmentation
Segmentation: a memory-management scheme that supports the user's view of memory. A logical address space is a collection of segments. Each segment has a name and a length. The addresses specify both the segment name and the offset within the segment. The user therefore specifies each address by two quantities: a segment name and an offset.
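A minimal sketch of segment-table translation; the table contents and the referenced address are illustrative values:

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Segment table: each entry holds the segment's base and limit. */
typedef struct { uint32_t base, limit; } seg_entry;
static seg_entry seg_table[] = { {1400, 1000}, {6300, 400}, {4300, 400} };

/* A logical address is the pair <segment-number, offset>. */
static uint32_t translate(uint32_t seg, uint32_t offset)
{
    if (offset >= seg_table[seg].limit) {   /* offset past the segment's end */
        fprintf(stderr, "trap: addressing error\n");
        exit(1);
    }
    return seg_table[seg].base + offset;
}

int main(void)
{
    printf("%u\n", translate(2, 53));   /* 4300 + 53 = 4353 */
    return 0;
}
```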
Summary