Main Memory
Chapter Objectives
- To provide a detailed description of various ways of organising memory hardware
- To discuss various memory-management techniques, including paging and segmentation
- To provide a detailed description of the Intel Pentium, which supports both pure segmentation and segmentation with paging
Background
Memory consists of a large array of words or bytes, each with its own address. The CPU fetches instructions from memory according to the value of the program counter. These instructions may cause additional loading from and storing to specific memory addresses.
Basic Hardware
Main memory and the registers are the only storage that the CPU can access directly. This reality has two implications:
- Any instructions in execution, and any data being used by the instructions, must be in one of these direct-access storage devices.
- If the data are not in memory, they must be moved there before the CPU can operate on them.
Differences between registers and memory:
- Registers are built into the CPU and can generally be accessed within one cycle of the CPU clock. Most CPUs can decode instructions and perform simple operations on register contents at the rate of one or more operations per clock tick.
- Memory is accessed via a transaction on the memory bus. Completing a memory access may take many cycles of the CPU clock.
Since a memory access is slow, the processor may need to stall while the required data are fetched. This delay is intolerable; the usual remedy is to add a fast memory buffer between the CPU and main memory, called a cache.
Besides access speed, the hardware must also protect memory against erroneous accesses. Here is one possible mechanism:
- Each process has a separate memory space:
- base register: the smallest legal physical memory address
- limit register: the size of the range
- CPU compares every address generated in the user mode with the registers. Any attempt by a program executing in user mode to access OS memory or other users' memory results in a trap to the OS, which treats the attempt as a fatal error.
The base and limit registers can be loaded only by the OS, in kernel mode, via privileged instructions.
The OS, executing in kernel mode, has unrestricted access to both OS memory and users' memory. This allows it to dump out programs in case of errors, to access and modify parameters of system calls, and so on.
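As a minimal sketch, the legality check the hardware applies to every user-mode address can be written as follows; the function name and the sample values are illustrative, not a real hardware interface:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* base  = smallest legal physical address for this process
 * limit = size of the legal range                           */
static bool address_is_legal(uint64_t addr, uint64_t base, uint64_t limit)
{
    /* Legal iff base <= addr < base + limit; anything else traps to the OS. */
    return addr >= base && addr < base + limit;
}

int main(void)
{
    uint64_t base = 300040, limit = 120900;                 /* illustrative values   */
    printf("%d\n", address_is_legal(300040, base, limit));  /* 1: first legal byte   */
    printf("%d\n", address_is_legal(420940, base, limit));  /* 0: one past the end   */
    return 0;
}
```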
Address Binding
Processes on the disk that are waiting to be brought into memory for execution form the input queue.
The binding of instructions and data to memory addresses can be done at any of the following stages:
- Compile time: if you know at compile time where the process will reside in memory, then absolute code can be generated.
- Load time: if it is not known at compile time where the process will reside in memory, then the compiler must generate relocatable code. In this case, final binding is delayed until load time.
- Execution time: if the process can be moved during its execution from one memory segment to another, then binding must be delayed until run time. This requires special hardware. Most general-purpose operating systems use this method.
Logical versus Physical Address Space
- Logical address: addresses generated by the CPU
- Physical address: addresses seen by the memory unit - that is, the one loaded into the memory-address register of the memory
The compile-time and load-time address-binding methods generate identical logical and physical addresses.
The execution-time address-binding scheme results in differing logical and physical addresses. In this case, we usually refer to the logical address as a virtual address.
- Logical (virtual) address space: the set of all logical addresses generated by a program
- Physical address space: the set of all physical addresses corresponding to the logical addresses
In the execution-time address-binding scheme, the logical and physical address spaces differ. The run-time mapping from virtual to physical addresses is done by a hardware device called the memory-management unit (MMU).
The user program never sees the real physical addresses; it deals only with logical addresses. The MMU maps these logical addresses to physical addresses.
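A minimal sketch of the simplest MMU scheme, in which the value of a single relocation (base) register is added to every address a user process generates; the names and values here are illustrative:

```c
#include <stdint.h>
#include <stdio.h>

/* Simplest MMU: every logical address the CPU emits is relocated on the fly. */
static uint64_t mmu_translate(uint64_t logical, uint64_t relocation_reg)
{
    return logical + relocation_reg;  /* physical address sent on the memory bus */
}

int main(void)
{
    uint64_t reloc = 14000;           /* illustrative relocation-register value */
    printf("%llu\n", (unsigned long long)mmu_translate(346, reloc));  /* 14346 */
    return 0;
}
```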
Dynamic Loading
Procedure:
- a routine is not loaded until it is called. All routines are kept on disk in a relocatable load format
- main program is loaded into memory and executed
- When a routine needs to call another routine, the calling routine first checks to see whether the other routine has been loaded
- If not, the relocatable linking loader is called to load the desired routine into memory and to update the program's address tables to reflect this change
- Then control is passed to the newly loaded routine
Advantages:
- an unused routine is never loaded
- thus, although the total program size may be large, the portion actually used (and hence loaded) may be much smaller
Dynamic loading is left to the programmer's own implementation and does not require special OS support, although the OS may help by providing library routines that implement dynamic loading.
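On POSIX systems, one concrete realization of this idea is the dlopen/dlsym interface, which maps a library and resolves a routine only when the program asks for it. A small sketch (the library name is Linux-specific; compile with `-ldl`):

```c
#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    /* Nothing is loaded until this call; the routine arrives on demand. */
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (!handle) { fprintf(stderr, "%s\n", dlerror()); return 1; }

    /* Look up the routine's address in the newly loaded library. */
    double (*cosine)(double) = (double (*)(double))dlsym(handle, "cos");
    if (cosine)
        printf("cos(0) = %f\n", cosine(0.0));

    dlclose(handle);
    return 0;
}
```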
Dynamic Linking and Shared Libraries
- Static linking: system language libraries are treated like any other object module and are combined by the loader into the binary program image
- Dynamic linking: similar to dynamic loading, but here linking, rather than loading, is postponed until execution time; a library routine is linked in only when it is needed
Usage:
- Dynamic linking allows all processes that use a language library to execute only one copy of the library code.
- A library may be replaced by a new version, and all programs that reference the library will automatically use the new version.
- To guard against accidentally running an incompatible version of a library, both the program and the library carry a version number, and a program loads only a library with a matching version number. Consequently, multiple versions of a library may be loaded into memory.
This system is also known as shared libraries.
Dynamic linking usually requires help from the OS.
Swapping
A process must be in memory to be executed. A process, however, can be swapped temporarily out of memory to a backing store and then brought back into memory for continued execution.
Swapping in RR scheduling:
- when a quantum expires, the memory manager starts to swap out the process that has just finished its quantum
- swap in another process into the memory space that has been freed
- CPU scheduler will allocate a time slice to some other process in memory
- when the CPU scheduler wants to reschedule the CPU, there should be some processes in memory, ready to execute; it's guaranteed by:
- the memory manager swaps processes fast enough
- the quantum is large enough to allow reasonable amounts of computing to be done between swaps
Roll out, roll in (a swapping variant used with priority-based scheduling):
- When a higher-priority process arrives, the memory manager can swap out the lower-priority process and then load and execute the higher-priority process.
- When the higher-priority process finishes, the lower-priority process can be swapped back in and continued.
Normally, a process that is swapped out will be swapped back into the same memory space it occupied previously. This restriction is dictated by the method of address binding.
- If binding is done at assembly or load time, then the process cannot be easily moved to a different location.
- If execution-time binding is being used, however, then a process can be swapped into a different memory space, because the physical addresses are computed during execution time.
The process of swapping:
- OS maintains a ready queue consisting of all processes whose memory images are on the backing store or in memory and are ready to run.
- Dispatcher checks whether next process is in memory
- If not and if there is no free memory, swap out a process in memory
- Swap in the desired process
- Load and continue that process
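A toy model of this dispatcher logic might look like the following; every type and helper here is a stand-in for real OS machinery, not an actual kernel interface:

```c
#include <stdbool.h>
#include <stdio.h>

typedef struct { int pid; bool in_memory; } proc;

static bool memory_has_free_space = false;            /* assumed global state */

static void swap_out(proc *p) { p->in_memory = false; memory_has_free_space = true;  }
static void swap_in (proc *p) { p->in_memory = true;  memory_has_free_space = false; }

static void dispatch(proc *next, proc *victim)
{
    if (!next->in_memory) {                 /* is the next process resident?        */
        if (!memory_has_free_space)         /* no free memory: evict some process   */
            swap_out(victim);
        swap_in(next);                      /* bring in the desired process         */
    }
    printf("running pid %d\n", next->pid);  /* load registers, transfer control     */
}

int main(void)
{
    proc a = {1, true}, b = {2, false};
    dispatch(&b, &a);   /* evicts pid 1, swaps in pid 2 */
    return 0;
}
```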
Contiguous Memory Allocation
In contiguous memory allocation, each process is contained in a single contiguous section of memory.
Memory Mapping and Protection
The hardware checks every address generated by the CPU, protecting both the OS and the other users' programs and data from being modified.
Memory Allocation
Multiple-partition method: divide memory into several fixed-size partitions; each partition may contain exactly one process.
Variable-partition: the operating system keeps a table indicating which parts of memory are available and which are occupied. Initially, all memory is available for user processes and is considered one large block of available memory, a hole.
- First fit: allocate the first hole that is big enough
- Best fit: Allocate the smallest hole that is big enough
- Worst fit: Allocate the largest hole
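A minimal first-fit sketch over a toy hole table; all names and sizes below are illustrative:

```c
#include <stddef.h>
#include <stdio.h>

/* Toy hole table for variable-partition allocation (addresses/sizes in KB). */
typedef struct { size_t start, size; } hole;

static hole holes[] = { {100, 50}, {200, 300}, {600, 120} };
static const size_t NHOLES = sizeof holes / sizeof holes[0];

/* First fit: carve the request out of the first hole that is big enough;
 * returns the start address, or (size_t)-1 if no hole can satisfy it.    */
static size_t first_fit(size_t request)
{
    for (size_t i = 0; i < NHOLES; i++) {
        if (holes[i].size >= request) {
            size_t addr = holes[i].start;
            holes[i].start += request;   /* the leftover becomes a smaller hole */
            holes[i].size  -= request;
            return addr;
        }
    }
    return (size_t)-1;
}

int main(void)
{
    printf("allocated at %zu\n", first_fit(100));  /* skips the 50 KB hole, takes 200 */
    return 0;
}
```

Best fit would instead scan the entire table (unless it is ordered by size) for the smallest adequate hole, and worst fit for the largest.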
Fragmentation
- Internal fragmentation: unused memory that is internal to a partition
- External fragmentation: when there is enough total memory space to satisfy a request but the available spaces are not contiguous:
- Compaction: shuffle the memory contents so as to place all free memory together in one large block. This is possible only if relocation is dynamic and done at execution time.
- Permit the logical address space of the processes to be noncontiguous.
Paging
- avoids external fragmentation and the need for compaction
- solves the problem of fitting memory chunks of varying sizes onto the backing store:
- disk also suffers the same fragmentation problem as memory
- disk access is much slower, making compaction of the backing store impossible
Basic Method
A logical address is divided into a page number and a page offset. The page number indexes the process's page table, which contains the base address of the corresponding frame in physical memory; that base address is combined with the page offset to form the physical address.
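A minimal sketch of this translation, assuming 4 KB pages and a toy one-level page table (all values illustrative):

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_BITS 12                       /* 4 KB pages */
#define PAGE_SIZE (1u << PAGE_BITS)

/* Tiny page table: index = page number, value = frame number. */
static uint32_t page_table[] = { 5, 6, 1, 2 };

static uint32_t translate(uint32_t logical)
{
    uint32_t page   = logical >> PAGE_BITS;        /* high bits: page number */
    uint32_t offset = logical & (PAGE_SIZE - 1);   /* low bits: page offset  */
    return (page_table[page] << PAGE_BITS) | offset;
}

int main(void)
{
    /* logical 0x2013 -> page 2, offset 0x13 -> frame 1 -> physical 0x1013 */
    printf("0x%x\n", translate(0x2013));
    return 0;
}
```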
Hardware Support
- page-table base register (PTBR): points to the page table in memory
- translation look-aside buffer (TLB): a small, fast associative cache of recently used page-table entries
- address-space identifiers (ASIDs): stored in each TLB entry to identify the owning process, so that entries for several processes can coexist in the TLB
- page-table length register (PTLR): indicates the size of the page table
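As a sketch of how these pieces cooperate on a translation, the following toy model consults the TLB first and falls back to a page-table walk on a miss. The types, the naive replacement policy, and the stubbed walk are all assumptions for illustration:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TLB_ENTRIES 16

/* One TLB entry: (ASID, page) -> frame, plus a valid bit. */
typedef struct { bool valid; uint16_t asid; uint32_t page, frame; } tlb_entry;
static tlb_entry tlb[TLB_ENTRIES];

static uint32_t page_table_walk(uint32_t page);   /* would use the PTBR; stub below */

static uint32_t lookup_frame(uint16_t asid, uint32_t page)
{
    /* TLB hit: an entry matches both the page number and the ASID. */
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (tlb[i].valid && tlb[i].asid == asid && tlb[i].page == page)
            return tlb[i].frame;

    /* TLB miss: walk the in-memory page table, then cache the result. */
    uint32_t frame = page_table_walk(page);
    tlb[0] = (tlb_entry){ true, asid, page, frame };  /* naive replacement */
    return frame;
}

/* Stub standing in for a real memory access through the PTBR. */
static uint32_t page_table_walk(uint32_t page) { return page + 100; }

int main(void)
{
    printf("frame %u\n", lookup_frame(1, 7));  /* miss: walks the page table */
    printf("frame %u\n", lookup_frame(1, 7));  /* hit: served from the TLB   */
    return 0;
}
```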
Protection
Memory protection in a paged environment is accomplished by protection bits kept with each entry in the page table: for example, a read-write/read-only bit, plus a valid-invalid bit indicating whether the page is in the process's logical address space.
Shared Pages
If the code is reentrant code(or pure code), it can be shared. Reentrant code is non-self-modifying code: it never changes during execution. Thus, two or more processes can execute the same code at the same time.
Structure of the Page Table
Hierarchical Paging
Useful when the page table is large: instead of keeping one large chunk of contiguous memory to store the page table, divide the page table into smaller pieces.
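For example, a 32-bit logical address with 4 KB pages can be split into a 10-bit outer index, a 10-bit inner index, and a 12-bit offset. The sketch below shows only the split; the choice of field widths is an assumption for illustration:

```c
#include <stdint.h>
#include <stdio.h>

/* Two-level split: 10-bit outer index | 10-bit inner index | 12-bit offset. */
#define OFFSET_BITS 12
#define INNER_BITS  10

static void split(uint32_t logical)
{
    uint32_t offset = logical & ((1u << OFFSET_BITS) - 1);
    uint32_t inner  = (logical >> OFFSET_BITS) & ((1u << INNER_BITS) - 1);
    uint32_t outer  =  logical >> (OFFSET_BITS + INNER_BITS);
    /* The outer index selects a second-level table; only the pieces a
     * process actually uses need contiguous memory (one page each).   */
    printf("outer=%u inner=%u offset=%u\n", outer, inner, offset);
}

int main(void)
{
    split(0x00403013u);  /* outer=1, inner=3, offset=19 */
    return 0;
}
```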
Hashed Page Tables
A common approach for address spaces larger than 32 bits: the virtual page number is hashed into a hash table, and each table entry holds a chain of elements that hash to the same location.
Inverted Page Tables
An inverted page table has one entry for each physical frame of memory, recording the virtual address of the page stored in that frame together with information about the owning process. Only one page table is then needed in the system, at the cost of a slower search on each memory reference.
Segmentation
Segmentation: a memory-management scheme that supports the user's view of memory. A logical address space is a collection of segments. Each segment has a name and a length. The addresses specify both the segment name and the offset within the segment. The user therefore specifies each address by two quantities: a segment name and an offset.
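A minimal sketch of segment-table translation; the table contents and the referenced address are illustrative values:

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Segment table: each entry holds the segment's base and limit. */
typedef struct { uint32_t base, limit; } seg_entry;
static seg_entry seg_table[] = { {1400, 1000}, {6300, 400}, {4300, 400} };

/* A logical address is the pair <segment-number, offset>. */
static uint32_t translate(uint32_t seg, uint32_t offset)
{
    if (offset >= seg_table[seg].limit) {   /* offset past the segment's end */
        fprintf(stderr, "trap: addressing error\n");
        exit(1);
    }
    return seg_table[seg].base + offset;
}

int main(void)
{
    printf("%u\n", translate(2, 53));   /* 4300 + 53 = 4353 */
    return 0;
}
```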
Summary