Basic memory software approaches


Static and dynamic approaches

    
There are two basic approaches to memory usage: static and dynamic.
    
Static memory approaches assume that the addresses don’t change. This may be a virtual memory illusion, or may be the actual physical layout. The static memory allocation may be through absolute addresses or through PC relative addresses (to allow for relocatable, reentrant, and/or recursive software), but in either case, the compiler or assembler generates a set of addresses that cannot change for the life of a program or process.
    
Dynamic memory approaches assume that the addresses can change (although change is often limited to predefined possible conditions). The two most common dynamic approaches are the use of stack frames and the use of pointers or handles. Stack frames are used primarily for temporary data (such as function or subroutine variables or loop counters). Handles and pointers are used for keeping track of dynamically allocated blocks of memory.

Absolute addressing

    
To look at memory use by programs and operating systems, let’s first examine the simpler problem of a single program with complete control of the computer (such as in a small-scale embedded system or the earliest days of computing).
    
The most basic form of memory access is absolute addressing, in which the program explicitly names the address that is going to be used. An address is a numeric label for a specific location in memory. The numbering system is usually in bytes and always starts counting with zero. The first byte of physical memory is at address 0, the second byte of physical memory is at address 1, the third byte of physical memory is at address 2, etc. Some processors use word addressing rather than byte addressing. The theoretical maximum address is determined by the address size of a processor (a 16-bit address space is limited to no more than 65536 memory locations, a 32-bit address space is limited to approximately 4 GB of memory locations). The actual maximum is limited to the amount of RAM (and ROM) physically installed in the computer.
    
A programmer assigns specific absolute addresses for data structures and program routines. These absolute addresses might be assigned arbitrarily or might have to match specific locations expected by an operating system. In practice, the assembler or compiler determines the absolute addresses through an orderly, predictable assignment scheme (with the ability for the programmer to override the compiler’s scheme to assign specific operating system mandated addresses).
    
This simple approach takes advantage of the fact that the compiler or assembler can predict the exact absolute addresses of every program instruction or routine and every data structure or data element. For almost every processor, absolute addresses are the fastest form of memory addressing. The use of absolute addresses makes programs run faster and greatly simplifies the task of compiling or assembling a program.
    
Some hardware instructions or operations rely on fixed absolute addresses. For example, when a processor is first turned on, where does it start? Most processors have a specific address that is used as the address of the first instruction run when the processor is first powered on. Some processors provide a method for the start address to be changed for future start-ups. Sometimes this is done by storing the start address internally (with some method for software or external hardware to change this value). For example, on power up the Motorola 680x0, the processor loads the interrupt stack pointer with the longword value located at address 000 hex, loads the program counter with the longword value located at address 004 hex, then starts execution at the freshly loaded program counter location. Sometimes this is done by reading the start address from a data line (or other external input) at power-up (and in this case, there is usually fixed external hardware that always generates the same pre-assigned start address).
    
Another common example of hardware related absolute addressing is the handling of traps, exceptions, and interrupts. A processor often has specific memory addresses set aside for specific kinds of traps, exceptions, and interrupts. Using a specific example, a divide by zero exception on the Motorola 680x0 produces an exception vector number 5, with the address of the exception handler being fetched by the hardware from memory address 014 hex.
    
Some simple microprocessor operating systems relied heavily on absolute addressing. An example would be the MS-DOS expectation that the start of a program would always be located at absolute memory address 100h (hexadecimal 100, or decimal 256). A typical compiler or assembler directive for this would be the ORG directive (for “origin”).
    
The key disadvantage of absolute addressing is that multiple programs clash with each other, competing to use the same absolute memory locations for (possibly different) purposes.

 

Overlay

    
So, how do you implement multiple programs on an operating system using absolute addresses? Or, for early computers, how do you implement a program that is larger than available RAM (especially at a time when computers rarely had more than 1k, 2k, or 4k of RAM)? The earliest answer was overlay systems.
    
With an overlay system, each program or program segment is loaded into the exact same space in memory. An overlay handler exists in another area of memory and is responsible for swapping overlay pages or overlay segments (both are the same thing, but different operating systems used different terminology). When an overlay segment completes its work or needs to access a routine in another overlay segment, it signals the overlay handler, which then swaps out the old program segment and swaps in the next program segment.
    
An overlay handler doesn’t take much memory. Typically, the memory space that contained the overlay handler was also padded out with additional routines. These might include key device drivers, interrupt handlers, exception handlers, and small commonly used routines shared by many programs (to save time instead of continual swapping of the small commonly used routines).
    
In early systems, all data was global, meaning that it was shared by and available for both reads and writes by any running program (in modern times, global almost always means available to a single entire program, no longer meaning available to all software on a computer). A section of memory was set aside for shared system variables, device driver variables, and interrupt handler variables. An additional area would be set aside as “scratch” or temporary data. The temporary data area would be available for individual programs. Because the earliest operating systems were batch systems, only one program other than the operating system would be running at any one time, so it could use the scratch RAM any way it wanted, saving any long term data to files.

 

Relocatable software


As computer science advanced, hardware started to have support for relocatable programs and data. This would allow an operating system to load a program anywhere convenient in memory (including a different location each time the program was loaded). This was a necessary step for the jump to interactive operating systems, but was also useful in early batch systems to allow for multiple overlay segments.

 

Demand paging and swapping

    
Overlay systems were superseded by demand paging and swapping systems. In a swapping system, the operating system swaps out an entire program and its data (and any other context information).
    
In a demand paging system, instead of having programs explicitly request overlays, programs were divided into pages. The operating system would load a program’s starting page and start it running. When a program needed to access a data page or program page not currently in main memory, the hardware would generate a page fault, and the operating system would fetch the requested page from external storage. When all available pages were filled, the operating system would use one of many schemes for figuring out which page to delete from memory to make room for the new page (and if it was a data page that had any changes, the operating system would have to store a temporary copy of the data page). The question of how to decide which page to delete is one of the major problems facing operating system designers.

 

Program counter relative

    
One approach for making programs relocatable is program counter relative addressing. Instead of branching using absolute addresses, branches (including subroutine calls, jumps, and other kinds of branching) were based on a relative distance from the current program counter (which points to the address of the currently executing instruction). With PC relative addresses, the program can be loaded anywhere in memory and still work correctly. The location of routines, subroutines, functions, and constant data can be determined by the positive or negative distance from the current instruction.
    
Program counter relative addressing can also be used for determining the address of variables, but then data and code get mixed in the same page or segment. At a minimum, mixing data and code in the same segment is bad programming practice, and in most cases it clashes with more sophisticated hardware systems (such as protected memory).

 

Base pointers

    
Base pointers (sometimes called segment pointers or page pointers) are special hardware registers that point to the start (or base) of a particular page or segment of memory. Programs can then use an absolute address within a page and either explicitly add the absolute address to the contents of a base pointer or rely on the hardware to add the two together to form the actual effective address of the memory access. Which method was used would depend on the processor capabilities and the operating system design. Hiding the base pointer from the application program both made the program easier to compile and allowed for the operating system to implement program isolation, data/code isolation, protected memory, and other sophisticated services.
    
As an example, the Intel 80x86 processor has a code segment pointer, a data segment pointer, a stack segment pointer, and an extra segment pointer. When a program is loaded into memory, an operating system running on the Intel 80x86 sets the segment pointers with the beginning of the pages assigned for each purpose for that particular program. If a program is swapped out, when it gets swapped back in, the operating system sets the segment pointers to the new memory locations for each segment. The program continues to run, without being aware that it has been moved in memory.

Indirection, pointers, and handles

    
A method for making data relocatable is to use indirection. Instead of hard coding an absolute memory address for a variable or data structure, the program uses a pointer that gives the memory address of the variable or data structure. Many processors have address pointer registers and a variety of indirect addressing modes available for software.
   
In the simplest use of address pointers, software generates the effective address for the pointer just before it is used. Pointers can also be stored, but then the data can’t be moved (unless there is additional hardware support, such as virtual memory or base/segment pointers).
    
Closely related to pointers are handles. Handles are two levels of indirection, or a pointer to a pointer. Instead of the program keeping track of an address pointer to a block of memory that can’t be moved, the program keeps track of a pointer to a pointer. Now, the operating system or the application program can move the underlying block of data. As long as the program uses the handle instead of the pointer, the operating system can freely move the data block and update the pointer, and everything will continue to resolve correctly.
    
Because it is faster to use pointers than handles, it is common for software to convert a handle into a pointer and use the pointer for data accesses. If this is done, there must be some mechanism to make sure that the data block doesn’t move while the program is using the pointer. As an example, the Macintosh uses a system where data blocks can only be moved at specific known times, and an application program can rely on pointers derived from handles remaining valid between those known, specified times.

 

Stack frames

    
Stack frames are a method for generating temporary variables, especially for subroutines, functions, and loops. An area of memory is temporarily allocated on the system or process stack. In a simple version, the variables in the stack frame are accessed by using the stack pointer and an offset to point to the actual location in memory. This simple approach has the problem that there are many hardware instructions that change the stack pointer. The more sophisticated and stable approach is to have a second pointer called a frame pointer. The frame pointer can be set up in software using any address register. Many modern processors also have specific hardware instructions that allocate the stack frame and set up the frame pointer at the same time. Some processors have a specific hardware frame pointer register.

 

Virtual memory

    
Virtual memory is a technique in which each process generates addresses as if it had sole access to the entire logical address space of the processor, but in reality memory management hardware remaps the logical addresses into actual physical addresses in physical address space. The DEC VAX-11 gets its name from this technique, VAX standing for Virtual Address eXtension.
    
Virtual memory can go beyond just remapping logical addresses into physical addresses. Many virtual memory systems also include software for page or segment swapping, shuffling portions of a program to and from a hard disk, to give the software the impression of having much more RAM than is actually installed on the computer.