Static and dynamic approaches
There
are two basic approaches to memory usage: static and dynamic.
Static memory approaches assume that the addresses
don’t change. This may be a virtual memory illusion, or may be the actual
physical layout. The static memory allocation may be through absolute addresses
or through PC relative addresses (to allow for relocatable, reentrant, and/or
recursive software), but in either case, the compiler or assembler generates a
set of addresses that cannot change for the life of a program or process.
Dynamic memory approaches assume that the
addresses can change (although change is often limited to predefined possible
conditions). The two most common dynamic approaches are the use of stack frames
and the use of pointers or handles. Stack frames are used primarily for
temporary data (such as function or subroutine variables or loop counters).
Handles and pointers are used for keeping track of dynamically allocated blocks
of memory.
Absolute addressing
To
look at memory use by programs and operating systems, let’s first examine the
simpler problem of a single program with complete control of the computer (such
as in a small-scale embedded system or the earliest days of computing).
The
most basic form of memory access is absolute addressing, in which the
program explicitly names the address that is going to be used. An address is a
numeric label for a specific location in memory. The numbering system is
usually in bytes and always starts counting with zero. The first byte of
physical memory is at address 0, the second byte of physical memory is at
address 1, the third byte of physical memory is at address 2, etc. Some
processors use word addressing rather than byte addressing. The theoretical
maximum address is determined by the address size of a processor (a 16 bit
address space is limited to no more than 65536 memory locations, a 32 bit
address space is limited to approximately 4 GB of memory locations). The actual
maximum is limited to the amount of RAM (and ROM) physically installed in the
computer.
A
programmer assigns specific absolute addresses for data structures and program
routines. These absolute addresses might be assigned arbitrarily or might have
to match specific locations expected by an operating system. In practice, the
assembler or compiler determines the absolute addresses through an orderly
predictable assignment scheme (with the ability for the programmer to override
the compiler’s scheme to assign specific operating system mandated addresses).
This
simple approach takes advantage of the fact that the compiler or assembler can
predict the exact absolute addresses of every program instruction or routine
and every data structure or data element. For almost every processor, absolute
addresses are the fastest form of memory addressing. The use of absolute
addresses makes programs run faster and greatly simplifies the task of compiling
or assembling a program.
Some
hardware instructions or operations rely on fixed absolute addresses. For
example, when a processor is first turned on, where does it start? Most
processors have a specific address that is used as the address of the first
instruction run when the processor is first powered on. Some processors provide
a method for the start address to be changed for future start-ups. Sometimes
this is done by storing the start address internally (with some method for
software or external hardware to change this value). For example, on power-up
the Motorola 680x0 loads the interrupt stack pointer with the longword value
located at address 000 hex, loads the program counter with the longword value
located at address 004 hex, and then starts execution at the freshly loaded
program counter location. Sometimes this is done by reading the start
address from a data line (or other external input) at power-up (and in this
case, there is usually fixed external hardware that always generates the same
pre-assigned start address).
Another
common example of hardware related absolute addressing is the handling of
traps, exceptions, and interrupts. A processor often has specific memory
addresses set aside for specific kinds of traps, exceptions, and interrupts.
Using a specific example, a divide by zero exception on the Motorola 680x0
produces an exception vector number 5, with the address of the exception
handler being fetched by the hardware from memory address 014 hex.
Some
simple microprocessor operating systems relied heavily on absolute addressing.
An example would be the MS-DOS expectation that the start of a
program would always be located at absolute memory address 100h (hexadecimal
100, or decimal 256). A typical compiler or assembler directive for this would
be the
ORG
directive (for “origin”).
The
key disadvantage of absolute addressing is that multiple programs clash with
each other, competing to use the same absolute memory locations for (possibly
different) purposes.
Overlay
So,
how do you implement multiple programs on an operating system using absolute
addresses? Or, for early computers, how do you implement a program that is
larger than available RAM (especially at a time when processors rarely had more
than 1k, 2k, or 4k of RAM)? The earliest answer was overlay systems.
With
an overlay system, each program or program segment is loaded into the exact
same space in memory. An overlay handler exists in another area of memory and
is responsible for swapping overlay pages or overlay segments (both are the
same thing, but different operating systems used different terminology). When
an overlay segment completes its work or needs to access a routine in another
overlay segment, it signals the overlay handler, which then swaps out the old
program segment and swaps in the next program segment.
An
overlay handler doesn’t take much memory. Typically, the memory space that
contained the overlay handler was also padded out with additional routines.
These might include key device drivers, interrupt handlers, exception handlers,
and small commonly used routines shared by many programs (to save time instead
of continual swapping of the small commonly used routines).
In
early systems, all data was global, meaning that it was shared by and
available for both reads and writes by any running program (in modern times,
global almost always means available to a single entire program, no longer
meaning available to all software on a computer). A section of memory was set
aside for shared system variables, device driver variables, and interrupt
handler variables. An additional area would be set aside as “scratch” or
temporary data. The temporary data area would be available for individual
programs. Because the earliest operating systems were batch systems, only one
program other than the operating system would be running at any one time, so it
could use the scratch RAM any way it wanted, saving any long term data to
files.
Relocatable software
As
computer science advanced, hardware started to have support for relocatable
programs and data. This would allow an operating system to load a program
anywhere convenient in memory (including a different location each time the
program was loaded). This was a necessary step for the jump to interactive
operating systems, but was also useful in early batch systems to allow for
multiple overlay segments.
Demand paging and swapping
Overlay
systems were superseded by demand paging and swapping systems. In
a swapping system, the operating system swaps out an entire program and
its data (and any other context information).
In
a demand paging system, instead of having programs explicitly request
overlays, programs were divided into pages. The operating system would load a
program’s starting page and start it running. When a program needed to access a
data page or program page not currently in main memory, the hardware would
generate a page fault, and the operating system would fetch the
requested page from external storage. When all available pages were filled, the
operating system would use one of many schemes for figuring out which page to
delete from memory to make room for the new page (and if it was a data page
that had any changes, the operating system would have to store a temporary copy
of the data page). The question of how to decide which page to delete is one of
the major problems facing operating system designers.
Program counter relative
One
approach for making programs relocatable is program counter relative
addressing. Instead of branching using absolute addresses, branches
(including subroutine calls, jumps, and other kinds of branching) were based on
a relative distance from the current program counter (which points to the
address of the currently executing instruction). With PC relative addresses, the
program can be loaded anywhere in memory and still work correctly. The location
of routines, subroutines, functions, and constant data can be determined by the
positive or negative distance from the current instruction.
Program
counter relative addressing can also be used for determining the address of
variables, but then data and code get mixed in the same page or segment. At a
minimum, mixing data and code in the same segment is bad programming practice,
and in most cases it clashes with more sophisticated hardware systems (such as
protected memory).
Base pointers
Base
pointers (sometimes
called segment pointers or page pointers) are special hardware registers that
point to the start (or base) of a particular page or segment of memory.
Programs can then use an absolute address within a page and either explicitly
add the absolute address to the contents of a base pointer or rely on the
hardware to add the two together to form the actual effective address of
the memory access. Which method was used would depend on the processor
capabilities and the operating system design. Hiding the base pointer from the
application program both made the program easier to compile and allowed for the
operating system to implement program isolation, data/code isolation, protected
memory, and other sophisticated services.
As
an example, the Intel 80x86 processor has a code segment pointer, a data
segment pointer, a stack segment pointer, and an extra segment pointer. When a
program is loaded into memory, an operating system running on the Intel 80x86
sets the segment pointers with the beginning of the pages assigned for each
purpose for that particular program. If a program is swapped out, when it gets
swapped back in, the operating system sets the segment pointers to the new
memory locations for each segment. The program continues to run, without being
aware that it has been moved in memory.
Indirection, pointers, and handles
A
method for making data relocatable is to use indirection. Instead of
hard coding an absolute memory address for a variable or data structure, the
program uses a pointer that gives the memory address of the variable or
data structure. Many processors have address pointer registers and a variety of
indirect addressing modes available for software.
In
the most simple use of address pointers, software generates the effective
address for the pointer just before it is used. Pointers can also be stored,
but then the data can’t be moved (unless there is additional hardware support,
such as virtual memory or base/segment pointers).
Closely
related to pointers are handles. Handles are two levels of indirection,
or a pointer to a pointer. Instead of the program keeping track of an address
pointer to a block of memory that can’t be moved, the program keeps track of a
pointer to a pointer. Now, the operating system or the application program can
move the underlying block of data. As long as the program uses the handle
instead of the pointer, the operating system can freely move the data block and
update the pointer, and everything will continue to resolve correctly.
Because
it is faster to use pointers than handles, it is common for software to convert
a handle into a pointer and use the pointer for data accesses. If this is done,
there must be some mechanism to make sure that the data block doesn’t move
while the program is using the pointer. As an example, the Macintosh uses a system where data blocks can only be moved at
specific known times, and an application program can rely on pointers derived
from handles remaining valid between those known, specified times.
Stack frames
Stack
frames are a method
for generating temporary variables, especially for subroutines, functions, and
loops. An area of memory is temporarily allocated on the system or process
stack. In a simple version, the variables in the stack frame are accessed by
using the stack pointer and an offset to point to the actual location in
memory. This simple approach has the problem that there are many hardware
instructions that change the stack pointer. The more sophisticated and stable
approach is to have a second pointer called a frame pointer. The frame
pointer can be set up in software using any address register. Many modern
processors also have specific hardware instructions that allocate the stack
frame and set up the frame pointer at the same time. Some processors have a
specific hardware frame pointer register.
Virtual memory
Virtual
memory is a
technique in which each process generates addresses as if it had sole access to
the entire logical address space of the processor, but in reality memory
management hardware remaps the logical addresses into actual physical
addresses in physical address space. The DEC VAX-11 gets its name from
this technique, VAX standing for Virtual Address eXtension.
Virtual
memory can go beyond just remapping logical addresses into physical addresses.
Many virtual memory systems also include software for page or segment swapping,
shuffling portions of a program to and from a hard disk, to give the software
the impression of having much more RAM than is actually installed on the
computer.