compiler compiles program code and data and makes a guess about how big a heap and stack the program will need as it executes
the total size N of code, data, guessed heap, and guessed stack is put into the executable file produced by the compiler
the compiler has no idea what actual physical RAM addresses the program will occupy as it runs, so the compiler assumes the program will occupy RAM addresses 0 through N-1, where N is the total (guessed) size
the role of the memory manager is to map the process address space (code, data, heap, stack running from pretend addresses 0 to N-1) into actual RAM addresses M1 to M2
static (loadtime) binding (relocation): rewrite all addresses in the compiler-produced code by adding M1
dynamic (runtime) binding (relocation): use base and limit registers, base = M1, limit = N = M2 - M1 + 1
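a minimal Python sketch of the two relocation schemes above; the function names, the AddressFault exception, and the numbers M1 = 5000, N = 1000 are assumptions made up for illustration, not part of any real OS or CPU

```python
class AddressFault(Exception):
    """Raised when a process address falls outside 0..limit-1 (hypothetical name)."""


def relocate_static(addresses, m1):
    """Static (loadtime) relocation: add M1 to every address once, at load time."""
    return [a + m1 for a in addresses]


def translate_dynamic(addr, base, limit):
    """Dynamic (runtime) relocation: check against limit, then add base.

    This models what the base and limit registers do on every memory reference.
    """
    if not 0 <= addr < limit:
        raise AddressFault(f"address {addr} outside 0..{limit - 1}")
    return base + addr


# Hypothetical process of size N = 1000 placed at M1 = 5000, so M2 = 5999.
M1, N = 5000, 1000
M2 = M1 + N - 1
first = translate_dynamic(0, M1, N)      # maps to M1
last = translate_dynamic(N - 1, M1, N)   # maps to M2
```

with static relocation the process cannot be moved after loading without rewriting its addresses again; with dynamic relocation, moving it only requires changing base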
schemes to manage free areas of RAM to allocate for new processes
fixed number of partitions of fixed sizes
variable number of partitions of variable sizes
Carter Bays (Univ. of South Carolina, Columbia), Communications of the ACM, Volume 20, Issue 3 (March 1977), pages 191-192, ISSN 0001-0782, ACM, New York, NY, USA, http://doi.acm.org/10.1145/359436.359453
"Next-fit" allocation differs from first-fit in that a first-fit allocator commences its search for free space at a fixed end of memory, whereas a next-fit allocator commences its search wherever it previously stopped searching. This strategy is called "modified first-fit" by Shore [2] and is significantly faster than the first-fit allocator.
To evaluate the relative efficiency of next-fit (as well as to confirm Shore's results) a simulation was written in Basic Plus on the PDP-11, using doubly linked lists to emulate the memory structure of the simulated computer. The simulation was designed to perform essentially in the manner described in [2].
The results of the simulation of the three methods show that the efficiency of next-fit is decidedly inferior to first-fit and best-fit when the mean size of the block requested is less than about 1/16 the total memory available. Beyond this point all three allocation schemes have similar efficiencies.
Figure 1 shows the mean request size for (truncated) exponentially distributed requests versus E, the time-memory product efficiency as defined in [2] (the plot essentially corresponds to Figure 2 in [2] although naturally all of the points are somewhat different). The total memory size was 32,768 and the memory-residence time of requests was uniformly distributed between 5 and 15 time units. Each point represents the mean of approximately 20 runs, where the clock time for each run varied from 500 to 5000. The standard deviation of each point was less than .006.
The results of this simulation support the hypothesis mentioned by Shore, that "when first-fit outperforms best-fit, it does so because first-fit, by preferentially allocating [blocks] toward one end of memory, encourages large blocks to grow at the other end." The fact that next-fit is decidedly inferior to both first-fit and best-fit implies that eliminating the preferential allocation of first-fit causes a loss of efficiency.
REFERENCES
1 Donald E. Knuth, The Art of Computer Programming, Volume 1: Fundamental Algorithms (3rd ed.), Addison Wesley Longman Publishing Co., Inc., Redwood City, CA, 1997
2 John E. Shore, On the external storage fragmentation produced by first-fit and best-fit allocation strategies, Communications of the ACM, v.18 n.8, p.433-440, Aug. 1975
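the three allocation strategies compared in the abstract above can be sketched in Python as follows; this is an illustration only, not the Basic Plus simulation from the paper, and the hole-list representation (a list of (start, size) pairs kept in address order) and all names are assumptions

```python
def _take(holes, i, size):
    """Carve `size` units from the front of hole i; drop the hole if used up."""
    start, hole_size = holes[i]
    if hole_size == size:
        del holes[i]
    else:
        holes[i] = (start + size, hole_size - size)
    return start


def first_fit(holes, size):
    """Always search from the fixed (low) end of memory."""
    for i, (_, hole_size) in enumerate(holes):
        if hole_size >= size:
            return _take(holes, i, size)
    return None  # no hole is big enough


def best_fit(holes, size):
    """Pick the smallest hole that is still big enough."""
    fits = [(hole_size, i) for i, (_, hole_size) in enumerate(holes)
            if hole_size >= size]
    if not fits:
        return None
    _, i = min(fits)
    return _take(holes, i, size)


class NextFit:
    """Like first-fit, but resume searching where the last search stopped."""

    def __init__(self):
        self.pos = 0  # index of the hole where the previous search ended

    def allocate(self, holes, size):
        n = len(holes)
        for k in range(n):
            i = (self.pos + k) % n
            if holes[i][1] >= size:
                start = _take(holes, i, size)
                self.pos = i  # remainder of this hole, or the hole after it
                return start
        return None
```

for example, with holes [(0, 100), (200, 50), (400, 300)] and a request of size 40, first-fit returns address 0 and best-fit returns address 200 (the 50-unit hole is the tightest fit); freeing and coalescing of neighboring holes are left out of the sketch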
we have four restrictions that limit the flexibility of the memory manager
to solve the heap-stack collision problem, modify the compiler so it divides a program it is compiling into two pieces, code-data-heap and stack, and produces two address spaces, code-data-heap from 0 to N1-1 and stack from 2^31 to 2^31 + N2-1, where N1 and N2 are still guessed by the compiler
turn stack rightside up and start it at 2^31
starting the stack at 2^31 gives the CPU a way to distinguish code-data-heap addresses from stack addresses
two base registers baseCDH and baseS in the CPU instead of one; two limit registers limitCDH and limitS in the CPU instead of one; two base values and two limit values instead of one stored in the process table PCB for each process
format of machine instruction reading or writing data-heap using an address field and baseCDH, limitCDH for runtime hardware address translation
            register    32 bit compiler
             number    generated address
--------------------------------------------
|          |        |                      |
|  opcode  |   Rx   |   0xxxxx...xxxxxx    |
|          |        |                      |
--------------------------------------------
format of machine instruction reading or writing stack using an address field and baseS, limitS for runtime hardware address translation
            register    32 bit compiler
             number    generated address
--------------------------------------------
|          |        |                      |
|  opcode  |   Rx   |   1xxxxx...xxxxxx    |
|          |        |                      |
--------------------------------------------
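a minimal Python sketch of how the hardware could use the leading address bit to choose between the two base/limit register pairs; the function name, the AddressFault exception, and the sample register values are assumptions for illustration

```python
STACK_BIT = 1 << 31  # leading bit of the 32 bit address: 0 = code-data-heap, 1 = stack


class AddressFault(Exception):
    """Raised when an address is outside its piece's limit (hypothetical name)."""


def translate(addr, base_cdh, limit_cdh, base_s, limit_s):
    """Pick the register pair from the leading bit, check the limit, add the base."""
    if addr & STACK_BIT:
        offset = addr - STACK_BIT        # the stack piece starts at 2^31
        if offset >= limit_s:
            raise AddressFault(f"stack offset {offset} >= limitS {limit_s}")
        return base_s + offset
    if addr >= limit_cdh:
        raise AddressFault(f"address {addr} >= limitCDH {limit_cdh}")
    return base_cdh + addr


# Hypothetical register contents loaded from the process table entry.
regs = dict(base_cdh=5000, limit_cdh=1000, base_s=9000, limit_s=200)
heap_ram = translate(0x10, **regs)            # code-data-heap reference
stack_ram = translate(STACK_BIT + 5, **regs)  # stack reference
```

because the stack is turned rightside up, a stack address is translated the same way as a code-data-heap address once the leading bit is removed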
when a program is to be executed, the memory manager finds two big enough holes for the two address space pieces
if the heap reaches its limit, copy the code-data-heap to a bigger hole and adjust baseCDH and limitCDH
if the stack reaches its limit, copy the stack to a bigger hole and adjust baseS and limitS
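the copy-and-adjust step for either piece can be sketched in Python as follows, with RAM modeled as a bytearray; move_segment and the sample addresses are hypothetical, and finding the bigger hole and returning the old one to the hole list are left out

```python
def move_segment(ram, base, limit, new_base, new_limit):
    """Copy a piece of `limit` bytes to a bigger hole; return the new (base, limit).

    The returned values are what the OS would store in the process table and
    load into baseCDH/limitCDH (or baseS/limitS) the next time the process runs.
    """
    if new_limit < limit:
        raise ValueError("new hole is smaller than the piece being moved")
    # The right-hand slice is copied before assignment, so overlap is harmless.
    ram[new_base:new_base + limit] = ram[base:base + limit]
    return new_base, new_limit


ram = bytearray(64)      # toy RAM
ram[8:12] = b"heap"      # a 4-byte piece at base 8 that has reached its limit
base_cdh, limit_cdh = move_segment(ram, 8, 4, 32, 16)
```

no address inside the program changes, because every address is still an offset from the base register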
but we still have three restrictions that limit the flexibility of a memory manager that uses a hole list (of variable sized holes) and must find two holes in that list to execute a program for a user
virtual memory implemented with paging eliminates the above three restrictions
here are some other concerns about RAM usage handled by virtual memory; the following are ways that an executing program, if loaded entirely in RAM when it begins executing, uses its RAM allocation inefficiently or wastefully
if a program can execute without all of its code and data in RAM all the time, the OS memory manager can use RAM more efficiently; in other words, a process can use its RAM allocation more efficiently or less wastefully
the following is explained in more detail in the "Paging Fact Sheet"
pages of code-data-heap and stack address spaces and page frames of RAM, all of size 2^12 = 4096 bytes
each process has two page tables inside OS RAM, one for code-data-heap and one for stack
OS hole list becomes list of free page frames
to run a process, the OS needs to load only some of the code-data-heap and stack pages into free page frames
address translation done by CPU hardware
            register         32 bit compiler
             number         generated address
                      1        19            12
-----------------------------------------------------
|          |        |   |              |            |
|  opcode  |   Rx   | y |  xxxxxxxxxx  |   xxxxxx   |
|          |        |   | page number  |   offset   |
-----------------------------------------------------
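a minimal Python sketch of the translation implied by the 1 + 19 + 12 bit layout above; it assumes that y = 0 selects the code-data-heap page table and y = 1 the stack page table (consistent with the stack starting at 2^31), and it models each page table as a dict from page number to page frame number, which is an illustration and not the real hardware format

```python
PAGE_BITS = 12                 # offset field: 2^12 = 4096 byte pages
PAGE_SIZE = 1 << PAGE_BITS
PAGE_NUMBER_BITS = 19          # page number field


class PageFault(Exception):
    """Raised when the referenced page is not in a page frame (hypothetical name)."""


def split(addr):
    """Split a 32 bit address into (y, page number, offset)."""
    y = addr >> 31
    page = (addr >> PAGE_BITS) & ((1 << PAGE_NUMBER_BITS) - 1)
    offset = addr & (PAGE_SIZE - 1)
    return y, page, offset


def translate(addr, cdh_table, stack_table):
    """Look up the page frame and glue the unchanged offset onto it."""
    y, page, offset = split(addr)
    table = stack_table if y else cdh_table   # assumption: y picks the page table
    if page not in table:
        raise PageFault((y, page))            # the OS page fault handler takes over
    return table[page] * PAGE_SIZE + offset


# Hypothetical tables: code-data-heap page 2 is in frame 7, stack page 1 in frame 3.
cdh_table, stack_table = {2: 7}, {1: 3}
ram_addr = translate(0x00002010, cdh_table, stack_table)
```

here 0x00002010 is page 2, offset 0x10 of code-data-heap, so it maps to 7 * 4096 + 16 = 28688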
page fault handled by OS software (page fault interrupt handler)
what if no free page frames on free page frame list?
local replacement, fixed allocation
global replacement, variable allocation
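a rough Python sketch of what the page fault handler does when the free page frame list is empty, contrasting local and global replacement; choosing the oldest resident page as the victim (FIFO) is an assumption made only to keep the sketch short, and the names and data structures are hypothetical

```python
from collections import deque


def handle_page_fault(pid, page, free_frames, resident, policy="global"):
    """Return the page frame that will hold (pid, page).

    free_frames: list of free page frame numbers
    resident:    deque of (pid, page, frame) in load order, oldest first
    policy:      "local"  = victim comes from the faulting process only
                 "global" = victim may come from any process
    """
    if free_frames:
        frame = free_frames.pop()
    else:
        if policy == "local":
            candidates = [r for r in resident if r[0] == pid]
        else:
            candidates = list(resident)
        if not candidates:
            raise RuntimeError("no page frame available to replace")
        victim = candidates[0]     # oldest candidate (FIFO, an assumption)
        resident.remove(victim)    # a real OS would write back a modified page
        frame = victim[2]
    resident.append((pid, page, frame))
    return frame
```

with local replacement a process keeps the same number of page frames (fixed allocation); with global replacement a faulting process can take a frame from another process, so allocations vary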
external versus internal fragmentation
spatial and temporal locality
TLB (translation lookaside buffer)
factors affecting choice of page size
swapping area on disk
home page:
http://elvis.rowan.edu/~hartley/index.html
e-mail:
hartley@elvis.rowan.edu