-
Notifications
You must be signed in to change notification settings - Fork 0
Memory Manager
The current memory manager is in py/gc.c
and is almost a textbook implementation of mark-sweep GC.
Resources:
- wikipedia article on Trace/sweep garbage collection
When running for an extended period of time (usually after allocating large buffers), the heap can become very fragmented, and although there is enough free memory for an allocation, the free memory is not contiguous (all in a row). Such a problem severely limits uPy's use in critical and long-running applications (eg a web server, a quadcopter). Code running on a microcontroller needs to be a lot more robust than desktop software running on a PC, and so special care must be taken to design a robust memory allocator.
It's not clear how to improve uPy's current memory allocator so that fragmentation is reduced. Ultimately I would want to see a solution that completely eliminates fragmentation, so that programs are guaranteed to run indefinitely. Note that heap fragmentation is not because of the use of garbage collection: even with reference counting, or explicit free, you still have the same amount of fragmentation. The 2 ideas I have to eliminate fragmentation are: 1) a copying memory manager/garbage collector, that can move parts of the heap around so as to make contiguous regions available; 2) a fixed-block memory manager than allocates a single block size (eg 16 bytes), and if you want more you must chain multiple blocks (like a filesystem). Both of these ideas are difficult to implement, and will come with a speed and memory cost. But the cost will be worth it if we can get guaranteed memory allocation.
The main functions that a user needs are:
gc_alloc
-- malloc with gc
gc_free
-- free with gc
These are built on top of a memory structure using Allocation Table Bytes (ATBs)
Memory is split up into 4 Allocation Tables. Every Allocation Table has a ATB
which stands for "Allocation Table Byte". This is a single byte containing 4 sets of the following
// 0b00 = FREE -- free block
// 0b01 = HEAD -- head of a chain of blocks
// 0b10 = TAIL -- in the tail of a chain of blocks
// 0b11 = MARK -- marked head block
These are known as ATB_0
through ATB_3
and have several C methods (i.e. functions and macros) to access their attributes. These include:
BLOCKS_PER_ATB -- The number of ATB's that fit in an Allocation Table ATB_MASK_N -- Get the relevant bytes for ATB table N
ATB_N_IS_FREE(a) -- Determine whether table N is currently free
What do these do? Why are they useful??? BLOCK_SHIFT(block) ATB_GET_KIND(block) ATB_ANY_TO_FREE(block) ATB_FREE_TO_HEAD(block) ATB_FREE_TO_TAIL(block) ATB_HEAD_TO_MARK(block) ATB_MARK_TO_HEAD(block)
BLOCK_FROM_PTR(ptr) PTR_FROM_BLOCK(block) ATB_FROM_BLOCK(bl)
- Is there documentation for the above methods? I think that would help me understand what they do
- How is the memory structured? How is it gotten? Does python take out an array of data, or does it take out single elements at at time? I am having trouble understanding how this memory manager is supposed to be used.
- How many pointers can the memory manager handle at once (on your reference implementation, I believe 128k of memory)? Does the size of the data being requested matter? How does it deal with fragmentation? For instance, if the system asks for a few 100byte arrays and a few 5byte structs one after another, and then frees the 100byte arrays -- how does it handle the fragmentation? Is there a memory defragmenter?