https://screwjankgames.github.io/engine%20programming/2020/0...
https://www.bytesbeneath.com/p/the-arena-custom-memory-alloc...
I also don't know how much we want to butcher this blog post, but:
> RAM is fundamentally a giant array of bytes, where each byte has a unique address. However, CPUs don’t fetch data one byte at a time. They read and write memory in fixed-size chunks called words which are typically 4 bytes on 32-bit systems or 8 bytes on 64-bit systems.
CPUs these days fetch entire cache lines. Memory is also split into banks. There are many more details involved, and it is viewing memory as a giant array of bytes that is fundamentally broken. It's a useful abstraction up until some point, but it breaks apart once you analyze performance. This part of the blog didn't seem very accurate.
My remarks:
1) in calloc() you correctly check size*n against SIZE_MAX, but in multiple other spots you don't check size+META_SIZE.
2) the field is_mmap is useless: it can be replaced by (size>=MMAP_THRESHOLD) practically everywhere. The only corner case where this doesn't work is a large block initially backed by mmap() that's then shrunk via realloc() to under the threshold. But realloc() has other inconsistencies anyway, see next point.
3) realloc() shows the signs of a refactoring gone wild. The first if on block->size lacks a test on is_mmap, as split_block() doesn't seem to do the right thing with mmapped blocks...
4) free_list does not in fact track free nodes, as its name suggests, but all nodes, whether they are free or not. Wouldn't it be better to add a node to the list only when it's freed? I leave to you to iron out all the corner cases!
A couple of minor C points:
- The code seems to completely lack use of `static` for things that should be local to the implementation, such as `META_SIZE`, `find_free_block()` and others.
- The header includes `<stdbool.h>` but the interface doesn't use it so it could be included in the C file instead (which, in fact, it is!).
- Could do with more `const` for clarity, but that is quite personal.
- Interesting to use explicit comparison to check for integers being zero, but treat pointers being NULL as implicit boolean. I prefer comparing in both cases.
One thing worth calling out (building on what canyp mentioned) is that once you start caring about allocator performance, the conceptual model shifts away from “RAM as a byte array” toward “cache lines + page boundaries + TLB behavior”. A surprising amount of allocator behavior is dictated by how predictable your access patterns are to the L1/L2 caches rather than the actual heap layout.
In a previous toy allocator I built, the biggest wins came from aligning allocations to cache-line multiples and keeping metadata in a separate, tightly packed region. It reduced false sharing and made fragmentation surprisingly easier to reason about.
Curious whether you plan to explore thread safety next via per-thread arenas or a lock-free free-list - both approaches are fun rabbit holes.
https://pubs.opengroup.org/onlinepubs/7908799/xsh/brk.html
> The behaviour of brk() and sbrk() is unspecified if an application also uses any other memory functions (such as malloc(), mmap(), free()). Other functions may use these other memory functions silently.
Since this is all allocator.h[0] contains aside from other include statements, why have allocator.h at all?
0 - https://github.com/t9nzin/memory/blob/main/include/allocator...
I don't think that's the case here though.
No need to free in short living processes