Spaces takes a different approach. It uses 64KB-aligned slabs, and the metadata lookup is just a pointer mask (ptr & ~0xFFFF).
The trade-off is that every free() incurs an L1 cache miss to read the slab header, and there is a 64KB virtual memory floor per slab. But in exchange, you get zero-external-metadata regions, instant teardown of massive structures like ASTs, and performance that surprisingly keeps up with jemalloc on cross-thread workloads (I included the mimalloc-bench scripts in the repo).
It's Linux x86-64 only right now. I'm curious if systems folks think this chunk API is a pragmatic middle ground for memory management, or if the cache-miss penalty on free() makes the pointer-masking approach a dead end for general use.
```c ((PageSize) (chunk->pageSize - ((PageSize) ((PageSize) ((PageSize) (sizeof(Page) + (sizeof(struct _Block))) + (PageSize) ((sizeof(double)) - 1u)) & ((PageSize) (~((PageSize) ((sizeof(double)) - 1u)))))) - ((PageSize) ((PageSize) ((PageSize) ((sizeof(FreeBlock) + sizeof(PageSize))) + (PageSize) (((((sizeof(double)) > (4)) ? (sizeof(double)) : (4))) - ```
Edit: Homie. Why is bench.sh fetching external resources? Call me old fashioned, but it would be nice if when I cloned the repository (and checked out any submodules that may exist), I've got everything I need, right there.
I don't see how that could possibly be true. Sounds like a low-ball estimate.
Also i wish to point out that the "tcmalloc" being used as a baseline in these performance claims is Ye Olde tcmalloc, the abandoned and now community-maintained version of the project. The current version of tcmalloc is a completely different thing that the mimalloc-bench project doesn't support (correctly; I just checked).