GPU Memory Math for LLMs: Formula That Tells You What Fits on Your GPU
8 points by XMasterrrr
by DiabloD3
This isn't very useful.
The memory cost of context is not equal across models.
Also, Hugging Face tells you how big the exact model you have in hand is, so why the weird guesswork? Dynamic quants are not going to magically fit some formula.
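To make the context point concrete, here is a rough sketch, assuming a plain transformer KV cache with no cache quantization, paging, or sliding window. The two configs are made up for illustration and are not taken from any real model card; the function name and parameters are hypothetical too.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, bytes_per_elem=2):
    """Approximate KV cache size: one K and one V vector per layer, per KV head, per token.

    Assumes an unquantized cache (bytes_per_elem=2 means fp16/bf16).
    """
    return 2 * n_layers * n_kv_heads * head_dim * n_tokens * bytes_per_elem


GIB = 2**30

# Hypothetical config A: every attention head has its own K/V (32 KV heads).
a = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, n_tokens=8192)

# Hypothetical config B: same depth and head size, but K/V shared across groups (8 KV heads).
b = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, n_tokens=8192)

print(f"A: {a / GIB:.1f} GiB, B: {b / GIB:.1f} GiB for the same 8192 tokens")
```

Same context length, 4x difference in cache size, purely from how many KV heads the architecture uses. A single per-token constant can't cover both.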
by metadat
This is super useful. Most of the time I go to run a model off Hugging Face on my 64GB MBP I run into issues where I drastically overestimated what it could do. :>