FRESH Hacker News
Cutting inference cold starts by 40x with LP, FUSE, C/R, and CUDA-checkpoint
89 points by charles_irl