Hacker News
KVarN: Native vLLM backend for KV-cache quantization by Huawei
51 points by theanonymousone
by throwa356262
3 subcomments
Better performance than TQ and better quality than FP16?
Am I reading this right??
by v3ss0n
2 subcomments
Why is this not a PR for vLLM?