Hacker News

KVarN: Native vLLM backend for KV-cache quantization by Huawei

51 points by theanonymousone

by throwa356262

3 subcomments

Better performance than TQ and better quality than FP16?
Am I reading this right??

by v3ss0n

2 subcomments

by shockembopper

0 subcomments