FRESH Hacker News
KVarN: Native vLLM backend for KV-cache quantization by Huawei
51 points by theanonymousone