FRESH Hacker News
Google's TurboQuant AI-compression algorithm can reduce LLM memory usage by 6x
8 points by geoffbp