Hacker News

Prompt caching for cheaper LLM tokens

292 points by samwho

by est

1 subcomment

by Havoc

2 subcomments

Does anyone know whether the cache is segregated by user/API key for the big providers?
I was looking at modifying outgoing requests via a proxy and wondering whether that harms caching. Common coding tools presumably share a prompt across all their installs, so a universal cache would save a lot.

by duggan

0 subcomments

It was a real facepalm moment when I realised we were busting the cache on every request by including the date and time near the top of the main prompt.
Even just moving it to the bottom helped move a lot of our usage into cache.
Probably went from something like 30-50% cached tokens to 50-70%.
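
A minimal sketch of why the timestamp position matters, assuming the provider caches on an exact shared prefix. The prompt text, the function names, and the use of characters as a stand-in for tokens are all illustrative assumptions, not any provider's actual API or tokenizer.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical stand-in for a long, unchanging system prompt.
STATIC_INSTRUCTIONS = "You are a helpful assistant. Follow the house style. " * 40


def prompt_timestamp_first(now: datetime) -> str:
    # Volatile value at the top: everything after it differs between requests.
    return f"Current time: {now.isoformat()}\n{STATIC_INSTRUCTIONS}"


def prompt_timestamp_last(now: datetime) -> str:
    # Volatile value at the bottom: the static block stays a stable prefix.
    return f"{STATIC_INSTRUCTIONS}\nCurrent time: {now.isoformat()}"


def shared_prefix_len(a: str, b: str) -> int:
    # Number of leading characters the two prompts have in common
    # (a rough proxy for how many leading tokens a prefix cache could reuse).
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


if __name__ == "__main__":
    t1 = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    t2 = t1 + timedelta(seconds=1)  # the next request, one second later

    first = shared_prefix_len(prompt_timestamp_first(t1), prompt_timestamp_first(t2))
    last = shared_prefix_len(prompt_timestamp_last(t1), prompt_timestamp_last(t2))
    print(f"timestamp first: {first} shared chars")
    print(f"timestamp last:  {last} shared chars of {len(STATIC_INSTRUCTIONS)} static")
```

With the timestamp first, the two requests diverge within the first line, so almost nothing is reusable. With it last, the whole static block is shared.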

by willvarfar

2 subcomments

A really clear explanation!
So if I were running a provider, I would be caching popular prefixes for questions across all users. There must be so many questions that start with 'what is' or 'who was', etc.
Also, can subsequences in the prompt be cached and reused, or is it only prefixes? I mean, can you cache popular phrases that might appear in the middle of the prompt and reuse them somehow, rather than needing to iterate through them token by token? E.g. there must be lots of times that "and then tell me what" appears in the middle of a prompt.

by WillAdams

0 subcomments

When will Microsoft do this sort of thing?
It's a pain having to tell Copilot "Open in pages mode" each time it's launched, and then, after processing a batch of files, running into:
https://old.reddit.com/r/Copilot/comments/1po2cuf/daily_limi...

by holbrad

2 subcomments

I gave the table of inputs and outputs to both Gemini 3.0 flash and GPT 5.2 instant and they were stumped.
https://t3.chat/share/j2tnfwwful https://t3.chat/share/k1xhgisrw1

by dangoodmanUT

1 subcomment

by who-shot-jr

2 subcomments

by aitchnyu

1 subcomment

Took me a minute to see it is the same Ngrok that provided freemium tunnels to localhost. How did they adapt to the AI revolution?

by tomhow

6 subcomments

[under-the-rug stub]
[see https://news.ycombinator.com/item?id=45988611 for explanation]

by NooneAtAll3

2 subcomments

The blog starts loading and then displays a "Something Went Wrong. D is not a function" error.

by Youden

1 subcomment

Link seems to be broken: content briefly loads, then is replaced with "Something Went Wrong", then "D is not a function". It stays broken with adblock disabled.