- Don’t presume this study has anything to do with programming. They measured an agent’s ability to search long conversations, not code.
> We evaluate on a 116-question representative subset of the LongMemEval benchmark (Wu et al., 2025), which tests an agent’s ability to answer questions over long conversations spanning multiple sessions.
by alexrigler
0 subcomment
- Combining regex filtering with semantic ranking using multi-vector embeddings has yielded good results for me. I use ColGREP from the LightOn team as a daily driver - https://github.com/lightonai/next-plaid/blob/main/colgrep/RE...
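For anyone wondering what "regex filter, then rank" looks like in practice, here is a minimal sketch. It is not ColGREP's API: `filter_then_rank` and `token_overlap` are hypothetical names, and the token-overlap scorer is only a stand-in for a real multi-vector embedding scorer, which you would plug in through the `score` argument.

```python
import re
from typing import Callable, Iterable, List


def token_overlap(query: str, text: str) -> float:
    """Toy relevance score: fraction of query tokens found in the text.

    Stand-in (assumption) for a real embedding-based scorer.
    """
    q = set(re.findall(r"[a-z0-9]+", query.lower()))
    t = set(re.findall(r"[a-z0-9]+", text.lower()))
    return len(q & t) / len(q) if q else 0.0


def filter_then_rank(
    lines: Iterable[str],
    pattern: str,
    query: str,
    score: Callable[[str, str], float] = token_overlap,
    top_k: int = 5,
) -> List[str]:
    """Keep lines matching the regex, then order them by the scorer."""
    rx = re.compile(pattern)
    candidates = [ln for ln in lines if rx.search(ln)]
    # sorted() is stable, so ties keep their original order.
    ranked = sorted(candidates, key=lambda ln: score(query, ln), reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    code = [
        "def load_config(path):",
        "def save_config(path, data):",
        "class Loader:",
    ]
    for hit in filter_then_rank(code, r"^def ", "load config from path"):
        print(hit)
```

The regex narrows the candidate set cheaply, so the more expensive scorer only runs on lines that already match.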
by stephantul
0 subcomment
- This paper's title oversells. What is Chronos, which embedding model was used, which reranker, how was the reranking done, and why is Chronos so much better than Claude Code?
- I recently watched the new Palantir + Kirkland & Ellis fund formation platform demo, and I was surprised to see how effective the union of structured data was in an agent harness. We're used to dealing with flat files, and here we're comparing basic ways of searching what are essentially long strings, but using Palantir's "Ontology" graph framework, I think Kirkland is going to be able to achieve some exceptional and differentiating outcomes in legal tech. The whole idea assumes that they've got great structured data already, and perhaps that's the real valuable unknown, but giving an agent those tools is super powerful.
I wrote about it[1] and came away with a different view on both Palantir and the future of agentic workflows personally.
[1] sorry, LinkedIn: https://www.linkedin.com/pulse/fund-managements-killer-app-d...
- This is a surprising result. With structured inputs like source code, I’d expect grep to outperform semantic search, but natural language’s errors and inconsistencies seem to leave so many cracks for information to fall through.
- I have always used traditional grep to search codebases. It serves me better than an IDE when there’re lots of scattered and frequent queries.
grep’s design has held up surprisingly well, exceeding expectations to this day.
by jeffchuber
4 subcomments
- If you are truly bitter-lesson pilled - give the agent all the tools and let it decide which to use.
- regex (grep)
- hybrid search (bm25+vector)
this X vs Y is uninteresting when the answer can be both.
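A rough sketch of the "give it both" idea, assuming a tool registry the agent can pick from by name. Everything here (`TOOLS`, `regex_search`, `bm25_search`) is hypothetical naming, and only the BM25 half of hybrid search is shown; a vector scorer would be fused in alongside it.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def regex_search(docs: List[str], pattern: str) -> List[int]:
    """grep-like tool: indices of documents matching the regex."""
    rx = re.compile(pattern)
    return [i for i, d in enumerate(docs) if rx.search(d)]


def bm25_search(docs: List[str], query: str, k1: float = 1.5,
                b: float = 0.75, top_k: int = 3) -> List[int]:
    """Ranked tool: indices of the best BM25 matches (score > 0 only)."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    avg_len = sum(len(t) for t in tokenized) / n if n else 0.0
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scored = []
    for i, toks in enumerate(tokenized):
        tf = Counter(toks)
        s = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            s += idf * tf[term] * (k1 + 1) / norm
        scored.append((s, i))
    ranked = sorted(scored, key=lambda p: (-p[0], p[1]))
    return [i for s, i in ranked if s > 0][:top_k]


# The agent sees both tools and chooses per query.
TOOLS: Dict[str, Callable[..., List[int]]] = {
    "regex": regex_search,
    "bm25": bm25_search,
}

if __name__ == "__main__":
    docs = [
        "I moved to Berlin in March",
        "My cat is named Miso",
        "Error code E1234 appeared twice",
    ]
    print(TOOLS["regex"](docs, r"E\d{4}"))
    print(TOOLS["bm25"](docs, "what is my cat named"))
```

Exact identifiers like an error code are a natural fit for the regex tool, while loosely worded questions go to the ranked one, so neither has to win outright.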
by hmokiguess
4 subcomments
- Tangential: I have a hook that rewrites grep to rg, but lately I wonder if this is actually wasteful, since the model is so biased toward grep. Is there a way to shim/alias it, perhaps?
- Is <blank> the only ML paper title?
- I'm curious to see what patterns it's grepping.
- Feels important, but I wish they had also compared against something like MeiliSearch or Algolia.
- Surely 'strings' would be even better?
by greenavocado
0 subcomment
- This has been posted before, but a dead-simple pattern that helps enormously with steering the model to the right code area is a DESIGN.md that it creates, updates, and references periodically.