> Agentic search avoids those failure modes. There's no embedding pipeline or centralized index to maintain as thousands of engineers commit new code. Each developer's instance works from the live codebase.
The frame of "the way a software engineer would" and the conclusion seem at odds. I'd love to be schooled otherwise?
I use autocomplete/LSPs all the time and they're useful. That's an index? Why wouldn't Claude be able to use one? Also a "software engineer" remembers the codebase - that's definitely a RAG. I have a lot of muscle memory to find the file I need through an auto-completed CMD+P.
It doesn't need to particularly be real-time across thousands of engineers -- just the branch I'm on.
It's rare that I'd navigate a codebase by first-principles traversal. That would usually be in a new codebase, and in those cases it's definitely not what I'd call an optimal experience.
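For what it's worth, the kind of lightweight, per-branch index the parent is describing doesn't require an embedding pipeline at all. A minimal sketch, using Python's stdlib `ast` module to build a symbol-to-location map over a working tree (a toy illustration of the idea, not anything Claude Code actually ships):

```python
import ast
from pathlib import Path

def build_symbol_index(root: str) -> dict[str, list[tuple[str, int]]]:
    """Map every function/class name under `root` to (file, line) pairs."""
    index: dict[str, list[tuple[str, int]]] = {}
    for path in Path(root).rglob("*.py"):
        try:
            tree = ast.parse(path.read_text())
        except SyntaxError:
            continue  # skip files that don't parse on this branch
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                index.setdefault(node.name, []).append((str(path), node.lineno))
    return index

# Usage - the CMD+P-style jump the parent describes:
#   index = build_symbol_index("src/")
#   index.get("parse_config")  # jump straight to the definition(s)
```

Rebuilding this on checkout keeps it consistent with "just the branch I'm on" - no centralized infrastructure involved.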
Claude's initial approach was really poor. One has to wonder how many times Claude Code's output has to be modified/reviewed for improvement, or whether it is possible at all to produce good code with it.
Edited: Generalization: Claude can fix a localized, identifiable poor decision (e.g., "only reading first 40 lines") because the fault is discrete and traceable to one piece of code.
But real software quality problems often arise from many small, individually reasonable decisions that collectively produce bad outcomes. No single one is obviously "the fault." In that scenario, a tool that generates low-quality building blocks piecemeal may never converge on good code, because each piece seems fine in isolation.
Simple - it eats up to 35% of the five-hour usage limit on the first prompt, even on small projects, and then there's a 5-minute timeout for you to respond quickly, or the caches go bust and you'll pay another 12-15% on the next prompt.
I tried defining CLAUDE.md (or AGENTS.md), skills, and plugins, but I'm not getting the effectiveness others claim. With the LSP plugin, for example, CC doesn't use the LSP's symbol renaming and instead edits files one by one, slowly; or it doesn't invoke a skill even when I explicitly ask it to remember to invoke it whenever the prompt contains a specific cue.
Am I using it wrong? Is there a robust example harness I can copy?
What a strange comment for them to make. Why wouldn't I expect CC to work well with those languages? What languages would I associate it with? Python and JavaScript?
- runs the failing test | grep "x|failing" | tail -10
- runs the test again to get the why-it's-failing message | tail -10
- runs the test again because tail -10 cut off the message

Every time. What developer does things like this?!
I have a skill telling it not to do that: save the output of whatever test you run to a file, then read from the file using whatever commands you want. It ignores the skill.
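For reference, the run-once-read-many pattern the skill asks for is tiny to express. A sketch (the test command and log path here are stand-ins so the snippet runs anywhere; in practice it would be `pytest` or whatever the project uses):

```python
import os
import subprocess
import sys
import tempfile

LOG = os.path.join(tempfile.gettempdir(), "test.log")  # stand-in log path

# Stand-in for the real test command, so this sketch is self-contained:
cmd = [sys.executable, "-c", "print('1 passed'); print('FAILED test_x: boom')"]

# Run the suite ONCE, capturing stdout and stderr to a file...
with open(LOG, "w") as f:
    subprocess.run(cmd, stdout=f, stderr=subprocess.STDOUT)

# ...then read from the file as many times as needed - no reruns, no truncation.
failures = [line for line in open(LOG) if "FAILED" in line]
print(failures)
```

One test run, arbitrarily many greps/tails against the saved output.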
Same for debugging. Something is failing; instead of debugging the given issue and looking at the results to see why it's failing, it reads the code and tries to deduce the cause. The first trace it finds that looks suspicious? "THAT'S IT, I FOUND IT. But let me reconsider." And after 15 minutes it produces a summary that is wrong. Put a debug point, look at it, then make your decisions. It has a skill for debugging that is phrased to do exactly that! No. I've never seen a human do things like this either.
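The "put a debug point, look at it" step is one line in most runtimes. A Python sketch with a toy failure (my invented example, not from the thread):

```python
def lookup(cache, key):
    return cache[key]  # the failing call we want to understand

caught = None
try:
    lookup({}, "user:42")  # hypothetical failing input
except KeyError as err:
    caught = err
    # The "debug point": inspect live frames at the failure site instead of
    # re-reading source and guessing. Commented out so this runs non-interactively:
    # import pdb; pdb.post_mortem()
print(type(caught).__name__)
```

Dropping into `pdb.post_mortem()` shows the actual state that caused the failure, which is exactly the evidence that code-reading-only deduction skips.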
It's maddening. It's as if, puts on tinfoil hat, it's designed to waste your tokens, while eventually accomplishing its task.
So, if I've read this post correctly, that means that CC is navigating my codebase today by sending lots of it up to a model, and building an understanding. Is that correct? Did I misunderstand it?
I kinda suspected there was more local inference going on somehow -- partly because the iteration times are fairly fast.
You need a code dependency graph: https://github.com/roboticforce/remembrallmcp Ask "what breaks if I change this?"
It saves 98% of token usage and 95% of tool calls.
It runs as an MCP server and supports 8 languages.
It just works - you need to try it.
The article really does not align with current sentiment. Everyone with a choice has mostly moved on to Codex (of course, in this world all it takes is a model or harness update to turn things around).
CC is great at a lot of things, but it repeatedly misses reading crucial parts of the codebase, hallucinates about work that was done, and has a bunch of other issues.
I mean: If there was something you could add to the prompt to consistently increase performance why isn't it in the system prompt already?
If it's all about clarifying a couple of local idiosyncrasies, shouldn't it be able to quickly get them by looking through the repo?
Does anyone have an example of a CLAUDE.md that really makes a difference for them?
In general, this article would have profited massively from examples of good applications of those patterns.
Are there any much more detailed walkthroughs of how it works and how it decides the tools to use and the grep to use etc and what the conversations actually look like?
In the UI you see just enough to know it’s doing something but you don’t really see the jumps it’s making offscreen.
The important distinction: CLAUDE.md will not shape how the model understands your architecture. Rather, it will prevent certain kinds of regressions from happening. "Never create a user without calling the workspace provision step" is the right constraint. "This is how our entire system works" is not - the model learns that from the codebase.
The mistake is writing constraints based on an architecture constructed with slop. The sequence is important here.
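To make that concrete, a hypothetical CLAUDE.md fragment in that spirit (the project details are invented for illustration):

```markdown
# CLAUDE.md

## Constraints (things the codebase can't teach you)
- Never create a user without calling the workspace provision step first.
- All money amounts are integer cents; never introduce floats.
- Run `make migrate` for schema changes; never invoke the migration tool directly.

<!-- Deliberately absent: any "how our entire system works" narrative. -->
```

Each line is a guardrail against a known regression, not architecture lore.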
A post like this should be providing people with some reassurance about Claude's ability to understand code at a large scale. It's mostly fluff.
Edit: so I did some googling to dig around for thoughts on LSP performance and integration. The author of Bun has a tweet saying that they are a big drag on performance for no real gain, and virtually all of the replies agree. Anyone else have any experience/thoughts?
Meanwhile we are still waiting for these statements to come true:
https://eu.36kr.com/en/p/3648851352018565
https://www.businessinsider.com/anthropic-ceo-ai-90-percent-...
https://www.reddit.com/r/Anthropic/comments/1nemhxb/futurism...
https://medium.com/@coders.stop/dario-amodei-said-90-of-code...
https://www.youtube.com/shorts/0j1HqEEDThc
Accountability, anyone?