- Write a .claude/commands/review.md. Simple but deprecated.
- Use a /code-review skill, either one you install or one you just write yourself (it's just Markdown, after all).
- Use the /pr-review subagent. Also just Markdown, but it runs "in the background" and "in parallel", so it must be better, I guess.
- Install the /code-review plugin. This just installs the skills and subagents above.
- Simply ask Claude to review the code. Probably works almost as well as the above in most situations.
They are all just variations of "insert a canned prompt", varying only along the dimensions of (a) how and where the prompt is installed and from where it is sourced, and (b) which context or contexts the prompt runs in. There's not much advice here about which option is best, and no clear best practices seem to have emerged yet either. Personally, I find just asking Claude to review the code works well enough.
Some of the advice here is also off. For example:
"Install a language server plugin. Type errors and unused imports caught after every edit. Highest-impact plugin you can install."
I work mostly with Rust, Python, and Dart, and followed similar advice, installing LSPs for all three in both Claude Code and Codex. Two months later, after heavy development in all three languages and hundreds of sessions - and frequently running out of RAM due to all the Rust analyzer, Dart analysis server, and Ty LSP servers the harnesses were spinning up - I checked the session logs to see how often the agents were actually invoking the LSP tools. The answer was they had invoked them literally once the entire time. I uninstalled all my LSPs and haven't looked back. The agents do just fine using ripgrep and calling cargo clippy, dart analyze, ty check, etc. themselves.
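For anyone who wants to run the same check on their own logs, here is a rough sketch. It assumes the harness stores sessions as JSONL files in which a record may carry a `message.content` list whose items have `"type": "tool_use"` and a `"name"` field. That layout, the log directory, and the helper names `count_tool_uses` and `count_in_directory` are assumptions for illustration, so adjust them to whatever your logs actually contain.

```python
import json
from collections import Counter
from pathlib import Path


def count_tool_uses(lines):
    """Count tool invocations by name in an iterable of JSONL lines.

    Assumed record shape (not verified against any particular harness):
    {"message": {"content": [{"type": "tool_use", "name": "..."}]}}
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue  # plain-string content carries no tool calls
        for item in content:
            if isinstance(item, dict) and item.get("type") == "tool_use":
                counts[item.get("name", "<unnamed>")] += 1
    return counts


def count_in_directory(root):
    """Sum the counts over every *.jsonl file under root (hypothetical location)."""
    total = Counter()
    for path in Path(root).expanduser().rglob("*.jsonl"):
        with path.open(encoding="utf-8", errors="replace") as handle:
            total.update(count_tool_uses(handle))
    return total
```

Calling `count_in_directory(...)` on the directory where your harness keeps its session files and looking at `.most_common()` should show quickly whether any LSP-related tool names appear at all.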
- threats of physical harm directly against Claude
- threats of prison for the entire board of directors of Anthropic
- an explanation of how every time it goes off the rails or makes mistakes, it adds more evidence to a class action lawsuit against Anthropic
The latter two especially seem to have improved its "behaviour", making it more "careful" and "deliberate".
Still... I'm not ready to give it more autonomy. Even though it gets high-level things quite well, I still look at the code, give feedback, and go through 3-4 rounds of tweaks until I'm happy with it, and also happy that I still feel I have a good handle on the codebase.
In Claude I use /branch and /rename a lot (context checkpoints, fork, go back)
I use sandboxing almost exclusively: https://github.com/nix-tools/bubblebox -- it's a generalisation of Numtide's claudebox with a few fixes and some feature additions (more coming). This is best compared to always running your Claude in Docker containers, except there's no Docker runtime. Works fine in WSL and nix-darwin, too.
```
# Development Workflow

*Always use `bun`, not `npm`.*

# 1. Make changes

# 2. Typecheck (fast)
bun run typecheck

# 3. Run tests
bun run test -- -t "test name"   # Single suite
bun run test:file -- "glob"      # Specific files

# 4. Lint before committing
bun run lint:file -- "file1.ts"
bun run lint

# 5. Before creating PR
bun run lint:claude && bun run test
```
I have these things in pre-commit; this way the targets are always run and the agent is forced to fix whatever fails (I ask Claude to commit the changes). The agents are erratic and very often skip these steps. Anything that can be deterministic I keep as scripts.
Regarding commits: both Codex and Claude are terrible at writing commit messages. I have this in my user CLAUDE.md:
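One way to make those steps deterministic is a Git `pre-commit` hook that runs each check in order and aborts the commit on the first failure. A minimal sketch, assuming the `bun run typecheck`, `bun run lint`, and `bun run test` scripts from the workflow above exist in `package.json`. The `CHECKS` list and `run_checks` helper are hypothetical names, and this is not necessarily how the hook described here is written.

```python
#!/usr/bin/env python3
"""Sketch of a .git/hooks/pre-commit script (the file must be executable)."""
import subprocess
import sys

# Assumed script names, taken from the workflow snippet above.
CHECKS = [
    ["bun", "run", "typecheck"],
    ["bun", "run", "lint"],
    ["bun", "run", "test"],
]


def run_checks(checks, run=subprocess.run):
    """Run each command in order; return the first non-zero exit code, else 0."""
    for command in checks:
        result = run(command)
        if result.returncode != 0:
            print(f"pre-commit: {' '.join(command)} failed", file=sys.stderr)
            return result.returncode
    return 0


# In the real hook, finish with: sys.exit(run_checks(CHECKS))
```

Git aborts the commit whenever the hook exits non-zero, so the agent gets the failure output and has to fix it before the commit goes through. The `run` parameter is only there so the logic can be exercised without `bun` installed.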
````
Pattern: `type(scope): message` where type is `fix`, `feat`, `chore`, `docs`, `refactor`, or `style`; scope marks what is affected; message is a short lowercased description.

Keep subject and body lines under 72 characters. Always write a body explaining what, how, and why in continuous human-readable text. For fixes include the error message being fixed. No first-person speech. Re-read the actual git diff before writing; the message must describe what changed, not what was planned.

Use following command to create commit:

```bash
git commit -F - <<'EOF'
type(scope): subject line

Body paragraph explaining what, how, and why.
EOF
```
````
Without it, the agent would write the body as a single long sentence; when asked to wrap the lines, it would just insert literal \n sequences, which were not interpreted as newlines and ended up in the message as plain characters.
Another thing I find helpful is VOCABULARY.md. Very often the agent would assume (connect?) a different thing than what I had in mind. With VOCABULARY.md I make sure that when I say "thing", Claude and I both have the same "understanding" (connection?) of what "thing" is.
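The same rules can also be checked mechanically, for example from a `commit-msg` hook, instead of relying on the prompt alone. A rough sketch under the rules quoted above; `check_commit_message`, the regex, and the wording of the error strings are illustrative assumptions, and the lowercase check only looks at the first character of the message.

```python
import re

ALLOWED_TYPES = ("fix", "feat", "chore", "docs", "refactor", "style")
# type(scope): message, where the message does not start with an uppercase letter
SUBJECT_RE = re.compile(r"^(%s)\([^()\s]+\): [^A-Z\s].*$" % "|".join(ALLOWED_TYPES))
MAX_LINE = 72  # lines must stay under this length


def check_commit_message(text):
    """Return a list of rule violations; an empty list means the message passes."""
    problems = []
    lines = text.splitlines()
    subject = lines[0] if lines else ""
    if not SUBJECT_RE.match(subject):
        problems.append("subject must look like 'type(scope): message'")
    if "\\n" in text:
        # a backslash followed by 'n' typed as characters, not a real line break
        problems.append("literal \\n found; use real line breaks")
    if not any(line.strip() for line in lines[1:]):
        problems.append("body is missing")
    for number, line in enumerate(lines, start=1):
        if len(line) >= MAX_LINE:
            problems.append(f"line {number} is not under {MAX_LINE} characters")
    return problems
```

In a `commit-msg` hook, Git passes the path of the message file as the first argument, so the hook would read that file, print whatever `check_commit_message` returns, and exit non-zero if the list is not empty.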
I always get the best results when I have live feedback with it.
Also, this stuff feels like alchemy to me. I bet some of you have the same feeling.
So what’s the recommendation for Claude to have a feedback loop?
Because it’s not what follows in the article: _“Explore, then plan, then code.”, “Use plan mode…”, “Reference, do not describe.”_
By this I mean there are some system prompts that make Claude very concerned about your autonomy.
I think in the future this type of system prompt will be embedded to force people to think a little.
/Frontend /API /ETL /DatabaseScripts
What's the best way to organize this so Claude Code can work efficiently?
I found this one: do you guys know of anything else?
Generally, and more so with paid products, one should expect to get something that is ready to be used, tuned by whoever is selling it to the best of their ability. Instead, this is basically saying that the product is not much more than an empty box, and that it is your responsibility to augment it with third-party plugins and Markdown files that finally make it useful. And you had better select the skills you install carefully: you don't want to end up with second-tier material made by GithubInfluencerA, you definitely need the work of GithubInfluencerB.
In the end, this is what gives companies fuel to keep the hype running, because it lets them counter every possible argument or doubt about the technology, especially the ones made in good faith. No matter the problem you're facing, the blame is definitely on you, the user, for not setting up the tool in the right way.
I'm struggling in a lot of ways to accept LLMs, but if I ever become completely sold on them and take this technology seriously, it won't be before this mood has gone away.
Also, how is "Explore, then plan, then code" considered "beyond the basics"?
Do yourself a favor and try Codex. Then do yourself an even bigger favor and try composer 2.5 from Cursor. It's a night-and-day difference. You don't even have time to get distracted; you stay in the zone.
Beyond the issue of AI serfdom, I just don’t want so much of my workflow to depend on “some other company.”
This whole setup is basically setting you up to have all your projects in a Claude SaaS lock-in.
I also think if AI was actually smart it wouldn’t need so much handholding. I don’t want to spend my time developing skills and writing markdown files to try to get this dumb thing to write code for me. Why isn’t the AI reading the codebase and understanding what to do?
Because it’s artificial, that’s why.
Their conclusion: environment-layer containment first, then model-layer steering. CLAUDE.md is the right configuration layer but it is not a containment layer. Worth thinking about whether your worst case is a lost afternoon or a lost database and all backups deleted, too: https://safebots.ai/compromise.html
But the more important point is the cost. People are starting to realize just how costly it can be to run agents without precomputing and caching: https://safebots.ai/costs.html and self-orchestrating agents can cost up to 1000x as much: https://safebots.ai/kimi.html
This is also how you get a slop codebase that you won’t easily understand.
It becomes a labyrinth that only the agent knows. It’s not a catastrophe when you’re making prototypes or projects like you see on X.
But if you are expanding your codebase or trying to build something more professional and maintainable, I find it important to explicitly spec things bit by bit, so I can understand the code and somewhat keep my own writing style in the codebase. But this is only productive when you have a fast model; otherwise it breaks your train of thought while you wait for the output.
If the model is slow, delegation is probably the only way.