You need to be reasonably experienced and guide it.
First, you need to know that Claude will create nonsensical code. On a macro level it's not exactly smart; it just has a lot of contextual static knowledge.
Debugging is not its strongest skill. Most models don't do well at it at all. Opus is able to one-shot "troubleshooting" prompts occasionally, but there's a high probability that it veers off on a tangent if you just tell it to "fix things" based on errors or descriptions. You need to have an idea of what you want fixed.
Another problem is that it can create very convincing-looking, but stupid, code. If you can't guide it, that's almost guaranteed. It can create code that's totally backwards and overly complicated.
If it IS going off on a wrong tangent, it's often hopeless to get it back on track. The conversation and context might be polluted. Restart, reframe the prompt and the problems at hand, and try again.
I'm not totally sure about the language you are using, but syntax errors typically happen if it "forgets" to update some of the code, and very seldom just in a single file or edit.
I like to create a design.md and think a bit on my own, or maybe prompt to create it from a high-level problem statement to get going, and make sure it's in the context (and mentioned in the prompts).
While this may be possible, it likely requires a very detailed prompt and/or spec document.
---
Here is an example of something I successfully built with Claude: https://rift-transcription.vercel.app
Apparently I have had over 150 chat sessions related to the research and development of this tool.
- First, we wrote a spec together: https://github.com/Leftium/rift-transcription/blob/main/spec...
- The spec broke down development into major phases. I reviewed detailed plans for each phase before Claude started. I often asked Claude to update these detailed plans before starting. And after implementation, I often had to have Claude fix bugs in the implementation.
- I tried to share the chat session where Claude got the first functional MVP working: https://opncd.ai/share/fXsPn1t1 (unfortunately the shared session is truncated)
---
"AI mistakes you're probably making": https://youtu.be/Jcuig8vhmx4
I think the most relevant point is: AI is best for accelerating development tasks you could do on your own; not new tasks you don't know how to do.
---
Finally: Cloudflare builds OAuth with Claude and publishes all the prompts: https://hw.leftium.com/#/item/44159166
You haven't even said what programming language you're trying to use, or even what platform.
It sounds to me like you didn't do much planning, you just gave it a prompt to build away.
My preferred method of building things, and I've built a lot of things using Claude, is to have a discussion with it in the chatbot. The back and forth of exploring the idea gives you a more solid idea of what you're looking for. Once we've established the idea I get it to write a spec and a plan.
I have this as an instruction in my profile.
> When we're discussing a coding project, don't produce code unless asked to. We discuss projects here, Claude Code does the actual coding. When we're ready, put all the documents in a zip file for easy transfer (downloading files one at a time and uploading them is not fun on a phone). Include a CONTENTS.md describing the contents and where to start.
So I'll give you this one as an example. It's a Qwen-driven system monitor.
https://github.com/lawless-m/Marvinous
Here are the documents generated in chat before trying to build anything:
https://github.com/lawless-m/Marvinous/tree/master/ai-monito...
At this point I can usually say "The instructions are in the zip, read the contents and make a start." and the first pass mostly works.
It seems you try to tell the tool to do everything in one shot. That is a very wrong approach, not just with Claude but with everything (you ask a woman for a date, and if you do not get laid in five minutes you failed?). When I program something manually and it compiles, I expect it to be wrong. You have to iron it out and debug it.
Instead of that:
1. Divide the work into independent units. I call these "steps".
2. Subdivide steps into "subsets". You work in an isolated manner on those subsets.
3. Use an immediate-mode GUI library like Dear ImGui to prototype your tool. Translating it into something else once it works is quite easy for LLMs.
4. Visualize everything. You do not need to see the code, but you need to visualize every single thing you ask it to do.
5. Tell Claude what you want and why you want it, and update the documentation constantly.
6. Use git in order to make rock-solid steps that Claude will not touch once they work, so you can revert changes or ask the AI to explore a branch, explaining how you did something and that you want something similar.
7. Do not modify code that already works rock solid. Copy it into another step, leaving the original step as a reference, and modify it there.
8. Use logs. Lots of logs. For every step you create text logs, and you debug the problems by giving Claude the logs to read (see the sketch after this list).
9. Use screenshots. Claude can read screenshots. If you visualize everything, Claude can see the errors too.
10. Use asserts, lots of asserts, just like with manual programming.
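A rough sketch of the logs-plus-asserts idea in Python (the step name, log file name, and process_step function are made up for illustration, not taken from the comment above):

    import logging

    # one text log file per step, so Claude can be given exactly the log that matters
    logging.basicConfig(
        filename="step_03_parse_input.log",
        level=logging.DEBUG,
        format="%(asctime)s %(levelname)s %(message)s",
    )

    def process_step(records):
        logging.info("step 3 started with %d records", len(records))
        cleaned = [r.strip() for r in records if r.strip()]
        # assert the invariants you expect, just like with manual programming
        assert cleaned, "step 3 produced no records - upstream step is broken"
        logging.info("step 3 produced %d cleaned records", len(cleaned))
        return cleaned

    process_step(["  alpha ", "beta", "   "])

When something breaks, you hand Claude the log file for that step instead of describing the failure from memory.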
It is not that different from managing a real team of people...
The LLM under the hood is essentially a very fancy autocomplete. This always needs to be kept in mind when working with these tools. So you have to focus a lot on what the source text is that’s going to be used to produce the completion. The better the source text, the better the completion. In other words, you need to make sure you progressively fill the context window with stuff that matters for the task that you’re doing.
In particular, first explore the problem space with the tool (iterate), then use the exploration results to plan what needs doing (iterate), when the plan looks good and makes sense, only then you ask to actually implement.
Claude’s built-in planning mode kind of does this, but in my opinion it sucks. It doesn’t make iterating on the exploration and the plan easy or natural. So I suggest just setting up some custom prompts (skills) for this with instructions that make sense for the particular domain/use case, and use those in the normal mode.
I've been playing with it for almost two years now, and this is what gets me there. ChatGPT never got even close to it.
Two questions:
1. How are you using Claude? Are you using https://claude.ai and copying and pasting things back and forth, or are you running one of the variants of Claude Code? If so, which one?
2. If you're running Claude Code have you put anything in place to ensure it can test the code it's writing, including accessing screenshots of what's going on?
It will ask you questions, break the project down into smaller tasks, and work through them one by one with UAT checkpoints along the way.
It also handles managing your context.
Think about AI the same way you'd think about trading courses: would you buy a course that promises 10,000% returns? If such returns were possible, the course seller would just trade instead of selling courses.
Same logic here - if "vibe-coding" really worked at scale, Claude would be selling software, not tokens.
Good luck!
Read through Anthropic's knowledge share, check out their system prompts extracted on GitHub, and write more words in AGENTS/CLAUDE.md; you need to give them some warm-up to do better at tasks.
What model are you using? Size matters, and Gemini is far better at UI design work. At the same time, pairing gemini-3-flash with claude-code-derived prompts makes it nearly as good as Pro.
Words matter; the way you phrase something can have a disproportionate effect. These models are fragile at times, yet surprisingly resilient at others. They will deeply frustrate you and amaze you on a daily basis. The key is to get better at recognizing this earlier and adjusting.
You can find many more anecdotes and recommendations by looking through HN stories and social media (Bluesky has a growing AI crowd coming over from X, with a good community bump recently; there are anti-AI labelers/block lists to keep the flak down).
https://medium.com/@josh.beck2006/i-vibe-coded-a-cryptocurre...
There are 3 major steps:
(Plan mode)
1. Assuming this is an existing codebase, load the relevant docs/existing code into context (usually by typing @<PATH>).
2. Ask it to make a plan for the feature you want to implement. Assuming you’ve already put some thought into this, be as specific and detailed as you can. Ask it to build a plan that’s divided into individually verifiable steps. Read the plan file that it spits out, correct any bad assumptions it made, ask it questions if you’re unclear on what it’s saying, refine, etc.
(Agent mode) 3. Ask it to build the plan, one step at a time. After it builds each step, verify that it’s correct, or have it help you verify it’s correct in a way you can observe.
I have been following this basic process mostly with Opus 4.5, in a mixture of Claude Code and Cursor, working on a pretty niche image processing pipeline (also some advanced networking stuff on the side), and have hand-written basically zero code.
People say “your method sounds like a lot of work too”, and that’s true, it is still work. But designing at a high level how I want some CUDA kernel to work and how it fits into the wider codebase, and then describing it in a few sentences, is still much faster than doing all of the above anyway and then hand-writing 100 lines of CUDA (which I don’t know that well).
I’d conservatively estimate that I’ve made 2x the progress in the same amount of time as if I had been doing this without LLM tools.
* have Claude produce wireframes of the screens you want. Iterate on those and save them as images.
* then develop. Make sure Claude has the ability to run the app, interact with controls, and take screenshots.
* loop autonomously until the app looks like the wireframes.
Feedback loops are required. Only very simple problems get one-shot.
In my limited experience, what I've seen work is that you need to provide a lot of constraints in the form of:
- Scope: Don't build a website, but build a feature (either user-facing or infra, it doesn't matter). I've found that chunking my prompts into human-manageable tasks that would take 0.5-1 day is enough of a scale-down.
- Docs: .md files that describe how the main parts of the application work, what a component/module/unit of code looks like, and what tools and technologies to use (with links to the latest documentation and quickstart pages). You should commit these to code and update them with every code change (which with Claude is just a reminder in each prompt).
- Existing code, if it's not a greenfield project.
It really moves away from the advertised paradigm of one-shot vibe-coding, but since the quality of the output is really good these days, this long preparation will give you a production-ready output much sooner than with traditional methods.
This reminds me of someone who dropped into #java on Undernet once upon a time in the 90s. "I can't get it to work", and we kept trying to debug, and for some reason we kept hitting random new walls. It just never would work! It turned out that they were deleting their .java file and starting over each time. Don't do that.
---
Take it as a sequence of exercises.
Maybe start like this:
Don't use Claude Code at all to begin with. It's a pair programming exercise, and you start at the keyboard, where you're confident and in control. Have Claude open in the web interface alongside, talk through the design with it while working, and ask it to google stuff for you, look up the API, and maybe ask if it remembers the best way(s) to approach the problem. Once you trust it a bit, maybe ask for code snippets or even entire functions. They can't be 100% correct because it doesn't have context... you might need to paste in some code to begin with. When there are errors, paste them in; maybe you'll get advice.
If you're comfy? Switch seats and start using Claude Code. Now you're telling Claude what to do. And you can still ask the same questions you were asking before. But now you don't need to paste into the web interface anymore, and the AI sure as heck can type faster than you can.
Aren't you getting tired of every iteration where you're telling the AI "this went wrong", "that went wrong"? Maybe make sure there's a way for the AI to test stuff itself, so it can iterate a few cycles automatically. Your LLM can iterate through troubleshooting steps faster than you can type the first one. Still... keep an eye on it.
And, really that's about where I am now.
My pattern-matching brain says this is normal for hype. It's a good product, but nowhere near the level you read about in some places (like HN in this case).
If you're building a web app, give it a script that (re)starts the full stack, along with Playwright MCP or Chrome DevTools MCP or agent-browser CLI or something similar. Then add instructions to CLAUDE.md on how and when to use these tools. As in: "IMPORTANT: You must always validate your change end-to-end using Playwright MCP, with screenshot evidence, before reporting back to me that you are finished.".
You can take this further with hooks to more forcefully enforce this behavior, but it's usually not necessary ime.
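As a rough illustration of the "(re)start the full stack" script, here is a minimal Python sketch; the server commands, ports, and process names are assumptions about a hypothetical project, not something from the comment above:

    import subprocess

    def start(cmd):
        # launch a long-running dev server in the background
        return subprocess.Popen(cmd, shell=True)

    if __name__ == "__main__":
        # stop any dev servers left over from a previous run (process names are hypothetical)
        subprocess.run("pkill -f 'uvicorn|vite' || true", shell=True)
        procs = [
            start("uvicorn app.main:app --reload --port 8000"),  # backend
            start("npm run dev -- --port 5173"),                  # frontend
        ]
        print("stack restarted; Ctrl+C to stop")
        try:
            for p in procs:
                p.wait()
        except KeyboardInterrupt:
            for p in procs:
                p.terminate()

The point is that the agent has one known command to get the app into a fresh, running state before it drives the UI and takes screenshots.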
The idea is, you want to build up the right context before starting development. I will either describe exactly what I want to build, or I ask the agent for guidance on different approaches. Sometimes I’ll even do this in a separate Claude (not Claude Code) conversation, which I feel works a bit faster. Once we have an approach, I will ask it to create an implementation plan in a markdown file, I clear context and then tell it to implement the plan.
Check out the “brainstorming” skill and the “git worktrees” skill. They will usually trigger the planning -> implementation workflow when the work is complex enough.
To answer the question: I would highlight the wrong regions in neon green manually via code. Now feed the code (zipped if necessary) to the AI along with a screenshot. Now give it relatable references for the code and say "xxxx css class/gtk code/whatever is highlighted in the screenshot in neon. I expect it to be larger but it's not, why?"
You still need knowledge of what you are building so you can drive it, guide it, fix things.
This is the core of the question about LLM assisted programming - what happens when non programmers use it?
I wanted to tear my ears out.
What is crystal clear to me now is that using LLMs to develop is a learned and practiced skill. If you expect to just drop in and be productive on day one, forget it. The smartest guy I know, _who has a PhD in AI_, is hopeless at using it.
Practice practice practice. It's a tool, it takes practice. Learn on hobby projects before using it at work.
CC was slow and the results I was getting were subpar when having it debug some easy systems tasks. Later in the afternoon it recovered and was able to complete all my tasks. There’s another aspect to these coding agents: the providers can randomly quantize (lobotomize) models based on their capacity, so the model you’re getting may not be the one someone else is getting, or the same model you used yesterday.
Then, repeatedly ask Claude to criticize the plan and use the "AskUserQuestion" tool to ask for your input.
Keep criticizing and updating the plan until your gut says Claude is just trying to come up with things that aren't actually issues anymore.
Then unleash it (allow edits) and see where you get. From there you may ask for one-off small edits, or go back into plan mode again.
For example, if you tell it to compile and run tests, you should never be in a situation with syntax errors.
But if you don’t give a prompt that allows it to validate the result, then it’s going to get you whatever.
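For example, you can point it at one check script to run after every change. A hypothetical Python sketch (the source directory and test runner are placeholders):

    import subprocess
    import sys

    CHECKS = [
        ["python", "-m", "compileall", "-q", "src"],  # catches syntax errors
        ["python", "-m", "pytest", "-q"],             # runs the test suite
    ]

    for cmd in CHECKS:
        if subprocess.run(cmd).returncode != 0:
            sys.exit(1)  # non-zero exit tells the agent the change is not done yet

    print("all checks passed")

If the prompt (or CLAUDE.md) says to run this and only report back on a clean exit, whole classes of "it compiles in its head but not in reality" answers disappear.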
1. Good for proofs of concept and prototypes, but nothing that really goes to heavy production usage.
2. Can do some debugging and fixing that usually requires looking at the stack, reading the docs, and checking the tree.
3. Code is spaghetti all the way down. One might say that's OK because it's fast to generate, but the bigger the application, the more expensive every change gets, and it always forgets to do something.
4. The tests it generates are mostly useless. 9 times out of 10 it passes all the tests it creates for itself, but the code does not even start. No matter what type of test.
5. It frequently lied about the current state of the code, and only when pushed would it admit it was wrong.
As others said, it is a mix of the (misnamed) Dunning-Kruger effect and some hype.
I tried possibly every single trick to get it working better but I feel most are just tricks. They are not necessarily making it work better.
It is not completely useless; my work involves doing prototypes now and then, and usually they need to be quite extensive. For that it has been a help. But I don't feel it is close to what they sell.
It is much better than other models I have tried. Didn't think the post would blow up so much, tbh.
Also, I suggest giving it low-level instructions. It's half-decent for low-level stuff, especially if it has access to preexisting code. Also note that it does exactly what you tell it to do, like a genie. I've asked it to write a function that already exists in the codebase and it wrote a massive chunk of code. It wasn't until after it was done that I remembered we already had a solution to that problem. Anyhow, the hype is unreal, so tailor expectations accordingly.
A syntax error is nothing; I just paste the error into the TUI and it usually fixes it.
Starting from the basics: did you actually tell it that you want those things? It's not a mind reader. Did you use plan mode? Did you ask it to describe what it's going to make?
If you treat it as an astonishingly sophisticated and extremely powerful autocomplete (which it is) - you have plenty of opportunities to make your life better.
and then try again.
There used to be more or less one answer to the question of "how do I implement this UI feature in this language?"
Now there are countless. Welcome to the brave new world of non-deterministic programming where the inputs can produce anything and nothing is for certain.
Everyone promises it can do something different if you "just use it this way".
First prompt: ask it to come up with a plan, break it down into steps, and save it to a file.
Edit file as needed.
Launch CC again and use the plan file to implement stage by stage; verify and correct. No technical debugging needed. Just saying "X is supposed to be like this, but it's actually like that" goes a long way.
for now, anyway.