When I ran this experiment it was pretty exhilarating for a while. Eventually it turned into QA testing the work of a bad engineer and became exhausting. Since I had sunk so much time into it, I felt pretty bad afterwards: not only was the thing it made not shippable, but I hadn't benefited as a human being while working on it. I had no new skills to show. It was just a big waste of time.
So I think the "second way" is good for demos now. It's good for getting an idea of what something can look like. However, in the future I'll be extremely careful about not letting that go on for more than a day or two.
If AI edges humans out of the business of thinking, then we're all in deep shit, because it doesn't think, it just regurgitates previous human thinking. With no humans thinking, no advances in code will be possible. It will only be possible to write things which are derivatives of prior work.
(cue someone arguing with me that everything humans do is a derivative of prior work)
There is an amusing parallel with his views on vibe coding. Back in the '90s and 2000s I noticed a pattern in the code developed by the huge influx of inexperienced programmers jumping on the dotcom bandwagon. The code could only be maintained by the same people who wrote it. There was no documentation, no intuition, no best practices, and others wouldn't know where to look if there was an issue. The code probably aligned with the programmer's cultural habits and values (what's ok, what's not ok), which others might lack. Ironically, this kind of provided job security for them, as it was difficult for others to deal with that code.
I guess the LLMs are also into this "job-security" trick, by ensuring only LLMs can manage the LLM-generated code.
Anything that involves multiple days of work, or that you plan to keep working on, should absolutely not be vibe coded.
A) you'll have learnt pretty much nothing, or will retain nothing. Writing stuff by hand is a great way to remember. A painful experience is only worth having if you've learnt from it.
B) you'll find yourself distanced from the project, and the lack of personal involvement, of 'being in the trenches', means you'll stop progressing on the software and move back to something that makes you feel something.
Humans are by nature social creatures, but alone they want to feel worthwhile too. Vibe coding takes away the positive reinforcement loop that is necessary for sticking with long-running projects through to completion.
Emotions drive needs, which drive change and results. By vibe coding a significant piece of work, you'll blow away your emotions towards it and that'll be the end of it.
For 'projects' and things where you want to stay involved, you should be in charge, and only use LLMs for deterministic auto-completion, or for research outside of the IDE. Just like managing state in complex software, you need to keep the LLM's input 'boxed in' so it doesn't contaminate your work.
My 5c. Understanding the human's response to interactions with the machines is important in understanding our relationship with LLMs.
- Is the work easier to do? I feel like the work is harder.
- Is the work faster? It sounds like it’s not faster.
- Is the resulting code more reliable? This seems plausible given the extensive testing, but it’s unclear if that testing is actually making the code more reliable than human-written code, or simply ruling out bugs an LLM makes but a human would never make.
I feel like this does not look like a viable path forward. I’m not saying LLMs can’t be used for coding, but I suspect that either they will get better, to the point that this extensive harness is unnecessary, or they will not be commonly used in this way.
So I've been learning Kotlin & Android development in the evenings and i find this style of thing to be so much more effective as a dev practice than Claude Code, and a better learning practice than following dev.to tutorials. I've been coding for almost 20 years and find most tutorial or documentation stuff is either targeted at someone who has hardly programmed at all, or is just plain old API docs.
Asking the langlemangler to generate a dev plan, focusing on idiomatic implementation details and design questions rather than lines of code, and to let me fill in the algorithm implementations, has been nice. I'll use the JetBrains AI autocomplete stuff for little things or ask it to refactor a stinky function, but mostly I just follow the implementation plan so that the shape of the whole system is in my head.
Here's an example:
> i have scaffolded out a new project, an implementation of a library i've written multiple times in the last decade in multiple languages, but with a language i haven't written and with new design requirements specified in the documentation. i want you to write up an implementation plan, an in-depth tutorial for implementing the requirements in a Kotlin Multi Platform library.
>
> i am still learning kotlin but have been programming for 20 years. you don't need to baby me, but don't assume i know best practices and proper idioms for kotlin. make sure to include background context, best practices, idioms, and rationale for the design choices and separation of concerns.
This produced a 3kb markdown file that i've been following while I develop this project.
Can someone recommend any good resources on this? Google wasn't too helpful, or my google-fu is lacking.
It has a clear and specific definition. People just misuse and abuse the term.
Karpathy coined it to describe when you put a prompt into an LLM and then either run it or continue to develop on top of it without ever reviewing the output code.
I am unable to tell from TFA if the author has any knowledge or skills in programming and looked at the code or if they did in fact "vibe code".
That's not vibe coding. That is just AI assisted coding.
But I'm optimistic about the second way. I'm starting to think that TDD is going to be the new way we specify problems, i.e. by writing constraints. LLMs are going to keep hacking at those constraints until they're all satisfied, and periodically the temperature will have to be jiggled to knock the thing out of a loop.
The big back and forth between human and machine would be in the process of writing the constraints, which the LLMs will be bad at if you're doing anything interesting, and good at if you're doing something routine.
The big question for me is "Is there a way to write complete enough tests that any LLM would generate nearly the same piece of software?" And to follow up, can the test suite be the spec? Would that be an improvement on the current situation, or just as much work? Would that mean that all capable platforms would be abstracted? Does this mean the software improves on its own when the LLM improves, or when you switch to a better LLM, without any changes to the tests?
If the future is just writing tests, is there a better way to do it than we currently do? Are tests the highest-level language? Is this all just Prolog?
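To make the "test suite as spec" idea a bit more concrete, here's a rough sketch in Python. Everything in it is made up for illustration: the target (deduplicate a list while keeping first-seen order), the constraint names, and the two candidates, which just stand in for whatever different LLMs might hand back. It's a toy under those assumptions, not a claim about how any real tool works.

```python
from typing import Callable, List, Optional

# Hypothetical target: deduplicate a list, keeping first-seen order.
Candidate = Callable[[List[int]], List[int]]

SAMPLE = [3, 1, 3, 2, 1]

# Each constraint is one line of the "spec": it takes a candidate
# implementation and returns True if the candidate satisfies it.
def keeps_every_distinct_value(f: Candidate) -> bool:
    return set(f(list(SAMPLE))) == set(SAMPLE)

def has_no_duplicates(f: Candidate) -> bool:
    out = f(list(SAMPLE))
    return len(out) == len(set(out))

def preserves_first_seen_order(f: Candidate) -> bool:
    return f(list(SAMPLE)) == [3, 1, 2]

def handles_empty_input(f: Candidate) -> bool:
    return f([]) == []

SPEC = [
    keeps_every_distinct_value,
    has_no_duplicates,
    preserves_first_seen_order,
    handles_empty_input,
]

def failed_constraints(candidate: Candidate) -> List[str]:
    """Names of the constraints the candidate does not satisfy."""
    return [check.__name__ for check in SPEC if not check(candidate)]

def first_passing(candidates: List[Candidate]) -> Optional[Candidate]:
    """Stand-in for the 'keep hacking until satisfied' loop:
    return the first candidate that meets every constraint."""
    for candidate in candidates:
        if not failed_constraints(candidate):
            return candidate
    return None

# Two stand-in "generated" implementations.
def candidate_sorted(items: List[int]) -> List[int]:
    return sorted(set(items))          # dedupes, but loses the original order

def candidate_ordered(items: List[int]) -> List[int]:
    return list(dict.fromkeys(items))  # dedupes and keeps first-seen order

if __name__ == "__main__":
    print(failed_constraints(candidate_sorted))   # ['preserves_first_seen_order']
    print(failed_constraints(candidate_ordered))  # []
```

Take `preserves_first_seen_order` out of `SPEC` and both candidates pass while behaving differently, which is the "complete enough tests" problem in miniature: the spec only pins down what somebody thought to write a constraint for.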
_It limps faster than you can walk_, in simple terms.
At each model release, it limps faster, but still can't walk. That is not a good sign.
> Do we want this?
No. However, there's a deeper question: do people even recognize they don't want this?
What's worked better for me: treating it like onboarding a contractor. Very specific, bounded tasks with clear acceptance criteria. The moment you're spending more time explaining context than it would take to just write the code yourself, that's the signal to switch back.
The overall effect is to use the word “test” as if it were a magical concept that you plaster onto your work to give it unearned prestige.
What the article demonstrates is that vibe coding is a way to generate orders of magnitude of complexity that no one in the world can understand and no one can take real responsibility for, even in principle.
I call it slop-coding, and I am happy to slop-code throwaway tools. I instruct Claude never to “test” anything I ask it to create, because I need to test it myself in order to apply it responsibly and feel close to it. If I want automated output checking (a waste of time with most tools I create), I can slop-code a framework for that, a la carte.
This way it burns fewer tokens of silly shallow testing.
Felt like I became a phd wannabe in 5 minutes
I think the quality of the product depends on the person (or people) responsible for it understanding the details.
What an absolutely abhorrent way of thinking. “I am interested in turning off my brain to create unstable Jenga towers of complexity that I’ll have no ability to fix when they inevitably fail”.
As if software in general isn’t a big enough pile of garbage already. One day, every single one of us will be seriously bitten by bugs created by this irresponsible approach.
Being able to understand what you build is a feature, not a bug.