I recently used a coding agent on a project where I was using an unfamiliar language, framework, API, and protocol. It was a non-trivial project, and I had to be paying attention to what the agent was doing because it definitely would go off into the weeds fairly often. But not having to spend hours here and there getting up to speed on some mundane but unfamiliar aspect of the implementation really made everything about the experience better.
I even explored some aspects of LLM performance: I could tell that new and fast-changing APIs easily flummox a coding agent, confirming the strong relationship between up-to-date, accurate training material and LLM performance. I've also seen this aspect of agent-assisted coding improve and vary across AIs.
vs. what this author is doing, which seems more like agent-assisted coding than "vibe" coding.
With regard to the subject matter, it of course makes sense that managing more features than you used to be able to manage without $AI_MODEL would result in some mental fatigue. I also believe this gets worse the older you get. I've seen this within my own career, just from times of being understaffed and overworked, AI or not.
I've found that if an LLM writes too much code, even if I specified what it should be doing, I still have to do a lot of validation myself that would have been done while writing the code by hand. This turns the process from "generative" (haha) to "processing", which I struggle a lot more with.
Unfortunately, the reason I have to do so much processing on vibe code or large generated chunks of code is simply because it doesn't work. There is almost always an issue that is either immediately obvious, like the code not working, or becomes obvious later, like poorly structured code that the LLM then jams into future code generation, creating a house of cards that easily falls apart.
Many people will tell me that I'm not using the right model or tools or whatever, but it's clear to me that the problem is that AI doesn't have any vision of where your code will need to organically head. It's great for one-shots and rewrites, but it always always always chokes eventually on larger or more complicated projects, ESPECIALLY ones that aren't written in common languages (like JavaScript) or built on common packages/patterns. Then I have to go spelunking to find why things aren't working or why it can't generate code to do something I know is possible. It's almost always because the input for new code is my ask AND the poorly structured code, so the LLM will rarely clean up its own crap as it goes. If anything, it keeps writing shoddy wrappers around shoddy wrappers.
Anyways, still helpful for writing boilerplate and segments of code, but I like to know what is happening and have control over how my code is structured. I can't trust the LLMs right now.
I take breaks.
But I also get drawn to overworking (as I'm doing right now), which I justify because "I'm just keeping an eye on the agent".
It's hard work.
It's hard to explain what's hard about it.
Watching as a machine does in an hour what would take me a week.
But also watching to stop the machine spinning around doing nothing for ages because it's got itself in a mess.
Watching for when it gets lazy, and starts writing injectable SQL.
Watching for when it gets lazy, and tries to pull in packages it had no right to.
We've built a motor that can generate 1,000 horsepower.
But one man could steer a horse.
The motor right now doesn't have the appropriate steering apparatus.
I feel like I'm chasing it around trying to keep it pointed forward.
It's still astronomically productive.
To abandon it would be a waste.
But it's so tiring.
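For anyone unsure what "injectable SQL" looks like in practice, here is a minimal sketch using Python's standard `sqlite3` module. The `users` table and the function names are made up for illustration; the point is only the difference between splicing input into the query text and passing it as a parameter.

```python
import sqlite3

def make_demo_db():
    # Hypothetical in-memory table, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn

def find_user_unsafe(conn, name):
    # Injectable: the input is spliced into the SQL text itself,
    # so a crafted value can change the meaning of the query.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()

def find_user_safe(conn, name):
    # Parameterized: the driver passes `name` as data, never as SQL.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()
```

With an input like `x' OR '1'='1`, the unsafe version matches every row, while the parameterized version treats the whole string as a (non-existent) name. This is the kind of thing I end up watching for.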
This statement resonates with me. Vibe coding gets the job done quickly, but without the same joy. I used to think that it was the finished product that I liked to create, but maybe it's the creative process of building. It's like LEGO kits: the fun is putting them together, not looking at the finished model.
On the flip side, coding sessions where I bang my head against the wall trying to figure out some black box were never enjoyable. Nor was writing POCOs, boilerplate, etc.
They ask a business question to the AI and it generates a bunch of code.
But honestly, coding isn't the part that slowed me down. Mapping the business requirements to code that doesn't fail is the hard part.
And the generated PRs are just answers to the narrow business questions. Now I need to spend time walking it all back and trying to figure out what the actual business question is, and the overall impact. In my experience, I get very few answers to those questions.
And this is where Software Engineering experience becomes important. It's asking the right questions. Not just writing code.
On top of that, I'm seeing developers drinking the Kool-Aid and submitting PRs where a whole bunch of changes are made, but they don't know why. Well, those changes DO have impact. Keeping a change because the AI suggested it isn't the right answer. Keeping it because you agree with the AI's reasoning isn't the right answer either.
It's now 11:47am and I am mentally exhausted. I feel like my dog after she spends an hour at her sniff-training class (it wipes her out for the rest of the day.)
I've felt like that on days without the meetings too. Keeping up with AI tools requires a great deal of mental effort.
To people with little to no practical software experience, I can see why that seems incredible. Think of the savings! But anyone who's worked in a legacy code base, even a well-written one, should know the pain. This is worse. That legacy code base was at least written with intention, and is hopefully battle-tested to some degree by the time you look at it. This is 20k lines of code written by an intern that you are now responsible for going through line by line, which is going to take at least as long as it would have taken to write yourself.
There are obvious wins from AI, and agents, but this type of development is a bad idea. Iteration loops need to be kept much smaller, and you should still be testing as you go like you would when writing everything yourself. Otherwise it's going to turn into an absolute nightmare fast.
Usually that requires saying something, seeing if the other person understands what I'm saying, and occasionally repeating myself in a different way.
It can be real tiring when I'm with friends who only speak the other language, so we're both using translator tools and basically repeating that loop for up to 2-3 hours.
I've found the same situation with vibe coding, especially when the model misunderstands what I want or starts going off on a tangent. Sometimes it's easier to edit the original query or an earlier step in the flow and rewrite it for a better result.
Choosing, analysing, verifying, talking to others rather than thinking in a sequential and organised way.
One of the reasons I moved away from that position was this constant fatigue by the end of the day. I was happy to see things moving forward, but it felt like it took more energy to do less.
However, when it comes to my professional work on a mature, advanced project, I find it much easier to write the code myself than to provide a very precise specification without which the LLM wouldn't generate code of a sufficiently high quality.
This is fine for WFH/remote work. It didn't have great optics when I went back to in-office for a bit.
If you're able to blaze through feature tickets using GenAI on existing projects of any major complexity, you're almost certainly skipping something that would produce better code.
I have plenty of info and agents for Claude Code to take into account when I use it to make features in our projects, but what it can't know is the cadence of what our business partners expect, the unknown unknowns, conversations that humans have about the projects, and the way end-users feel about the project. My job is to direct it with those factors in mind, and the work to account for those factors takes time.
Maybe the fatigue comes from that mismatch?
The classical vibe coder style is to just ignore verification. That's not a good approach either.
I think this space has not matured yet. We have old tools (tests, linters) and some unreliable tools (agent-assisted reviews), but nothing to match the speed of generation yet.
I do it by creating ad-hoc deterministic verifiers. Sometimes they'll last just a couple of PRs. It's cheap to do them now. But also, there must be a better way.
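To make "ad-hoc deterministic verifier" a bit more concrete, here is a minimal sketch of the kind of throwaway check I mean, assuming the generated code is Python. It uses the standard `ast` module to flag imports outside an allowlist. The allowlist contents and the function name are hypothetical; a real one would be tailored to whatever the current PR is supposed to touch.

```python
import ast

# Hypothetical allowlist for one PR; adjust or discard as needed.
ALLOWED_IMPORTS = {"json", "sqlite3", "pathlib"}

def disallowed_imports(source: str, allowed=ALLOWED_IMPORTS) -> list[str]:
    """Return top-level packages imported by `source` that are not allowed."""
    tree = ast.parse(source)
    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # `import a.b.c` counts as package `a`.
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute `from a.b import c`; relative imports are skipped.
            found.add(node.module.split(".")[0])
    return sorted(found - set(allowed))
```

Run over each changed file in CI or a pre-commit hook, this fails the same way every time, which is the whole appeal compared with asking another agent to review. It only catches one narrow class of problem, though, so it complements tests and linters rather than replacing them.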