What are you building? Does the tool help or hurt?
People answered this wrong in the Ruby era, they answered it wrong in the PHP era, they answered it wrong in the Lotus Notes and Visual BASIC era.
After five or six cycles it does become a bit fatiguing. Use the tool sanely. If budgets allow, work at a pace where your understanding of what you are building does not exceed the reality of the mess you and your team are actually building.
This seldom happens, even in solo hobby projects once you cost everything in.
It's not about agile or waterfall or "functional" or abstracting your dependencies via Podman or Docker or VMware or whatever that nix crap is. Or using an agent to catch the bugs in the agent that's talking to an LLM you have next to no control over that's deleting your production database while you sleep, then asking it to make illustrations for the postmortem blog post you ask it to write that you think elevates your status in the community but probably doesn't.
I'm not even sure building software is an engineering discipline at this point. Maybe it never was.
Once the codebase has become fully agentic, i.e., only agents fundamentally understand it and can modify it, the prices will start rising. After all, these loss-making AI companies will eventually need to recoup their investments.
Sure it will be - perhaps - possible to interchange the underlying AI for the development of the codebase but will they be significantly cheaper? Of course, the invisible hand of the market will solve that problem. Something that OPEC has successfully done for the oil market.
Another issue here: once the codebase is agentic and the price for developers falls far enough that it is significantly cheaper to hire humans again, will they be able to understand the agentic codebase? Is this a one-way transition?
I'm sure the pro-AIs will explain that technology will only get cheaper and better and that fundamentally it ain't an issue. Just like oil prices and the global economy, fundamentally everything is getting better.
A service goes down. He tells the agent to debug it and fix it. The agent pulls some logs from $CLOUDPROVIDER, inspects the logs, produces a fix and then automatically updates a shared document with the postmortem.
This got me thinking that it's very hard to internalize both issue and solution (that is, to update your model of the system involved) because there is not enough friction for you to spend time dealing with the problem (coming up with hypotheses, modifying the code, writing the doc). I thought about my very human limitation of having to write things down on paper so that I can better recall them.
Then I recalled something I read years ago: "Cars have brakes so they can go fast."
Even assuming it is now feasible to produce thousands of lines of quality code, there is a limitation on how much a human can absorb and internalize about the changes introduced to a system. This is why we will need brakes -- so we can go faster.
One thing about the old days of DOS and original MacOS: you couldn't get away with nearly as much of this. The whole computer would crash hard and need to be rebooted, all unsaved work lost. You also could not easily push out an update or patch --- stuff had to work out of the box.
Modern OSes with virtual memory and multitasking and user isolation are a lot more tolerant of shit code, so we are getting more of it.
Not that I want to go back to DOS, but WordPerfect 5.1 was pretty damn rock solid as I recall.
The other, arguably far more important output, is the programmer.
The mental model that you, the programmer, build by writing the program.
And -- here's the million dollar question -- can we get away with removing our hands from the equation? You may know that knowledge lives deeper than "thought-level" -- much of it lives in muscle memory. You can't glance at a paragraph of a textbook, say "yeah that makes sense" and expect to do well on the exam. You need to be able to produce it.
(Many of you will remember the experience of having forgotten a phone number, i.e. not being able to speak or write it, but finding that you are able to punch it into the dialpad, because the muscle memory was still there!)
The recent trend is to increase the output called programs, but decrease the output called programmers. That doesn't exactly bode well.
See also: Preventing the Collapse of Civilization / Jonathan Blow (Thekla, Inc)
As somebody who has been running systems like these for two decades: the software has not changed. What's changed is that before, nobody trusted anything, so a human had to manually do everything. That slowed down the process, which made flaws happen less frequently. But it was all still crap. Just very slow moving crap, with more manual testing and visual validation. Still plenty of failures, but it doesn't feel like it fails a lot if they're spaced far apart on the status page. The "uptime" is time-driven, not bugs-per-lines-of-code driven.
DevOps' purpose is to teach you that you can move quickly without breaking stuff, but it requires a particular way of working that emphasizes building trust. You can't just ship random stuff 100x faster and assume it will work. This is what the "move fast and break stuff" people learned the hard way years ago.
And breaking stuff isn't inherently bad - if you learn from your mistakes and make the system better afterward. The problem is, that's extra work that people don't want to do. If you don't have an adult in the room forcing people to improve, you get the disasters of the past month. An example: Google SREs give teams error budgets; the SREs are acting as the adult in the room, forcing the team to stop shipping and fix their quality issues.
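For anyone who hasn't seen an error budget worked out, here is a rough sketch of the arithmetic. The function names, the 30-day window, and the simple "stop shipping once the budget is spent" rule are my own assumptions for illustration, not how any particular SRE team implements it.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Downtime (in minutes) an availability SLO allows over the window."""
    total_minutes = window_days * 24 * 60
    return (1.0 - slo) * total_minutes


def can_ship(slo: float, downtime_minutes: float, window_days: int = 30) -> bool:
    """Hypothetical gate: feature work continues only while budget remains."""
    return downtime_minutes < error_budget_minutes(slo, window_days)


if __name__ == "__main__":
    # A 99.9% SLO over 30 days leaves roughly 43 minutes of allowed downtime.
    print(round(error_budget_minutes(0.999), 1))
    print(can_ship(0.999, downtime_minutes=10))  # budget left: keep shipping
    print(can_ship(0.999, downtime_minutes=50))  # budget spent: fix quality first
```

The point is that the budget is an agreed number, so "stop shipping" is a rule everyone signed up for rather than an argument.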
One way to deal with this in DevOps/Lean/TPS is the Andon cord. Famously a cord introduced at Toyota that allows any assembly worker to stop the production line until a problem is identified and a fix worked on (not just the immediate defect, but the root cause). This is insane to most business people because nobody wants to stop everything to fix one problem, they want to quickly patch it up and keep working, or ignore it and fix it later. But as Ford/GM found out, that just leads to a mountain of backlogged problems that makes everything worse. Toyota discovered that if you take the long, painful time to fix it immediately, that has the opposite effect, creating more and more efficiency, better quality, fewer defects, and faster shipping. The difference is cultural.
This is real DevOps. If you want your AI work to be both high quality and fast, I recommend following its suggestions. Keep in mind, none of this is a technical issue; it's a business process issue.
Product design has a slightly different problem than engineering: because the speed of development is so high, we cannot dogfood and play with new product decisions and features. By the time I’ve realized we made a stupid design choice and it doesn’t really work in the real world, we’ve already built 4 features on top of it. Everyone makes bad product decisions, but it used to be easy and natural to back out of them.
It’s all about how we utilize these things; if we focus on sheer speed, it just doesn’t work. You need to own architecture and product decisions. You need to use and test your products with humans (and automate those tests as regression testing). You need to be able to hold all of the product or architecture in your mind and help agents make the right decisions with all the best practices you’ve learned.
I use Aider on my private computers and Copilot at work. Both feel equally powerful when configured with a decent frontier model. Are they really generations apart? What am I missing?
If there are any common apps which are unhinged please do share your experiences. LinkedIn was never great quality but it's off the charts. Also catching some on Spotify.
But in many agent-skeptical pieces, I keep seeing this specific sentiment that “agent-written code is not production-ready,” and that just feels… wrong!
It’s just completely insane to me to look at the output of Claude Code or Codex with frontier models and say “no, nothing that comes out of this can go straight to prod — I need to review every line.”
Yes, there are still issues, and yes, keeping mental context of your codebase’s architecture is critical, but I’m sorry, it just feels borderline archaic to pretend we’re gonna live in a world where these agents have to have a human poring over every single line they commit.
Did I miss something? I haven't used it in a minute, but why is the author claiming that it's "uninstallable malware"?
I think a lot of this is just TypeScript developers. I bet if you removed them from the equation, most of the problems he's writing about would go away. TypeScript developers didn't even understand what React was doing without agents; now they are just one-shot prompting features, web apps, CLIs, desktop apps and spitting them out to the world.
The prime example of this is literally Anthropic. They are pumping out features, apps, CLIs, and EVERY single one of them is released broken.
Reminds me of Carson Gross' very thoughtful post on AI also: https://htmx.org/essays/yes-and/
[Y]ou are going to fall into The Sorcerer’s Apprentice Trap, creating systems you don’t understand and can’t control.
This is a great point.
I have been avoiding LLMs for a while now, but realized that I might want to try working on a small PDF book to Markdown conversion project[0]. I like Claude Code because it's command line. I'm realizing you really need to architect with good, very precise language to avoid mistakes.
I didn't try to have a prompt do everything at once. I prompted Claude Code to do the conversion process section by section of the document. That seemed to reduce the mistakes the agent would make.
[0]: https://www.scottrlarson.com/publications/publication-my-fir...
https://gist.github.com/ontouchstart/d43591213e0d3087369298f...
(Note: pi was written by the author of the post.)
Now it is time to read them carefully without AI.
My gut says something simple is missing that makes all of the difference.
One thought I had was that our problem lives between all the things taking something in and spitting something out. Perhaps 90% of the work of writing a "function" should be to formally register it as taking in data type foo 1.54.32 and bar 4.5.2, then returning baz 42.0. The register will then tell you all the things you can make from baz 42.0 and the other data you have. A comment(?) above the function has a checksum that prevents anyone from changing it.
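To make the register idea a bit more concrete, here is a toy sketch. Everything in it is made up for illustration: the `register` decorator, the `reachable` query, the `qux` type, and the choice to checksum the compiled bytecode instead of a comment above the function. The checksum also only detects a change, it does not prevent one.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List, Set, Tuple

# A data type is identified by (name, version), e.g. ("foo", "1.54.32").
TypeId = Tuple[str, str]


@dataclass(frozen=True)
class Entry:
    fn: Callable
    inputs: FrozenSet[TypeId]
    output: TypeId
    checksum: str


REGISTRY: List[Entry] = []


def bytecode_checksum(fn: Callable) -> str:
    """Crude fingerprint of a function: hash of its bytecode and constants."""
    code = fn.__code__
    return hashlib.sha256(code.co_code + repr(code.co_consts).encode()).hexdigest()


def register(inputs: Iterable[TypeId], output: TypeId):
    """Record what a function consumes and produces, plus its checksum."""
    def decorator(fn: Callable) -> Callable:
        REGISTRY.append(Entry(fn, frozenset(inputs), output, bytecode_checksum(fn)))
        return fn
    return decorator


def is_unchanged(entry: Entry) -> bool:
    """True if the function still matches the checksum taken at registration."""
    return bytecode_checksum(entry.fn) == entry.checksum


def reachable(available: Set[TypeId]) -> Set[TypeId]:
    """Everything that can be built, directly or in several steps, from the data you have."""
    known = set(available)
    changed = True
    while changed:
        changed = False
        for entry in REGISTRY:
            if entry.inputs <= known and entry.output not in known:
                known.add(entry.output)
                changed = True
    return known - set(available)


FOO, BAR = ("foo", "1.54.32"), ("bar", "4.5.2")
BAZ, QUX = ("baz", "42.0"), ("qux", "0.1.0")  # qux is a made-up downstream type


@register([FOO, BAR], BAZ)
def make_baz(foo, bar):
    return foo + bar


@register([BAZ], QUX)
def make_qux(baz):
    return baz * 2
```

With this, `reachable({FOO, BAR})` reports both baz 42.0 and the qux built from it, while `reachable({FOO})` reports nothing, since bar is missing.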
But perhaps the solution is something entirely different. Maybe we just need a good set of opcodes and have abstractions represent small groups of instructions that can be combined into larger groups until you have decent higher languages. With the only difference being that one can read what the abstraction actually does. The compiler can figure lots of things out but it won't do architecture.
I think this is a very good take on AI adoption: https://mitchellh.com/writing/my-ai-adoption-journey. I've had tremendous success with roughly following the ideas there.
> The point is: let the agent do the boring stuff, the stuff that won't teach you anything new, or try out different things you'd otherwise not have time for. Then you evaluate what it came up with, take the ideas that are actually reasonable and correct, and finalize the implementation.
That's partially true. I've also had instances where I could have very well done a simple change by myself, but by running it through an agent first I became aware of complexities I wasn't considering and I gained documentation updates for free.
Oh and the best part: if in three months I'm asked to compile a list of things I did, I can just look at my session history, cross-reference it with the development history in my repositories, and paint a very good picture of what I've achieved. I can even reconstruct the decision process behind the design of the solution.
It's always a win to run things through an agent.
I don't agree, but the bigger issue to me is that many/most companies don't even know what they want or think about what the purpose is. So whereas in the past devs coding something provided some throttle or sanity check, now we'd just throw shit over the wall even faster.
I'm seeing some LinkedIn lunatics brag about "my idea to production in an hour" and all I can think is: that is probably a terrible feature. No one I've worked with is that good or visionary where that speed even matters.
Companies will face the maintenance and availability consequences of these tools, but it may take a while for the feedback loop to close.
That may be the case wherever AI leaks in, but not every software developer uses or depends on AI. So not all software has become more brittle.
Personally I try to avoid any contact with software developers using AI. This may not be possible, but I don't want to waste my own time "interacting" with people who aren't really the ones writing code anymore.
Oh they even swore in the title.
Oh and of course it's anti-economics and is probably going to hurt whoever actually follows it.
Three for three. It's not logical, it's emotional.
Integration is the key to the agents. Individual usages don't help AI much because it is confined within the domain of that individual.