For example: I've been a software architect and developer for many years, so I already know how to build software, but I'm not familiar with every language or framework. AI has enabled me to write kinds of software I never learned or never had time for. E.g. I recently re-implemented an Android widget that hadn't been updated by its original author in a decade. Or I fixed a bug in a Linux scanner driver. None of these could I have done properly (within an acceptable time frame) without AI. But none of them could I have done properly without my knowledge and experience either, even with AI.
Same for daily tasks at work. AI makes me faster here, but it also lets me do more. Implement tests for all edge cases? Sure, always; I've already saved the time elsewhere. More code reviews. More documentation. Better quality in the same (always limited) time.
Where AI fails us is when we build new software to improve our business, which is solar energy production and sales. It fails us because the tasks are never really well defined. Or even when they are, developers or engineers sometimes come up with a better way to run the business process than what was planned. AI will happily write the code; it won't push back and ask why it wouldn't be a better idea to do X first. If we only did code reviews, we would miss that step.
In a perfect organisation your BPM people would do this. In the world I live in there are virtually no BPM people, and those who know the processes are too busy to really deal with improving them. Hell... sometimes their processes are changed and they don't realize until their results are measurably better than they used to be. So I think it depends a lot on the situation. If you've got people breaking processes down, improving them, and then describing each little bit in decent detail, then I think AI will work fine; otherwise it's probably not the best place to go full vibe.
This is the kind of argument that seems true on the surface, but isn't really. An LLM will do what you ask it to do! If you tell it to ask questions, poke holes in your requirements, and not jump straight to code, it will do exactly that, and usually better than a human.
If you then ask it to refactor some code, identify redundancies, or put this or that functionality into a reusable library, it will also do that.
Those critiques of coding assistants are really critiques of "pure vibe coders" who don't know anything and just try to output yet another useless PDF parsing library before they move on to other things.
I wish we'd stop redefining this term. Technical debt is a shortcut agreed upon with the business to get something out now and fix it later, knowing the fix will cost more than doing it right would have. It is entirely in line with business intent.
> When asked what would help most, two themes dominated
> Reducing ambiguity upstream so engineers aren’t blocked...
I do wonder how much LLMs would help here; this seems, to me at least, to be a uniquely human problem. Humans (managers, leads, owners, what have you) are the ones who interpret requirements, decide deadlines, features, and scope cuts, and are the ones liable for it.
What could an LLM do to reduce ambiguity upstream? If it was trained with information on the requirements, that same information could just be documented somewhere for engineers to refer to. And if it were to hallucinate or "guess" an answer without talking to a person for clarification, and that answer turned out to be wrong, who would be responsible for it? imo, the bureaucracy and the waiting for clarification mid-implementation are a necessary evil. Clever engineers, through experience, might try to implement things in an open way that can easily absorb the future changes they predict might happen (roughly the idea in the sketch below).
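Purely as an illustration of that last point (the Notifier interface and the class names here are hypothetical, not something from the article), "implementing in an open way" usually just means putting the part you expect to change behind a small abstraction:

```python
from typing import Protocol


class Notifier(Protocol):
    """The part most likely to change is hidden behind a small interface."""

    def send(self, recipient: str, message: str) -> None: ...


class EmailNotifier:
    def send(self, recipient: str, message: str) -> None:
        print(f"emailing {recipient}: {message}")


def notify_on_completion(notifier: Notifier, recipient: str) -> None:
    # The business logic depends only on the interface, so a later change of
    # requirements ("actually, send an SMS") only needs a new Notifier class.
    notifier.send(recipient, "your job finished")


if __name__ == "__main__":
    notify_on_completion(EmailNotifier(), "dev@example.com")
```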
As for the second point,
> A clearer picture of affected services and edge cases
> three categories stood out: state machine gaps (unhandled states caused by user interaction sequences), data flow gaps, and downstream service impacts.
I'd agree. Perhaps when a system is complex enough and a developer is laser-focused on a single component of it, it is easy to miss gaps when other parts of the system are used in conjunction with it. I remember a while ago it used to be a popular take that LLMs were a useful tool for generating unit tests, both because of tests' usually repetitive nature and because LLMs were usually good at finding edge cases to test, some of which a developer might have missed; the sketch below gives a flavour of what I mean.
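Purely as an illustration (the parse_quantity function, its behaviour, and the test cases are hypothetical, not from the article or anyone's comment), this is roughly the kind of repetitive, edge-case-heavy test suite people liked to hand off to an LLM:

```python
import pytest


def parse_quantity(raw: str) -> int:
    """Parse a user-supplied quantity string, rejecting anything that isn't a positive integer."""
    value = int(raw.strip())  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError("quantity must be positive")
    return value


# Happy-path cases: the repetitive part that's tedious to write by hand.
@pytest.mark.parametrize("raw, expected", [("1", 1), (" 42 ", 42), ("007", 7)])
def test_parse_quantity_valid(raw, expected):
    assert parse_quantity(raw) == expected


# Edge cases an LLM will often enumerate: zero, negatives, empty, junk, floats.
@pytest.mark.parametrize("raw", ["0", "-3", "", "abc", "1.5"])
def test_parse_quantity_invalid(raw):
    with pytest.raises(ValueError):
        parse_quantity(raw)
```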
---
I will say, it is refreshing to see a take on coding assistants being used for other aspects of the job instead of just writing code, which, as the article pointed out, came with its own set of problems (increased inefficiencies in other parts of the development lifecycle, potential AI-introduced security vulnerabilities, etc.).
And having a person who keeps right up with you makes them feel very intelligent, because of course they are; they seem scarily as intelligent as you, because they’re right next to you, maybe even a little ahead! (I think Travis Kalanick was experiencing this when he was talking about Vibe Physics.)
But the thing is, it was ultimately an extension of your ideas; without your prompts, the ideas don’t exist. It’s very Library of Babel-esque.
And so I wonder if coding assistants have this general problem. If you’re a good developer following good practices, prompting informatively, it’s right next to you.
If you’re not so good, and tend not to express yourself clearly or to develop simple solutions, it’s right there with you too.
Anecdotally I see this _all the time_...
Everything else is just hype and people “holding it wrong”.
LLMs have been hollowing out the mid and lower end of engineering, but they haven’t eroded the highest end. Otherwise the LLM companies wouldn’t pay for talent; they’d just use their own LLMs.
Are they something worth using up vast amounts of power and restructuring all of civilisation around? No
Are they worth giving more power to megacorps over? No
It’s like tech doesn’t understand consent, and then it’s partly the classic case of “disrupting X”: thinking that because you know how to solve something in maths, CS, or physics, you can suddenly solve problems in a completely different field.
LLMs are over-indexed.
Most coding assistant tools are flexible enough to apply these kinds of workflows, and these sorts of workflows even come up in Anthropic's own examples of how to use Claude Code. Any experienced dev knows that the act of actually writing code is a small part of creating a working program.
Either you (a) don't review the code, (b) invest more resources in review, or (c) hope that AI assistance in the review process increases efficiency there enough to keep up with code production.
But if none of those work, all AI assistance does is bottleneck the process at review.
The datasets are big, and having the processing scripts written in a performant language saves non-trivial amounts of time: waiting just 10 minutes versus an hour.
The initial code style in the scripts was rather ugly, with a lot of repeated code. But with enough prompting to reuse code, the generated output became sufficiently readable and reasonable that I could quickly check it was indeed doing what was required, and it could be manually altered.
But prompting it to make non-trivial changes to an existing code base was a time sink. It took too much time to explain things and correct the output. And critically, the prompts can't be reused.
One paper is sure doing a lot of leg work these days...
Errrrr…. false.
I’ll stop reading right there, thanks. I think I know what’s coming.
Really? 2024? That was forever ago in LLM coding. Before tool calling, reasoning, and larger context windows.
It is like saying YouTube couldn’t exist because too many people were still on dial-up.
tl;dr content marketing
There is this super interesting post in /new about agent swarms and how the field is evolving towards formal verification, the way aviation did, or at least how there are ideas we can draw on from it. Anyway, imo it should be on the front page over this piece:
"Why AI Swarms Cannot Build Architecture"
An analysis of the structural limitations preventing AI agent swarms from producing coherent software architecture