I’ve worked with plenty of developers who are happy to slam null checks everywhere to solve NREs, with no thought about why the object is null or whether it should even be null there. There’s just a vibe that the null check works and solves the problem at hand.
I actually think a few folks like this can be valuable around the edges of software, but whole systems built like this are a nightmare to work on. IMO AI vibe coding is an accelerant for this style of not knowing why something works, just seeing what you want on the screen.
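To make the null-check point concrete, here is an invented TypeScript sketch (not anyone's real code): the check makes the crash go away, but the question of why the value is null never gets asked.

    interface Order {
      total: number;
    }

    // Band-aid version: the NRE disappears, the actual question doesn't.
    function renderInvoiceTotal(order: Order | null): void {
      if (order === null) {
        // Why is order ever null here? Should it be? Nobody asked.
        return;
      }
      console.log(`Total: ${order.total}`);
    }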
Many folks would say that if shipping faster allows for faster iterations across an idea then the silly errors are worth it. I’ve certainly seen a sharp increase in execs calling BS on dev teams saying they need months to develop some basic thing.
It took me about an hour of debugging to realize that copilot swapped a variable prefixed with "x" and a different one prefixed with "y".
If I had written it myself, I wouldn't have made that mistake. But otherwise, copilot wrote the exact code that I was going to write.
I'd like to say that AI just takes this to an extreme, but I'm not even sure about that. I think it could produce more code and more bugs than I could in the same amount of time, but not significantly more than I could if I just gave up on caring about anything.
To give an example of how to use AI successfully, check the following post:
https://friendlybit.com/python/writing-justhtml-with-coding-...
But assuming there are no bugs and the code ships: has there been any study on resource usage creeping up and the impact of this on a whole system? In the tests I have done trying to build things with AI, it always seems like there is zero attention to efficiency unless you notice it yourself and can point the AI in the right direction.
I have been curious about the impact this will have on general computing as more low quality code makes it into applications we use every day.
For example: it reported missing data validation / sanitization, only because the sanitization / validation had already happened elsewhere in the code and this was not visible in the diff.
You can tell CodeRabbit it is wrong about this, though, and the tool then accepts it.
This does not sound like a sufficiently large dataset.
Is it: AI builds product faster but with more bugs in production? (Adds overall time to acceptable production.)
or
Is it: AI helps us build faster, even though we have to fix more bugs before production? (Overall less time to acceptable production, but fixing bugs specifically takes longer.)
No one cares what a compiler or js minifier names its variables in its output.
Yes, if you don't believe we will ever get there, then this is a totally valid complaint. You are also wrong about the future.
You have to be careful to tell the LLM exactly what to test for and to manually check the whole suite of tests. But overall it makes me feel way more confident over increasing amounts of generated code. This of course decreases the productivity gains, but it is necessary in my opinion.
And linters help.
Coding with agents has forced me to generate more tests than we do in most startups, think through more things than we get the time to do in most startups, create more granular tasks and maintain CI/CD (my pipelines are failing and I need to fix them urgently).
These are all good things.
I have started thinking through my patterns for generating unit tests. I was generating mostly integration or end-to-end tests before. I started using helper functions in API handlers and writing unit tests for the helpers, bypassing the API-level arguments (so there is no API mocking or framework test setup to deal with). I started breaking tasks down into smaller units so I can pass them on to a cheaper model.
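A rough TypeScript sketch of that pattern (all names hypothetical, not taken from the repo linked below): the handler stays a thin wrapper, and the unit test calls the helper directly with plain arguments, so there is no API mocking or framework harness involved.

    // Hypothetical sketch: a helper pulled out of an API handler, unit-tested directly.
    import { strict as assert } from "node:assert";
    import { test } from "node:test";

    interface Post {
      id: number;
      authorId: number;
      title: string;
    }

    // Pure helper: plain arguments in, plain data out. No request/response objects.
    function filterPostsByAuthor(posts: Post[], authorId: number): Post[] {
      return posts.filter((post) => post.authorId === authorId);
    }

    // The API handler stays a thin wrapper around it, roughly:
    // app.get("/posts", (req, res) => {
    //   res.json(filterPostsByAuthor(allPosts, Number(req.query.authorId)));
    // });

    // The unit test hits the helper directly -- no API mocking, no framework test setup.
    test("keeps only the requested author's posts", () => {
      const posts: Post[] = [
        { id: 1, authorId: 7, title: "first" },
        { id: 2, authorId: 8, title: "second" },
      ];
      assert.deepEqual(filterPostsByAuthor(posts, 7), [{ id: 1, authorId: 7, title: "first" }]);
    });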
There are a few patterns in my prompts but nothing that feels out of place. I do not use agent files or MCPs. All sources are here: https://github.com/brainless/nocodo (the product is itself going through a pivot, so there is that).
There could be massive differences in quality between LLMs.
It's almost like companies that want to sell you shit are only interested in the parts of the story that benefit them.
I suppose the tl;dr is if you're generating bugs in your flow and they make it to prod, it's not a tool problem - it's a cultural one.
1. "Write a couple lines or a function that is pretty much what four years ago I would have gone to npm to solve" (e.g. "find the md5 hash of this blob")
2. "Write a function that is highly represented and sampleable in the rest of the project" (e.g. "write a function to query all posts in the database by author_id" (which might include app-specific steps like typing it into a data model)).
3. "Make this isolated needle-in-a-haystack change" (e.g. "change the text of such-and-such tooltip to XYZ") (e.g. "there's a bug with uploading files where we aren't writing the size of the file to the database, fix that")
I've found that it can definitely do wider-ranging tasks than that (e.g. implement all API routes for this new data type per this description of the resource type and desired routes); and it can absolutely work. But the problems I run into:
1. Because I don't necessarily have a grokable handle on what it generated, I don't have a sense of what it's missing and what follow-on prompts are needed. E.g.: I tell it to write an endpoint that allows users to upload files. A few days later, we realize we aren't MD5-hashing the files that got uploaded; there was a field in the database & resource type to store this value, but it didn't pick up on that, and I didn't prompt it to do this, so it's not unreasonable. But oftentimes when I'm writing routes by hand, I'm spending so much time in that function body that follow-on requirements naturally occur to me ("Oh that's right, we talked about needing this route available to both of these two permissions, crap, let me implement that"). With AI, it finishes so fast that my brain doesn't have time to remember all the requirements.
2. We've tried to mitigate this by pushing more development into the specs and requirements up-front. This is really hard to get humans to do, first of all. But more critically: None of our data supports the hypothesis that this has shortened cycle times. It mostly just trades writing typescript for reading & writing English (which few engineers I've ever worked with are actually all that good at). The engineers still end up needing long cycle times back-and-forth with the AI to get correct results, and long cycle times in review.
3. The more code you ask it to generate, the more vibeslop you get. Deeply-nested try/catch statements with multiple levels of error handling & throwing (see the sketch after this list). Comments everywhere. Reimplementing the same helper functions five times. These things, we have found, raise the cost and lower the reliability & performance of future prompting, and quickly morph parts of the system into a no-man's-land (literally) where only AIs can really make any change; and every change, even by the AIs, gets harder and harder to ship. Our reported customer issues on these parts of the app are significantly higher than on others, and our ability to triage these issues is also impacted because we no longer have SMEs who can just brain-triage issues in our CS channels; everything now requires a full engineering cycle, with AI involvement, to solve.
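As an invented illustration of that nesting pattern (not code from our product), the shape looks something like this:

    // Invented example of "vibeslop" error handling: every layer catches, logs,
    // re-wraps, and rethrows, so the original failure ends up buried three levels deep.
    async function writeToDatabase(user: { id: number; name: string }): Promise<void> {
      // Hypothetical stand-in for a real persistence call.
    }

    async function saveUser(user: { id: number; name: string }): Promise<void> {
      try {
        try {
          try {
            await writeToDatabase(user);
          } catch (err) {
            console.error("DB write failed", err);
            throw new Error("Database error");
          }
        } catch (err) {
          console.error("Save step failed", err);
          throw new Error("Save failed");
        }
      } catch (err) {
        console.error("saveUser failed", err);
        throw err;
      }
    }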
Our engineers run the spectrum of "never wanted to touch AI, never did" to "earnestly trying to make it work". Ultimately I think the consensus position is: It's a tool that is nice to have in the toolbox, but any assertion that it's going to fundamentally change the profile of work our engineers do, or even seriously impact hiring over the long-term, is outside the realm of foreseeable possibility. The models and surrounding tooling are not improving fast enough.
The article says problems, not bugs. "Problems" seems to include formatting or naming issues. I couldn't find 1.7x bugs specifically; I only see 3 mentions of bugs, with no number attached.
This is ... not actually faster.
I think I've seen so much rush-shipped slop (before and after) that I'm really anxiously waiting for this bubble to pop.
I have yet to be convinced that AI tooling can provide more than 20% or so speedup for an expert developer working in a modern stack/language.
The real kicker is when someone copies AI-generated code without understanding it, and then 3 months later nobody can figure out why production keeps having these random issues. Debugging AI slop is its own special hell.
LLMs in coding feel similar. They don’t magically remove the need for tests, specs, and review; they just compress the time between “idea” and “running code” so much that all the missing process shows up as outages instead of slow PRs. The risk isn’t “AI writes code”, it’s orgs refusing to slow down long enough to build the equivalent of traffic lights and driver’s ed around it.
All websites involved are vibe coded garbage that use 100% CPU in Firefox.