Hacker News

The Eternal Sloptember

476 points by razin

by cafkafk

11 subcomments

I think a lot of the problem with the current discourse is how black-and-white it is. Either you're a luddite or "ai pilled".
In most cases, LLMs can get you 80-95% of the way, sometimes less, sometimes more. And heck, sometimes, it just gets you somewhere wrong.
But it seems everyone is arguing about whether LLMs can be perfect software engineers in isolation running in a closet, and using that to say that LLMs do not have a massive potential in other scenarios.
Sometimes, I like to imagine how much more productive most organizations could be from the things that the internet gave us, even to this day. Most companies never really do even a fraction of what is possible. That helps to ground my view of LLMs as well.
The fault dear Brutus isn't in our language models, but in ourselves.

by Nition

3 subcomments

With the level of ability that AI is at right now, I've found it useful personally to think of it as something like a very good search over existing knowledge. Another step up in searchability in the lineage of reference books, Stack Overflow, GitHub etc.
Programmers are rewriting and reinventing the same techniques more often than any other vocation I can think of, and so we were primed for a really good search over prior art. The fact that AI can also adapt that prior art to your particular use case makes it even more powerful.
Much like how great success never came from cobbling together various bits of copy-pasted code from Stack Overflow though, current AI can't really build your whole project.

by SCdF

1 subcomments

So currently there are people who are buying grey market peptides[1], marked "not for human consumption" and injecting themselves with them based on dubious anecdotes and vibes, to make their skin clearer, build muscle mass, and so on.
Are they all suddenly turning into zombies? No. Do they have any real idea what that is going to do to their body a few years down the line? Also no. Could it be catastrophic? Maybe!
I think about this when I think about how violently much of the industry has pivoted into AI being the primary generator of code in the last 6ish months. AI is the peptide, your codebase[2] is the body. Literally no one knows how maintainable this approach is, because there simply hasn't been enough time to find out. It could be fine. It could be a complete mess, with your entire engineering team falling asleep at the wheel, lulled into thinking they understand what is being built when they don't, completely impotent to fix or maintain it once the LLM is no longer able to.
[1] https://www.bbc.co.uk/news/articles/cdr268m5pxro
[2] Well, _their_ codebase. I've stopped doing it with my own personal codebases, unless I genuinely don't care about maintainability or longevity

by Balinares

0 subcomment

One under-discussed phenomenon here, I think:
The hardest thing in software engineering is solving the right problem. The ability to identify the right problem to solve is, IMO, what distinguishes the top senior engineers. And we could have endless discussions about what constitutes the right problem, but for the sake of this discussion, let's reduce it to: the problem whose resolution adds the most value to the product for the amount of complexity and afferent costs that it incurs.
Once upon a time, long ago, I worked on a Web product whose original junior designer had figured it would be neat to be able to manage the backend with LDAP tools. So the database schema and structure that the product used mimicked that of OpenLDAP, with compound CN keys, and the entire codebase had to deal with that structure whenever reading from or writing to the DB. LDAP compatibility was not the right problem to solve when designing the DB schema.
But software that solves the right problems can be hard to identify because, quite often, how it does things seems so obvious that it's not readily apparent what other designs might have been chosen.
Now, the thing that usually keeps the blast radius of wrong-problem designs limited over time, is the very friction that they introduce. Development slows down, including the development of more wrong-problem designs. It's a self-limiting phenomenon.
And that's one major thing which worries me about LLM coding agents:
They paper over this friction. They don't repair it; they just make it so its cost is deferred.
So you gradually end up with codebases that grow unboundedly complex for the value they provide, with no controlling mechanisms.
You end up with juniors who never face the feedback loop from which they'd develop the engineering instincts and the taste for what makes a problem the right problem to solve in a given design.
At scale, as a field, you might end up forgetting there ever was such a thing as solving the right problem.
And I don't know what to do about that. Plan for an early retirement, maybe.

by thedjpetersen

4 subcomments

Part of my job is working on trying to make these models productive for the large corporation I work for. It's a lot of throwing tomatoes at a wall, and to a degree I see the issue he is talking about: output seemingly having a certain ceiling.
At the same time, nowhere in his post is there any code snippet or anything to latch on to of "the model performed poorly here when it should have done this" - this style of criticism seems to be a pattern of most of these "the LLMs will never work" style posts on blogs and twitter.
They obviously can perform better than autocomplete, and in my own day to day development they build out huge portions of a codebase at the level I would have expected from a junior or midlevel engineer.
How are we really supposed to grasp their actual capabilities when no one will actually cite specifically what mistakes they are making?

by farhanhubble

10 subcomments

I'm in the "haven't written any code in a while" boat ATM. I'd love to see examples of issues that are so big that they warrant reverting to manual coding.
My main issue has been the inconsistent quality across model releases and the tendency to insert older APIs or documentation, especially with command line tools.
I can understand if the model struggles with a million line monolithic codebase with a decade of cruft but can't think of why it'd be too much of a pain with new codebases.

by pickleRick243

2 subcomments

Agent harnesses have barely been available for a year, and reasonably reliable for only half, and there's already fatigue. I think this says less about whether LLMs will actually be able to program and more about how mentally exhausting AI-assisted programming can be, which involves a higher frequency of decision making and reading an astronomical amount of both code and prose if you actually want to stay on top of what the agent is doing to your codebase. This personal/psychological exhaustion and negative sentiment is now being inaccurately transferred into a pessimistic prognosis for the advancement of the technology itself.

by mountainriver

5 subcomments

My guess is the models just continue to get better and better
When I got into agentic coding a year or two ago I was sure it was only good at autocomplete. Something happened earlier this year where the models hit a new level of capability.
Everyone I know now just does agentic coding, and it’s really amazing. I think we should just try pushing this as far as we can possibly go, it really feels like the acceleration of the human race is upon us.

by tptacek

0 subcomment

AFL didn't find more vulnerabilities than LLMs. AFL and skilled practitioners found vulnerabilities. AFL triggers faults, many (most?) of which aren't exploitable, and humans (or, now, agents) have to triage and evaluate them. And they did so in a pre-AFL corpus of memory-unsafe software. The heyday of AFL was a decade ago. Every target is harder now.

by decimalenough

0 subcomment

For context: the author is George "geohot" Hotz, who has a long list of exploits, likely the best known of which is basically vibe coding (I mean that in the nicest possible way) comma.ai for autonomous cars on a shoestring budget long before actual AI vibe coding was a thing.
https://en.wikipedia.org/wiki/George_Hotz

by WeaselsWin

0 subcomment

A pattern i'm seeing is that people working with low level stuff (USB <-> PCIe chip reverse engineering) are much more often refusing the pill. My career is essentially cooking an unholy amount of bog standard nodejs CRUD. And i think i definitely represent a bigger segment than low level gurus, and for me AI is irreplaceable.

by c0rruptbytes

0 subcomment

ai agents can program, in fact, in our current time with current models, i'd say they program better than most people in the industry (an industry where people were literally copying and pasting from stack overflow for years prior)
being able to program is not the only skill required to be a successful software engineer, so no ai agents cannot be software engineers
very important distinction - i personally like the radiologist example - looking at scans is a part of a radiologist's job, AI can do it better than most of them, but looking at scans is a small part of the job, most of it is communicating with doctors to help their patients

by sandruso

0 subcomment

Not reviewing outputs, which is my main issue, is a one-way street to a subpar experience. No amount of "make it right" will fix that.
I hope that professionalism still matters, as these new ways of doing things strike me as unprofessional as f...
Yeah, the next macOS will be worse... time to place a bet on a prediction market

by nilirl

4 subcomments

I agree that I can write better code than an agent.
But it can write working code much faster than I can.
And in a lot of cases, unfortunately, faster beats better.

by petterroea

1 subcomments

I think geohot is somewhat of a clown but I think he is speaking reason here and I'm happy to see voices address this. Most seniors I work with agree.

by linsomniac

1 subcomments

>But each time I suspected I could have done it better and faster manually.
I've heard this said so many times, but my experience has just been so dramatically the opposite that it rings false. But geohot seems to be a pretty productive and smart guy, so it's hard to just dismiss what he's saying.
I get the sense that he's truly one of the 10x engineers. And maybe he can do it faster and better manually. But for those of us who aren't 10x, I think it lets us bridge that gap. Now we're getting back to "status anxiety": is this an attack on his ego, if the average becomes 10x?
Anecdote: Over 2 weeks of spare time, I used AI tooling to build a fairly sophisticated debian package caching proxy server (~72KLOC, 27K implementation, 45K tests). This would have easily taken me 6 months of focused time to implement by hand. I literally couldn't have done it because I can't take that much time off work and I have other weekend/evening obligations.

by bmenrigh

0 subcomment

Every C program I've had codex write ended up costing me more time than had I just done it from the start myself. Whereas almost every Python program it's written for me saved me time, even including the time I spent cleaning it up.
I chalk this up to primarily two reasons. First, I cared a lot more about the implementation details of the C program than I did the Python one, and second, it's just better at simple stand-alone Python programs than it is at C programs.
The criterion I now use is "do I care about the implementation details of this?". If I do (because for example it's going to be long-term code that I need to maintain) then the agent likely isn't worth it. But if I don't, there are huge efficiency gains to be had using the agent.

by luodaint

0 subcomment

Data from six months of production from one SaaS codebase provides a more limited response. Maintainability doesn't depend on the level of AI usage. Maintainability depends on the discipline during diff reviews. Good sessions: One topic per session; scope defined prior to the agent starting; all diffs read prior to committing. Poor sessions: Broad scope; undefined constraints; rubber-stamped results.
The quality of the codebase decays precisely at the rate you stop reading the results. This is not an issue of AI writing the code. This is an issue of unreviewed code. geohot's issue is entirely valid. This problem does exist. But this isn't dependent on the generation phase.

by baq

1 subcomments

Wonder if LLMs in autoresearch loops would be able to complete tasks geohot has in mind in say 100x average token budget.
If the answer is yes, the argument doesn’t matter: you just run the loop and wait for llm analog of moore’s law to get costs down.

by dalemhurley

0 subcomment

When I started coding with AI I would copy / paste code into GPT-3.5 and ask it to update the code, it was a massive productivity boost, minor changes, fully reviewed. Then VSCode allowed tabbing, it was okayish, but I had my finger on the pulse and knew exactly what was changed and had an opinion on the suggestions. Then Cursor allowed you to see and approve changes, after a few changes it started making bigger changes but had a review and approve process, things were starting to feel more magic and required more discipline to be on top of changes. Then YOLO mode hit and you could make massive changes, slowly it became easier and easier to just let the AI build code and you just guide it.
The issue is people mix up complexity, novelty, repeatability and scale.
Well documented complex problems can easily be solved by LLMs.
Doing the same thing over and over again is easy for an LLM.
Novelty and scale are very hard for an LLM.
Even small novel problems confuse LLMs.
When you start a new code base the LLM smashes through the boilerplate work. Then when it gets to scale it struggles with context rot plus novelty.

by Erenay09

0 subcomment

While I was reading this post, Anthropic sent me an email with the subject line "Your account has been suspended". What a coincidence :D

by athrowaway3z

2 subcomments

This post hits the nail at a bit of an angle.
The AI agents are great, and any expert can prompt them correctly to get good code. LLMs occasionally pick wrong patterns and start digging a hole, but this is why an expert is required. The code itself is just not worth writing when a detailed prompt can get you the same code typing 20x less text.
Where I agree with the post is:
The adoption of AI agents into software engineering is a problem. Solo projects are great, but our teams have not adjusted to the speed-of-change to a mental model of a project. So I see orgs making a choice to either: slow down or forgo the shared mental model.
Anybody choosing to forgo the mental model is building crooked legacy slop at scale. You can and should save the mental model to an AGENTS.md, but devs need it in their brain to prevent the digging a hole behavior.
To be fair the digging a hole behavior is something humans do just as well. But in teams you'd communicate enough to catch it - hopefully[1]. It's the combination of higher speeds and teams that's creating a bit of a disaster.
I'm not sure what a good solution is either. There is a case for solo devs running for 2-month sprints with much more freedom. Perhaps we'll have an "AI Agile manifesto" within a year.
[1] Though you should not underestimate the amount of poor code being created before LLMs. There are enough teams for whom LLMs are practically all upsides. Stay very far away from those.

by totetsu

0 subcomment

>When people see an artifact, they make assumptions about the process that was used to create it. Without even thinking about it, they assume the creator had a basically human state of mind. This assumption is no longer true. Things can be broken in ways that weren’t previously possible, and old proxies of underlying quality like syntax and grammar are useless. AI produced artifacts are not produced by the same process as human ones, and this difference, while extremely subtle in statistics, makes itself obvious when you try to interact with and build on the artifact in human ways.
Once, humans just had oral language, and we could use words to pass ideas from one human mind to another. Then, with writing, ideas could pass to minds that weren't immediately close together in space or time, and with this we made a complex, globe-spanning civilization. When words just become noise, and one has to be suspicious of each one as to whether it's coming from another human mind or just a statistical process, can this civilization even survive?

by teo_zero

1 subcomments

When digital cameras replaced traditional ones, we thought it would make photography more democratic: each of us would be Helmut Newton for 15 minutes. But it didn't give us the beautiful portraits and inspired landscapes we expected, only millions of pictures of food.
How much will it take for AI agents to pass from distilling decades of collective wisdom to copying each other's worst mistakes?

by eadwu

0 subcomment

I do think that between luddite or "ai pilled" ai usage should be much more in favor of "ai pilled".
However, this isn't a plug to be using AI for coding everything, but a more general plug that AI should be integrated to a lot more things outside of the mainstay of chatbots.
There is a lot of merit to using AI to establish a new abstraction layer.

by hmontazeri

0 subcomment

When a blog like this goes completely black or white on a topic I get skeptical. Nothing in life is 0 or 1. So is AI. Has some good to it and some obvious issues. All not that big of a deal. Ppl try to position themselves on the edges bc that’s what polarizes and engages conversation…

by jhanschoo

1 subcomments

I don't think Geohot has a good idea about LeCun and Hutter's views on the limitations of LLMs. I think that on abstract, textual domains, LLMs perform superbly, and they would agree. I am not too well-informed about LeCun and Hutter's views either, but I think that:
LeCun thinks that LLMs are a bad fit for AI that understands the physical, dynamical systems that we inhabit, and that understanding this is necessary for AGI/ASI.
I don't know that Hutter is bearish on LLMs, but Hutter is interested in AI that can reason exceptionally well given infinite compute, and approximations of such a reasoning AI. I think he is open to the idea that LLMs can be such an approximation.

by fluxusars

1 subcomments

I think the ai-as-an-exoskeleton analogy is quite apt: 1. It still requires your intent to move in the right direction rather than being fully autonomous, and 2. It's too big and bulky to use for delicate tasks.

by StefanSko

0 subcomment

Very much agree. All these currently hyped workflows removing the human from the loop attribute a modularity to those agents that just does not hold up. They will always be leaky abstractions given their stochastic nature. That being said, they can be a great tool for getting past the "blank page" and just start or getting unstuck in general.

by intended

2 subcomments

If nothing else, Eternal Sloptember is a term that seems obvious once you have it. I can’t believe this is the first time I’m seeing it.

by fagnerbrack

0 subcomment

"Things can be broken in ways that weren’t previously possible" and also "Things can work in ways that weren’t previously possible". It all depends on what the use the tool for, if you're a carpenter you're going to do a bad job regardless if you have a fancy hammer or a basic one. If you're an expert, give them a basic hammer and they'll do the work, give them a fancy hammer and they'll do the same, perhaps a little faster (or not).

by anabis

0 subcomment

Yet the Eternal September is what made the modern AI possible. I asked Cowork for share of the corpus before / after the event, and the corpus before it is <1%, which fits my hunch.

by webprofusion

0 subcomment

I don't think you can go completely hands-off for quality products but you can relax and let the agent do as much as possible. It does enable things that probably wouldn't have happened otherwise.
If you are already comfortable with letting other devs work on features then it's easier, because it's similar (arguably you have more control with AI, because what you say goes regardless of hierarchy).

by m132

0 subcomment

Didn't expect this to come from him. Seeing some of his recent YouTube streams and previous blog posts, he seemed to have unconditionally bought into the idea of vibecoding, even as he had Opus 4.5 (latest at the time) stuck failing to enumerate a serial device for solid hours. What a turn.

by sgarrity

0 subcomment

"When people see an artifact, they make assumptions about the process that was used to create it. Without even thinking about it, they assume the creator had a basically human state of mind. This assumption is no longer true."
I've been running into this experience with non-code artifacts, like slideshows and documents.

by albinn

1 subcomments

> It’s definitely a better Google for most searches
I can't agree with this. You tend to get one point of view, often without any actual resources and references so you have to go look it up yourself, on [insert search engine]. Plus, what does it say when we consider an AI the one stop for our data intakes.

by jrvarela56

0 subcomment

how do you measure if google’s engineering org is more productive than meta’s? What about comparing 2 startups/small teams?
I think the discussion about methods (coding agents included) depends on answering those questions. Seems pointless to claim these agents [dont] make you more productive.
Although, at a first glance, the productivity increase does seem like nothing I’ve seen before. Even more than the transition of making webapps in plain js -> jquery -> frameworks or going from something like Flask to using Rails.
Problem is this is not evidence based. I just feel prototyping has sped up 100x. So the number of iterations/attempts has gone up. Transforming specs into a test suite takes a fraction of the time. Dunno, feels weird not to be able to be overall more productive (do more with less time) if you have these new tools.

by gojomo

0 subcomment

Smart guy but whoever eventually actually fixes X search will probably use AI coding assistance to do it.

by p0w3n3d

1 subcomments

```
  But each time I suspected I could have done it better and faster manually
```
There is a class of tasks that can't be done faster manually, unless you're some sort of colour-smells-like-chicken-and-numbers-have-taste genius. And there is another class (my suspicion now is any non-standard task+framework) that are slower than using agents. So I can imagine you have excellent experience with some tasks like USB hacking and would do it faster than an LLM. On the other hand, for me, as a Java developer, hacking a USB is finally possible with an LLM. Otherwise I'd need to stop-and-learn for some time, which I wouldn't, so either I'd buy more expensive hardware that fulfills my requirements, or put the USB reverse engineering project on my 100 acre todo list.

by mehdix

0 subcomment

I have started to think of engineers who have entirely replaced their critical thought processes with AI agents as AI proxies.

by sometimelurker

1 subcomments

AI labs should put some incentives in their RL to make their models write shorter code so it's easier to check

by makerofthings

1 subcomments

I wish these posts that talk about non-human mistakes that agents make would post some examples. They would be interesting to see.

by protocolture

1 subcomments

Eh but statistical models are obviously useful, because statistically 99% of your codebase won't involve new idea invention. Tools that write all the boilerplate code used to have names and job titles.
I hate how both the for and against case for LLMs are just so bloody terrible at addressing these things.

by palla89

0 subcomment

I don't know why but I feel happy and relieved reading a piece of this written by Geohot himself.

by jreynar

0 subcomment

Another problem with perception of AI tools, for coding and other things, is that people often adopt a one-size-fits-all view. If Claude/Codex whatever can fix a bug in my tiny hobby project then it's going to revolutionize all software engineering. If it can write a haiku, then the great American novel will be dead in a few years and the novelists will starve.
There aren't many truly general purpose tools so viewing things this way seems like either a fantasy or an over-reaction. And if nothing else the processes we use will have to change along with the tools.
It's the early days so we still have a lot to figure out but one of the most significant is which tools are appropriate for what sort of tasks. I've had good luck refactoring a small code base, building some small hobby projects and building features for our company's product. But, I've also dodged bullets doing greenfield development on some features where Claude (my default) has made what seemed like sound choices early on, and which I approved of, only to build something fragile or with unforeseen consequences. I haven't quite figured out what distinguished those situations from the successful ones but I'm trying. But it's complicated by the fact that things are evolving quickly and yesterday's failure mode isn't the same as today's and, for that matter, yesterday's successes aren't guaranteed to be repeatable today.

by zarzavat

1 subcomments

It really feels like a mass psychosis. I'm not an AI sceptic insofar as I fully expect to get replaced by some future AI system. But what we have now isn't it.
To use a Geohot-inspired analogy, what we have now is like the Google self-driving car of 2010. It works most of the time, yet sometimes fails in unpredictable ways. So you need a safety driver behind the wheel to constantly watch what it's doing (the code review).
A real AI agent would not need a safety driver. We don't have that but many people are basically saying "fuck it, I'm just going to set this car off on its own and see what happens". And sure if you're prototyping it's not dangerous. But for production systems that is dangerous.

by rakel_rakel

0 subcomment

> The bottom performers won’t have that self check. They are the ones producing 10x output with the agents. What do you think is happening to the average output of that organization?
Nailed it!
At my last place this was encouraged (by non-technical leadership driving the AI adoption policies, as well as setting salaries) and seen as a huge win.
The "step change in number of created PRs" was celebrated (cult-style), and praised by one of the co-CEOs as a paradigm shift of the same magnitude as the personal computer. Meanwhile, I was stuck finding insta-reject level bugs in pull requests from people one-shotting 6000-line PRs "finally solving" long-standing issues from the backlog. Needless to say, I left.

by practal

1 subcomments

On Saturday I thought I had vibe coded myself into a mess. I had implemented a new block type in my structured editor for Practal Zero (or rather let Codex do it), and suddenly the syntax highlighting broke in the whole document. Asking Codex to fix it didn't work. I was contemplating restarting the whole project on a basis that I actually fully understand, but that would set me back so much when the first reasonable prototype seemed so close. Instead, I took a walk.
See, the project actually has a well thought out structure that I designed carefully, but more and more of it gets filled out by Codex. Codex is not smart enough to remember all the high-level design considerations, some of which had not been documented because I was just implicitly assuming them. So the fix was to use Codex to isolate the error, think about it in terms of the high-level design, and fix the problem, which was partially an implementation problem, and partially a problem of the high-level design.
I fixed the high-level design through discussions with Codex, documented it, and then let Codex implement the fixes. The discussion took me more than an hour, the implementation was done in a few minutes.
This working style is similar to doing math: You have a high-level idea of what you are doing, and let that guide you, and Codex assumes the role of something that fills out all of the details you take for granted. Often it turns out your high-level idea had flaws, and this shows up in your code not working as expected. So you revise your high-level idea, refactor the code to reflect the modified high-level design, rinse and repeat.
Working this way is still really hard, but it allows me to do things I could not have done before. Getting your ideas validated (or refuted) in minutes instead of days is huge, and makes it possible to march through stuff that would have turned into a deadly swamp before, at least for me.
Now. Do I think that most corporate programmers will use Codex or CC in this way? I don't know, but I think probably not. So what will stop them going into the swamp until it swallows them, instead of backing up in time and marching around it?

by forgetfreeman

1 subcomments

"It’s definitely a better Google for most searches"
This is dangerously incorrect. AI summaries of search results consistently return incorrect information and grossly oversimplified and thus misleading summaries, neither of which are detectable unless one either has prior domain knowledge or spends time drilling into search results to validate the AI output.

by Chaosvex

1 subcomments

> and it’s taking longer and longer to realize that they can’t
For something to take "longer and longer" to realise, doesn't that imply that it's been realised at least once before or that there was an expected deadline for the realisation?
Okay, that's a nitpick.

by thecatapps

1 subcomments

Wake me up.... when sloptember ends.

by mattlondon

1 subcomments

It's just a tool. Use it well or use it badly - just the same as any. If you are generating slop using the tool, well, then that is your own problem.
For me, the AI is essentially "faster hands" that can type what I am thinking way faster than I can do it. I tell it what I want, I give it the broad architecture and design patterns/types to use, and any specific test conditions, and let it write all of that usually by the time I have responded to a single email or chat message or two. Custom instructions etc. build up over time to address model blind spots or my own personal taste so I don't have to repeat myself in every prompt for cross-cutting things.
Does it "one shot it"? Almost never - we go around the cycle a few times, treating it like pair programming a junior or intern by keeping a close eye on the broad direction and making sure it is acceptable - course-correcting where it matters, but cutting some slack where it doesn't. Sometimes I ask it why it picked a particular approach (that I wouldn't have necessarily) and it gives me a cogent explanation and we go with it, so I actually sometimes learn new things from it too which is great.
The other use case is just its sheer capacity to research a codebase and hold everything in its attention at once. It can comprehend unfamiliar code way faster and way more in-depth than I can. So if you are in an unfamiliar code base or a language or framework you are not that familiar with, it absolutely shines because it can just absorb all that info in seconds, and then you can just drill it with questions and what-abouts and how does it do this and what technique is used for that and that, what are the existing patterns and norms in this codebase when it comes to foo or bar? Etc etc
What I am not doing is deferring everything off to the AI unless it really doesn't matter (e.g. disposable one-off or prototype code). Same that I would not expect a junior or intern to make big architectural decisions when implementing something - you keep them on a fairly close leash and watch what they are up to.

by big-chungus4

2 subcomments

Why Sloptember when it's May?

by olalonde

0 subcomment

Prediction: the author will wish the Internet Archive didn't exist in a few years.

by 4b11b4

0 subcomment

It depends

by vasco

0 subcomment

> And whenever you need a quick prototype and don’t care about polish, it is absurdly fast. But is it a software engineer? Not close to the bar at any company I have worked at.
This line, which he wrote, will override any quality gaps, because the cost to produce that shitty software will be lower than the cost to produce good software.

by bassiee

0 subcomment

My point is, before LLMs, 90% of the code was already human-made slop; now it's just going toward computer-generated slop instead.

by pipeline_peak

0 subcomment

The more specific your work is, the more these LLMs seem to struggle.
If your work was previously googling Stack Overflow, they can be incredibly useful at working through that. Which, let’s face it, is what a lot of us do.

by simianwords

2 subcomments

People misunderstand how AI is used in coding in normal work environments. A new feature requirement comes in. Maybe you need a new service or some new classes. You need to do some research first.
You guide the AI with some prompts and give it some guidance on how to scenario-test it. It makes some classes and test methods, maybe ~2000 lines, and you do a quick verification to check that the overall idea looks okay. You ask it to fix a few design things and then merge it.
It's much easier than doing it yourself, with all the boilerplate and the need to understand each esoteric language-specific thing. Which library do I use for UDP communication in golang? The agent might have made a good assumption. These kinds of things are where it speeds you up.

by fontain

4 subcomments

We all remember cryptocurrency. Everyone in tech proclaimed fiat was dead, every office buzzed with talk of every possible way that cryptocurrency could be used, and billions of dollars flooded into projects losing money hand over fist. The cynics reacted to the froth with outright rejection of the idea. And today… cryptocurrency exists and it has some use, but it didn’t take over the world and it didn’t kill fiat. It was useful in some areas and worthless in others. AI will be the same. The noisiest proponents will be exaggerating. The most cynical cynics will be underestimating. The result will be somewhere in the middle. Success will not be predicated on adoption of the technology. We nerds are bad at predicting the impact of technology.

by biosubterranean

0 subcomment

preach it

by dainiusse

2 subcomments

The horse is better than a car!

by slashdave

0 subcomment

> It is a golden era for buckets and buckets of slop, and a dark age for gems of quality.
I mean, this has been the trend for decades really, before LLMs were a thing. The incentive is skewed toward quantity rather than quality. The new tools just add more fuel to the fire.
Code quality is also really lacking in much of the industry. The truth is, these LLMs, as limited as they are, program at a level above that of the median junior programmer.

by mthrowaway

1 subcomments

Bro claims to write good code. He got fired from Twitter in <4 weeks. AI is hyped, but the code isn't that bad.

by cmrdporcupine

0 subcomment

"I don’t think models like this will ever be able to program,"
I don't get how anybody who has used the SOTA models in the last 3-4 months can write a sentence like this.
They most certainly can program. And usually better than 90% of my coworkers.
The question is really: can they engineer? By which I mean handle the duties of a software engineer working in a team: managing a large complex system, making reviewable pieces, making forward progress in incremental steps, etc.
No, that part I'm definitely more skeptical about. That requires slave driving by the person in front of the prompt.
But this is a useful distinction to make. Because making overly pessimistic claims about the coding capabilities of the models makes me question the author's experiences with them.
I think agentic tools are toxic to team programming culture and engineering that produces reliable stable results. But I wouldn't for the life of me question their ability to write programs.

by readthenotes1

0 subcomment

There's a time and a place for assembly language programming. Of course, I knew someone who would say there's a time and a place for machine language programming (improved it by reprogramming a device by flipping the 17 switches on the front panel)

by DeathArrow

0 subcomment

On the other hand we see success stories such as antirez using agents to work on Redis and Deepseek v4 flash inference.

by DeathArrow

2 subcomments

To me this sounds like an old cobbler complaining that machines aren't producing good shoes if left unsupervised and that the old process of making shoes completely by hand is far superior.
So what is he telling us? That agents are not infallible, that they are not capable of one-shotting complex software, and that they do not produce perfect code?
We know that, and the solution is to use agents for what they are good at, work around their limitations, and keep a human in the loop.
>not some RLVR shit that comments out the failing test and tells you all the tests are now passing
That's what harnesses should be about: detect when the agent is misbehaving and force it to take the right approach.
This example in particular should be easy to solve if we generate the tests before coding and have a workflow or state machine that doesn't allow the agent to disable tests and doesn't allow it to reach the next stage unless all tests are passing.
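To make that concrete, here is a rough Python sketch of such a gate. It is only an illustration under my own assumptions, not how any real harness works: the names (Harness, Stage, try_finish) are made up, the test files are passed in as plain strings, and run_tests is a stand-in for whatever test runner you actually use. The idea is to fingerprint the tests before the agent starts coding, then refuse to advance if the tests were edited or removed, or if they fail.

```python
import hashlib
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict


class Stage(Enum):
    TESTS_WRITTEN = 1
    CODING = 2
    DONE = 3


def fingerprint(files: Dict[str, str]) -> Dict[str, str]:
    """Hash each test file so any later edit or deletion is detectable."""
    return {
        name: hashlib.sha256(body.encode("utf-8")).hexdigest()
        for name, body in files.items()
    }


@dataclass
class Harness:
    """Hypothetical workflow gate: tests are frozen before coding starts."""

    test_files: Dict[str, str]
    run_tests: Callable[[], bool]  # stand-in for a real test runner
    stage: Stage = Stage.TESTS_WRITTEN
    locked: Dict[str, str] = field(default_factory=dict)

    def start_coding(self) -> None:
        # Freeze the tests as they exist before the agent writes any code.
        self.locked = fingerprint(self.test_files)
        self.stage = Stage.CODING

    def try_finish(self, current_test_files: Dict[str, str]) -> str:
        # The agent can only reach DONE from CODING.
        if self.stage is not Stage.CODING:
            return "rejected: not in coding stage"
        # Commenting out, editing, or deleting a test changes the fingerprint.
        if fingerprint(current_test_files) != self.locked:
            return "rejected: tests were modified"
        if not self.run_tests():
            return "rejected: tests failing"
        self.stage = Stage.DONE
        return "accepted"
```

With this, an agent that comments out the failing test gets "rejected: tests were modified" and stays in the CODING stage. A real harness would also have to stop the agent from touching the runner or the gate itself, which this sketch does not attempt.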

by spiderfarmer

1 subcomments

Coders underestimate the utility of AI in so many boring day to day tasks. If you freelance, that’s where the money is at, not in creating a startup that fills holes in AI offerings or in creating generic slop while hoping for ad money.

by simianwords

2 subcomments

Nah, this person is dead wrong. Let's come back in 2 years and check on it. I'm willing to make a reasonable bet on these terms: companies will go even more AI native, will use even more tokens and spend even more money.
EDIT: To people downvoting me, please come up with a reasonable bet and let's try to work it out.
EDIT 2: $500 bet paid to your account on whether LLMs are going to still be used productively or not. No one?
EDIT 3: Any bet that would express the author's argument in a way that can be disproven in the future.

by coolThingsFirst

0 subcomment

Geohot's next venture will be writing a book titled "Fear & Trembling".

by wyager

7 subcomments

> They are a highly sophisticated statistical model designed to mimic the distribution of programming
Are we really still doing this?

by blobbers

2 subcomments

I don't think LeCun is saying they won't be able to program. I think he says we won't hit AGI. Programming does not require AGI; it's a pretty specific skill!
-- I think this article is COPE, if I'm being quite honest. I thought of putting in cute analogies, like the C programmers saying the Python and JavaScript programmers are not "hardcore" enough... but the truth should be obvious to anyone using LLMs effectively.
-- Current AI is a much better programmer than 100% of people and when directed by someone in that top 10%, it's a force majeure.

by SebastianSosa

0 subcomment

A problem that's impossible to detect is indistinguishable from it working. Hence it works. Hence it's not a problem.

by bluegatty

1 subcomments

" Agents cannot program, and it’s taking longer and longer to realize that they can’t. They are a highly sophisticated statistical model designed to mimic the distribution of programming"
In other words - they can program, and probably better than you.
I don't like being too critical but this is a really superficial post - as if either 'AI is a Software Engineer - or - It must be Fraud'
It's an extremely powerful tool that is very 'pattern oriented' and with guidance can absolutely write great code - and even across modules given the right basis.
It's also great at so many other tasks - finding bugs in big code bases, doing migrations etc.
It's not going to make very good architectural decisions for you, and if you're doing anything novel you have to read most of the code ... but that's to be expected.