FRESH

Hacker News

Home

Codegen is not productivity

75 points by donutshop

by jwpapi

0 subcomment

I have to be honest. I’ve written a lot of pro-ai / dark-software articles and I think Im due an update, cause it worked great, till it didn’t.
I could write a lot about what I’ve tried and learnt, but so far this article is a very based view and matches my experience.
I definitely suffered under the unnecessary complexity and wished to never’ve used AI at moments and even with OPUS 4.6 I could feel how it was confused and couldn’t understand business objectives really. It became way faster to jump in code, clean it up and fix it myself. I’m not sure yet where and how the line is and where it will be.

by vinceguidry

0 subcomment

I recently started using AI for personal projects, and I find it works really well for 'spike' type tasks, where what you're trying to do is grow your knowledge about a particular domain. It's less good at discovering the correct way of doing things once you've decided on a path forward, but still more useful than combing through API docs and manpages yourself.
It might not actually deliver working things all that much faster than I could, but I don't feel mentally drained by the process either. I used to spend a lot of time reading architecture docs in order to understand available solutions, now I can usually get a sense for what I need to know just from asking ChatGPT how certain things might be done using X tool.
In the last few days, I've stood up syncthing, tailscale with a headscale control plane, and started making working indicators and strategies in PineScript, TradingView's automated trading platform. Things I had no energy for or would have been weeklong projects take hours or a day or so. AI's strengths synergize really well with how humans want to think.
I just paste an error message in, and ChatGPT figures out what I'm trying to do from context, then gives me not just a possible resolution, but also why the error is happening. The latter is just as useful as the former. It's wrong a lot, but it's easy to suss out.

by demorro

1 subcomments

A well considered article, despite the author categorizing it as a rant. I appreciate the appendix quotations, as well as the acknowledgement that they are appeals to authority.
Whilst the author clearly has a belief that falls down on one side of the debate, I hope folks can engage with the "Should we abandon everything we know" question, which I think is the crux of things. Evidence that AI-driven-development is a valuable paradigm shift is thin on the ground, and we've done paradigm shifts before which did not really work out, despite massive support for them at the time. (Object-Oriented-Everything, Scrum, etc.)

by nyrulez

5 subcomments

Bold claims that writing code was never the bottleneck. It may not be the only bottleneck but we conveniently move goal posts now that there is a more convenient mechanism and our profession is under threat.

0 subcomment

by eleventhborn

0 subcomment

I feel there is a set of codebases in which LLMs aren't showing the 2-10x lift in productivity.
There is also a set of codebases in which LLMs are one-shotting the most correct code and even finding edgecases that would've been hard to find in human reviews.
At a surface level, it seems obvious that legacy codebases tend to fall in the first category and more greenfield work falls in the second category.
Perhaps, this signals an area of study where we make codebases more LLM-friendly. It needs more research and a catchy name.
Also, certain things that we worry about as software artisans like abstractions, reducing repeated code, naming conventions, argument ordering,... is not a concern for LLMs. As long as LLMs are consistent in how they write code.
For e.g. One was taught that it is bad to have multiple "foo()" implementations. In LLM world, it isn't _that_ bad. You can instruct the LLM to "add feature x and fix all the affected tests" (or even better "add feature x to all foo()") and if feature x relies on "foo()", it fixes every foo() method. This is a big deal.

by ChicagoDave

4 subcomments

I continue to jump into these discussions because I feel like these upvoted posts completely miss what’s happening…
- guardrails are required to generate useful results from GenAI. This should include clear instructions on design patterns, testing depth, and iterative assessments.
- architecture decision records are one useful way to prevent GenAI from being overly positive.
- very large portions of code can be completely regenerated quickly when scope and requirements change. (skip debugging - just regenerate the whole thing with updated criteria)
- GenAI can write thorough functional and behavioral unit tests. This is no longer a weakness.
- You must suffer the questions and approvals. At no time can you let agents run for extended periods of time on progressive sets of work. You must watch what is generated. One thing that concerns me about the new 1mm context on Claude Code is many will double down on agent freedom. You can’t. You must watch the results and examine functionality regularly.
- No one should care about actual code ever again. It’s ephemeral. The role of software engineering is now molding features and requirements into functional results. Choosing Rust, C#, Java, or Typescript might matter depending on the domain, but then you stop caring and focus on measuring success.
My experience is rolled up in https://devarch.ai/ and I know I get productive and testable results using it everyday on multiple projects.

by greggyb

1 subcomments

Hey, author here. Never thought I'd see my pokey little blog on HN and all that.
Happy to discuss further.

by galbar

0 subcomment

This article describes the body of knowledge I was taught when I joined the industry and parallels with my experience with and thoughts about AI.
I have come to the realization that most people in the industry don't know this body of knowledge, or even that it exists.
I'm now seeing the same people trying to solve their ineffectiveness with AI.
I don't know what to think about this situation. My intuition hints at it not being good.

by autonomousErwin

0 subcomment

I think there's some goldilocks speed limit for using these tools relative to your skillset. When you're building, you forget that you're also learning - which is why I actually favour some AI code editors that aren't as powerful because it gets me to stop and think.

by jwilliams

2 subcomments

> Humans and LLMs both share a fundamental limitation. Humans have a working memory, and LLMs have a context limit.
But there’s a more important difference: I can’t spin up 20 decent human programmers from my terminal.
The argument that "code was never the bottleneck" is genuinely appealing, but it hasn’t matched my experience at all. I’m getting through dramatically more work now. This is true for my colleagues too.
My non-technical niece recently built a pretty solid niche app with AI tools. That would have been inconceivable a few years ago.

by jwpapi

1 subcomments

There is a saying you need to write an essay 3 times. The first time its puked out, the second is decent and the third is good.
It’s quite similar with code, and with code less is more. for try 1 and 2

by emp17344

4 subcomments

The collaboration aspect is what many AI enthusiasts miss. As humans, our success is dependent on our ability to collaborate with others. You may believe that AI could replace many individual software engineers, but if it does so at the expense of harming collaboration, it’s a massive loss. AI tools are simply not good at collaborating. When you add many humans to a project, the result becomes greater than the sum of its parts. When you add many AI tools to a project, it quickly becomes a muddled mess.

by swalsh

2 subcomments

Speak for yourself, I have never thrown away code at this rate in my entire career. I couldn't keep up this pace without AI codegen.

by avabuildsdata

1 subcomments

honestly the thing that trips me up is when codegen makes me feel productive but I haven't actually validated anything. like I'll have claude write a whole data pipeline in 20 minutes and then spend 2 hours debugging edge cases it didn't think about because it doesn't know our data
the speed is real but it mostly just moves where I spend my time. less typing, more reading and testing. which is... fine? but it's not the 10x thing people keep claiming

by gaigalas

1 subcomments

In practical terms, "productivity" is any metric that people with power can manipulate (cheating numbers, changing narratives, etc) to affect behavior of others to their interests.
ALL OF IT is meaningless. It's a pointless discussion.

by fulafel

0 subcomment

Productivity in econ means how many units of output you get, not if they are good. So codegen is productivity in this sens, but not what you want.

by slopinthebag

0 subcomment

It's so difficult to quantify productivity over an entire field, especially when it's so vast. Chris Lattner recently concluded this about LLM tooling [0]:
> AI systems can internalize the textbook knowledge of a field and apply it coherently at scale. AI can now reliably operate within established engineering practice. This is a genuine milestone that removes much of the drudgery of repetition and allows engineers to start closer to the state of the art.
This matches my experience, there is a lot of code that we probably should not need to write and rewrite anymore but still do because this field has largely failed at deriving complete and reusable solutions to trivial problems - there is a massive coordination problem that has fragmented software across the stack and LLMs provide one way of solving it by generating some of the glue and otherwise trivial but expensive and unproductive interop code required.
But the thing about productivity is that it's not one thing and cannot be reduced to an anecdote about a side-project, or a story about how a single company is introducing (or mandating) AI tooling, or any single thing. Being able to generate a bunch of code of varying quality and reliability is undeniably useful, but there are simply too many factors involved to make broad sweeping claims about an entire industry based on a tool that is essentially autocomplete on crack. Thus it's not surprising that recent studies have not validated the current hype cycle.
[0] https://www.modular.com/blog/the-claude-c-compiler-what-it-r...

by zer00eyz

2 subcomments

I went to look at some of the authors other posts and found this:
https://www.antifound.com/posts/advent-of-code-2022/
So much of our industry has spent the last two decades honing itself into a temple built around the idea of "leet code". From the interview to things like advent of code.
Solving brain teasers, knowing your algorithms cold in an interview was always a terrible idea. And the sort of engineers it invited to the table the kinds of thinking it propagated were bad for our industry as a whole.
LLM's make this sort of knowledge, moot.
The complaints about LLM's that lack any information about the domains being worked in, the means of integration (deep in your IDE vs cut and paste into vim) and what your asking it to do (in a very literal sense) are the critical factors that remain "un aired" in these sorts of laments.
It's just hubris. The question not being asked is "Why are you getting better results than me, am I doing something wrong?"

by nubg

3 subcomments

For me it's simple:
1. Assume you're to work on product/feature X.
2. If God were to descend and give you a very good, reality-tested spec:
3. Would you be done faster? Of course, because as every AI doomer says, writing code was never the bottleneck!!1!
4. So the only bottleneck is getting to the spec.
5. Guess what AI can help you with as well, because you can iterate out multiple versions with little mental effort and no emotional sunk cost investment?
ergo coding is a solved problem

by nubg

3 subcomments

rules of thumb for when to take blog posts about AI coding seriously:
- must be using the latest state of the art model from the big US labs
- must be on a three digit USD per month plan
- must be using the latest version of a full major harness like codex, opencode, pi
- agent must have access to linting, compilation tools and IDE feedback
- user must instruct agent to use test driven development and write tests for everything and only consider something done if tests pass
- user must give agent access to relevant documentation, ie by cloning relevant repositories etc
- user must use plan mode and iterate until happy before handing off to agent
- (list is growing every month)
---
if the author of a blog post about AI coding doesnt respect all of these, reading his blog posts is a waste of time because he doesn't follow best practices

by agent5ravi

3 subcomments

[flagged]

by aneyadeng

0 subcomment

[dead]

by some_random

5 subcomments

You can write cope like this all you want but it doesn't change the fact I can ship a feature in few days that previously would have taken me a few weeks.