FRESH

Hacker News

Home

Let's be honest, Generative AI isn't going all that well

229 points by 7777777phil

by mattmaroon

28 subcomments

Meanwhile, my cofounder is rewriting code we spent millions of salary on in the past by himself in a few weeks.
I myself am saving a small fortune on design and photography and getting better results while doing it.
If this is not all that well I can’t wait until we get to mediocre!

by tombert

8 subcomments

I find it a bit odd that people are acting like this stuff is an abject failure because it's not perfect yet.
Generative AI, as we know it, has only existed ~5-6 years, and it has improved substantially, and is likely to keep improving.
Yes, people have probably been deploying it in spots where it's not quite ready but it's myopic to act like it's "not going all that well" when it's pretty clear that it actually is going pretty well, just that we need to work out the kinks. New technology is always buggy for awhile, and eventually it becomes boring.

by 1a527dd5

4 subcomments

A year ago I would have agreed wholeheartedly and I was a self confessed skeptic.
Then Gemini got good (around 2.5?), like I-turned-my-head good. I started to use it every week-ish, not to write code. But more like a tool (as you would a calculator).
More recently Opus 4.5 was released and now I'm using it every day to assist in code. It is regularly helping me take tasks that would have taken 6-12 hours down to 15-30 minutes with some minor prompting and hand holding.
I've not yet reached the point where I feel letting is loose and do the entire PR for me. But it's getting there.

by dreadsword

2 subcomments

This feels like a pretty low effort post that plays heavily to superficial reader's cognitive biases.
I work commercializing AI in some very specific use cases where it extremely valuable. Where people are being lead astray is layering generalizations: general use cases (copilots) deployed across general populations and generally not doing very well. But that's PMF stuff, not a failure of the underlying tech.

by gejose

9 subcomments

I believe Gary Marcus is quite well known for terrible AI predictions. He's not in any way an expert in the field. Some of his predictions from 2022 [1]
> In 2029, AI will not be able to watch a movie and tell you accurately what is going on (what I called the comprehension challenge in The New Yorker, in 2014). Who are the characters? What are their conflicts and motivations? etc.
> In 2029, AI will not be able to read a novel and reliably answer questions about plot, character, conflicts, motivations, etc. Key will be going beyond the literal text, as Davis and I explain in Rebooting AI.
> In 2029, AI will not be able to work as a competent cook in an arbitrary kitchen (extending Steve Wozniak’s cup of coffee benchmark).
> In 2029, AI will not be able to reliably construct bug-free code of more than 10,000 lines from natural language specification or by interactions with a non-expert user. [Gluing together code from existing libraries doesn’t count.]
> In 2029, AI will not be able to take arbitrary proofs from the mathematical literature written in natural language and convert them into a symbolic form suitable for symbolic verification.
Many of these have already been achieved, and it's only early 2026.
[1]https://garymarcus.substack.com/p/dear-elon-musk-here-are-fi...

by daedrdev

1 subcomments

This post is literally just 4 screenshots of articles, not even its own commentary or discussion.

by saberience

0 subcomment

Gary Marcus (probably): "Hey this LLM isn't smarter than Einstein yet, it's not going all that well"
The goalposts keep getting pushed further and further every month. How many math and coding Olympiads and other benchmarks will LLMs need to dominate before people will actually admit that in some domains it's really quite good.
Sure, if you're a Nobel prize winner or PhD then LLMs aren't as good as you yet, but for 99% of the people in the world, LLMs are better than you at Math, Science, Coding, and every language probably except your native language, and it's probably better at you at that too...

by didibus

1 subcomments

Ignoring the actual poor quality of this write-up, I think we don't know how well GenAI is going to be honest. I feel we've not been able to properly measure or assess it's actual impact yet.
Even as I use it, and I use it everyday, I can't really assess its true impact. Am I more productive or less overall? I'm not too sure. Do I do higher quality work or lower quality work overall? I'm not too sure.
All I know, it's pretty cool, and using it is super easy. I probably use it too much, in a way, that it actually slows things down sometimes, when I use it for trivial things for example.
At least when it comes to productivity/quality I feel we don't really know yet.
But there are definite cool use-cases for it, I mean, I can edit photos/videos in ways I simply could not before, or generate a logo for a birthday party, I couldn't do that before. I can make a tune that I like, even if it's not the best song in the world, but it can have the lyrics I want. I can have it extract whatever from a PDF. I can have it tell me what to watch out for in a gigantic lease agreement I would not have bothered reading otherwise.
I can have it fix my tests, or write my tests, not sure if it saves me time, but I hate doing that, so it definitely makes it more fun and I can kind of just watch videos at the same time, what I couldn't before. Coding quality of life improvements are there too, I want to generate a sample JSON out of a JSONSchema, and so on. If I want, I can write the a method using English prompts instead of the code itself, might not truly be faster or not, not sure, but sometimes it's less mentally taxing, depending on my mood, it can be more fun or less fun, etc.
All those are pretty awesome wins and a sign that for sure those things will remain and I will happily pay for them. So maybe it depends on what you expected.

by thechao

0 subcomment

You're absolutely right!
The irony of a five sentence article making giant claims isn't lost on me. Don't get me wrong: I'm amenable to the idea; but, y'know, my kids wrote longer essays in 4th grade.

by kennyadam

0 subcomment

I’ve been using Claude Code, Gemini 3 Pro, and Nano Banana Pro to plan, code, and create custom UI elements for dozens of time-saving applications. For years, I have been searching high and low for existing solutions, but all I found were either overpriced cloud offerings that were bloated with endless features I didn’t need and just complicated the UI, or abandoned GitHub repos consisting of an initial commit and a roadmap that has been waiting eight years for its first update and what code was present was half baked and out of date. The reality is that my requirements are so specific to my workflow that until these latest models came along, building exactly what I needed in a matter of hours for a cost of $20 a month was inconceivable. Now I provide a description of what functionality I need, some sketches of the UI I made on my ipad with an apple pencil and after a bit of back and forth to get everything dialled in and I’ve created a bit of software that will save me dozens if not hundreds of hours of previously tedious manual work.

by emp17344

3 subcomments

Guessing this isn’t going to be popular here, but he’s right. AI has some use cases, but isn’t the world-changing paradigm shift it’s marketed as. It’s becoming clear the tech is ultimately just a tool, not a precursor to AGI.

by pj4533

1 subcomments

All I know is that I have built more in the past 10 months than I ever have. How do you quantify for the skeptics the mental shift that happens when you know you can just build stuff now?
COULD I do this stuff before? Sure. But I wouldn’t have. Life gets in the way. Now, the bar is low so why not build stuff? Some of it ships, some of it is just experimentation. It’s all building.
Trying to quantify that shift is impossible. It’s not a multiplier to productivity you measure by commits. It’s a builder mind shift.

by sghiassy

0 subcomment

LLMs help me read code 10x faster - I’ll take the win and say thanks

by smashed

1 subcomments

Should have used an LLM to proofread.. LLMs can still cannot be trusted?

by mythrwy

0 subcomment

It's going well for coding. I just knocked out a mapping project that would have been a week+ of work (with docs and stackoverflow opened in the background) in a few hours.
And yes, I do understand the code and what is happening and did have to make a couple of adjustments manually.
I don't know that reducing coding work justifies the current valuations, but I wouldn't say it's "not going all that well".

by moonshotideas

1 subcomments

How long do you think it will be until the “ai isn’t doing anything” people are going away 1 month, 6 months, I’d say 1 Year at the most, anyone who has used Claude code since Dec 1st knows this in their bones, so I’d just let these people shout from the top of the hill until they run out of steam…
Right around then, we can send a bunch of reconnaissance teams out to the abandoned Japanese islands to rescue them from the war that’s been over for 10 years - hopefully they can rejoin society, merge back with reality and get on with their lives

by jaffee

1 subcomments

What a joke this guy is. I can sit down and crank out a real, complex feature in a couple hours that would have previously taken days and ship it to the users of our AI platform who can then respond to RFQs in minutes where they would have previously spent hours matching descriptions to part numbers manually.
...and yet we still see these articles claiming LLMs are dying/overhyped/major issues/whatever.
Cool man, I'll just be over here building my AI based business with AI and solving real problems in the very real manufacturing sector.

by m463

1 subcomments

I see stuff like this and think of these two things:
1) https://en.wikipedia.org/wiki/Gartner_hype_cycle
or
2) "First they ignore you, then they laugh at you, then they fight you, then you win."
or maybe originally:
"First they ignore you. Then they ridicule you. And then they attack you and want to burn you. And then they build monuments to you"

by herunan

0 subcomment

First of all, popping in a few screenshots of articles and papers is not proper analysis.
Second of all, GenAI is going well or not depending on how we frame it.
In terms of saving time, money and effort when coding, writing, analysing, researching, etc. It’s extremely successful.
In terms of leading us to AGI… GenAI alone won’t reach that. Current ROI is plateauing, and we need to start investing more somewhere else.

by dkobia

0 subcomment

Preaching to the wrong choir. The HN community is reaping massive benefits from generative AI.

by siscia

0 subcomment

I think that the wider industry is living right now what was coding and software engineering around 1 year or so ago.
Yeah you could ask ChatGPT or Claude to write code, but it wasn't really there.
It needs a while to adopt the model AND the UI. As in software are the first one because we are both makers and users.

by rpowers

1 subcomments

I keep reading comments that claim GenAI's positive traits, but this usually amounts to some toy PoC that very eerily mirrors work found in code bootcamps. You want an app that has logins and comments and upvotes? GenAI is going to look amazing setting up a non-relational db to your node backend.

by Jadiiee

0 subcomment

It's more about how you use it. It should be a source of inspo. Not the end all be all.

by afspear

0 subcomment

Meanwhile I'm over here reducing my ADO ticket time estimates by 75%.

by mrbluecoat

1 subcomments

> LLMs can still cannot be trusted
But can they write grammatically correct statements?

by unwise-exe

0 subcomment

Meanwhile $employer is continuing to migrate individual tasks to in-house AI tooling, and has licensed an off-the-shelf coding agent for all of us developers to put in our IDEs.

by nojvek

0 subcomment

Gary Marcus again. The chief doomer of AI where goal posts keep on moving.
Almost everyone around me, even the primary school kids use ChatGPT/Perplexity/Gemini/Claude in some form on almost a daily basis. The daily engagement is v strong.
The models keep improving every year. Nano banana gets text spot on, human anatomy of digits and toes is spot on. Deep Research mode is mind boggling. All the major vendors have some form of voice interaction, and it feels pretty good. I use perplexity talk feature while driving to learn deep about a topic of interest.
The trend is strong, betting against the trend isn't wise.
I can paste entire books and ask questions about certain pieces. The context windows nowadays are wild.
Price per token keeps on dropping, more capability keeps on coming online.
Gary offers no solutions, just complaints.

by tom_m

0 subcomment

It's going well in terms of being a valuable tool. It's not going well from an economic point of view. There's going to be winners and losers in this bubble. Things will settle and it will be commonplace technology in the future. Not going anywhere. It's just over hyped right now.
Then you consider the massive spend in data centers, the ram shortage, etc. The writing is on the wall.

by robertclaus

1 subcomments

Odds this was AI generated?

by amw-zero

0 subcomment

I’m starting to think this take is legitimately insane.
As said in the article, a conservative estimate is that Gen AI can currently do 2.5% of all jobs in the entire economy. A technology that is really only a couple of years old. This is supposed to be _disappointing_? That’s millions of jobs _today_, in a totally nascent form.
I mean I understand skepticism, I’m not exactly in love with AI myself, but the world has literally been transformed.

0 subcomment

by anarticle

1 subcomments

Download models you can find now and forever. The guardrails will only get worse, or models banned entirely. Whether it's because of "hurts people's health" or some other moral panic, it will kill this tech off.
gpt-oss isn't bad, but even models you cannot run are worth getting since you may be able to run them in the future.
I'm hedging against models being so nerfed they are useless. (This is unlikely, but drives are cheap and data is expensive.)

by meowface

3 subcomments

How on Earth do people keep taking Gary Marcus seriously?

by bawolff

0 subcomment

Holy moving goal posts batman!
I hate generative AI, but its inarguable what we have now would have been considered pure magic 5 years ago.

by wewewedxfgdf

0 subcomment

Haters gonna hate.

by fortyseven

0 subcomment

I've just started ignoring people like this. You think everything's going bad? Okay fine. You go ahead and keep believing that. Maybe you could get it printed on a sandwich board and walk up and down the street with it.

by joshcsimmons

0 subcomment

Huh?
Seems like black and white thinking to me. I had it make suggestions for 10 triage issues for my team today and agreed with all of its routings. That’s certainly better than 6 months ago.

by billsunshine

0 subcomment

a historic moron. Marcus will make Krugman's internet==fax machine look like a good prediction

by blindriver

0 subcomment

This entire take is nonsense.
I just used ChatGPT to diagnose a very serious but ultimately not-dangerous health situation last week and it was perfect. It literally guided me perfectly without making me panic and helped me understand what was going on.
We use ChatGPT at work to do things that we have literally laid people off for, because we don't need them anymore. This included fixing bugs at a level that is at least E5/senior software engineer. Sometimes it does something really bad but it definitely saves times and helps avoid adding headcount.
Generative AI is years beyond what I would have expected even 1 year ago. This guy doesn't know what he's talking about, he's just picking and choosing one-off articles that make it seem like it's supporting his points.

by w4yai

0 subcomment

[flagged]

by sublinear

0 subcomment

All this AI discussion has done is reveal how naive some people are.
You're not losing your job unless you work on trivial codebases. There's a very clear pattern what those are: startups, greenfield, games, junk apps, mindless busywork that probably has an existing better tool on github, etc. Basically anything that doesn't have any concrete business requirements or legal liability.
This isn't to say those codebases will always be trivial, but good luck cleaning that up or facing the reality of having to rewrite it properly. At least you have AI to help with boilerplate. Maybe you'll learn to read docs along the way.
The people claiming to be significantly more productive are either novice programmers or optimistic for unexplained reasons they're still trying to figure out. When they want to let us know, most people still won't care because it's not even the good kind of unreasonable that brings innovation.
The only real value in modern LLMs is that natural language processing is a lot better than it used to be.
Are we done now?

by segfaultex

1 subcomments

I wholeheartedly agree. Shitty companies steal art and then put out shitty products that shitty people use to spam us with slop.
The same goes for code as well.
I’ve explored Claude code/antigravity/etc, found them mostly useless, tried a more interactive approach with copilot/local models/ tried less interactive “agents”/etc. it’s largely all slop.
My coworkers who claim they’re shipping at warp speed using generative AI are almost categorically our worst developers by a mile.