But one thing that has scared me the most, is the trust of LLMs output to the general society. I believe that for software engineers it's really easy to see if it's being useful or not -- We can just run the code and see if the output is what we expected, if not, iterate it, and continue. There's still a professional looking to what it produces.
On the contrary, for more day-to-day usage of the general pubic, is getting really scary. I've had multiple members of my family using AI to ask for medical advice, life advice, and stuff were I still see hallucinations daily, but at the same time they're so convincing that it's hard for them not to trust them.
I still have seen fake quotes, fake investigations, fake news being spreaded by LLMs that have affected decisions (maybe, not as crucials yet but time will tell) and that's a danger that most software engineers just gross over.
Accountability is a big asterisk that everyone seems to ignore
I'm not a fan of this phrasing. Use of the terms "resistance" and "skeptics" implies they were wrong. It's important we don't engage in revisionist history that allows people in the future to say "Look at the irrational fear programmers had of AI, which turned out to be wrong!" The change occurred because LLMs are useful for programming in 2025 and the earliest versions weren't for most programmers. It was the technology that changed.
> The fundamental challenge in AI for the next 20 years is avoiding extinction.
Seem to be almost absurd without further, concrete justification.
LLMs are still quite useful, I'm glad they exist and honestly am still surprised more people don't use them in software. Last year I was very optimistic that LLMs would entirely change how we write software by making use of them as a fundamental part of our programming tool kit (in a similar way that ML fundamentally changed the options available to programmers for solving problems). Instead we've just come up with more expensive ways to extend the chat metaphor (the current generation of "agents" is disappointingly far from the original intent of agents in AI/CS).
The thing I am increasingly confused about is why so many people continue to need LLMs to be more than they obviously are. I get why crypto boosters exist, if I have 100 BTC, I have a very clear interest getting others to believe that they are valuable. But with "AI", I don't quite get, for the non-VS/founder, why it matters that people start foaming out the mouth over AI rather than just using it for the things it's good at.
Though I have some growing sense that this need is related to another trend I've personally started with witness: AI psychosis is very real. I personally know an increasing number of people who are spiraling into an LLM induced hallucinated world. The most shocking was someone talking about how losing human relationships is inevitable because most people can't keep up with those enhanced by AI acceleration. On the softer end I know more and more people who quietly confess how much they let AI work as a perpetual therapist, guiding their every decision (which is more than most people would let a human therapist guide there directions).
This makes me think: I wonder if Goodhart's law[1] may apply here. I wonder if, for instance, optimizing for speed may produce code that is faster but harder to understand and extend. Should we care or would it be ok for AI to produce code that passes all tests and is faster? Would the AI become good at creating explanations for humans as a side effect?
And if Goodhard's law doesn't apply, why is it? Is it because we're only doing RLVR fine-tuning on the last layers of the network so all the generality of the pre-training is not lost? And if this is the case, could this be a limitation in not being able to be creative enough to come up with move 37?
I'm not super up-to-date on all that's happening in AI-land, but in this quote I can find something that most techno-enthusiast seem to have decided to ignore: no, code is not free. There are immense resources (energy, water, materials) that go into these data centers in order to produce this "free" code. And the material consequences are terribly damaging to thousands of people. With the further construction of data centers to feed this free video coding style, we're further destroying parts of the world. Well done, AGI loverboys.
This one is bizarre, if true (I'm not convinced it is).
The entire purpose of the attention mechanism in the transformer architecture is to build this representation, in many layers (conceptually: in many layers of abstraction).
> 2. NOT have any representation about what they were going to say.
The only place for this to go is in the model weights. More parameters means "more places to remember things", so clearly that's at least a representation.
Again: who was pushing this belief? Presumably not researchers, these are fundamental properties of the transformer architecture. To the best of my knowledge, they are not disputed.
> I believe [...] it is not impossible they get us to AGI even without fundamentally new paradigms appearing.
Same, at least for the OpenAI AGI definition: "An AI system that is at least as intelligent as a normal human, and is able to do any economically valuable work."
sorry, I say it's folding the laundry. with an aging population, that's the most, if not only, useful thing.
Could not agree more. I myself started 2025 being very skeptical, and finished it very convinced about the usefulness of LLMs for programming. I have also seen multiple colleagues and friends go through the same change of appreciation.
I noticed that for certain task, our productivity can be multiplied by 2 to 4. So hence comes my doubts: are we going to be too many developers / software engineers ? What will happen for the rests of us ?
I assume that other fields (other than software-related) should also benefits from the same productivity boosts. I wonder if our society is ready to accept that people should work less. I think the more likely continuation is that companies will either hire less, or fire more, instead of accepting to pay the same for less hours of human-work.
I’m actually curious about this and would love pointers to the folks working in this area. My impression from working with LLMs is there’s definitely a “there” there with regards to intelligence - I find the work showing symbolic representation in the structure of the networks compelling - but the overall behavior of the model seems to lack a certain je ne sais quoi that makes me dubious that they can “cross the divide,” as it were. I’d love to hear from more people that, well, sais quoi, or at least have theories.
It's interesting that Terrence Tao just released his own blog post stating that they're best viewed as stochastic generators. True he's not an AI researcher, but it does sound like he's using AI frequently with some success.
"viewing the current generation of such tools primarily as a stochastic generator of sometimes clever - and often useful - thoughts and outputs may be a more productive perspective when trying to use them to solve difficult problems" [0].
Super skeptical of this claim. Yes, if I have some toy poorly optimized python example or maybe a sorting algorithm in ASM, but this won’t work in any non-trivial case. My intuition is that the LLM will spin its wheels at a local minimum the performance of which is overdetermined by millions of black-box optimizations in the interpreter or compiler signal from which is not fed back to the LLM.
Man, Antirez and I walk in very different circles! I still feel like LLMs fall over backwards once you give them an 'unusual' or 'rare' task that isn't likely to be presented in the training data.
> For years, despite functional evidence and scientific hints accumulating, certain AI researchers continued to claim LLMs were stochastic parrots
> In 2025 finally almost everybody stopped saying so.
There is still no evidence that LLMs are anything beyond "stochastic parrots". There is no proof of any "understanding". This is seeing faces in clouds.
> I believe improvements to RL applied to LLMs will be the next big thing in AI.
With what proof or evidence? Gut feeling?
> Programmers resistance to AI assisted programming has lowered considerably.
Evidence is the opposite, most developers do not trust it. https://survey.stackoverflow.co/2025/ai#2-accuracy-of-ai-too...
> It is likely that AGI can be reached independently with many radically different architectures.
There continues to be no evidence beyond "hope" that AGI is even possible, yet alone that Transformer models are the path there.
> The fundamental challenge in AI for the next 20 years is avoiding extinction.
Again, nothing more than a gut feeling. Much like all the other AI hype posts this is nothing more than "well LLMs sure are impressive, people say they're not, but I think they're wrong and we will make a machine god any day now".
That's a weird thing to end on. Surely it's worth more than one sentence if you're serious about it? As it stands, it feels a bit like the fearmongering Big Tech CEOs use to drive up the AI stocks.
If AI is really that powerful and I should care about it, I'd rather hear about it without the scare tactics.
Here we go again. Statements with the single source in the head of the speaker. And it’s also not true. The llms still produce bad/irrelevant code at such rate that you can spend more time prompting than doing things yourself.
I’m tired of this overestimation of llms.
> The fundamental challenge in AI for the next 20 years is avoiding extinction.
Yup, this will absolutely be a big driver of gains in AI for coding in the near future. We actually built a benchmark based on this exact principle: https://algotune.io/
Around the world people ask an LLM and get a response.
Just grouping and analysing these questions and solving them once centrally and then making the solution available again is huge.
Linearly solving the most asked questions and then the next one then the next will make, whatever system is behind it, smarter every day.
But did any AI researchers actually claim there was no representation of meaning? I thought generally, the criticism of LLMs was that while they do abstract from their corpus - ie, you can regard them as having a representation of "meaning" - it's tightly and inextricably tied to the surface level representation, it isn't grounded in models of the external world, and LLMs have poor ability to transfer that knowledge to other surface encodings.
I don't know who the "certain AI researchers" are supposed to be. But the "stochastic parrot" paper by Bender et al [1] says:
> Text generated by an LM is not grounded in communicative intent, any model of the world, or any model of the reader’s state of mind.
That's a very different objection to the one antirez describes - I think he's erecting a straw man. But I'd be happy to be corrected by anyone more familiar with the research.
This reminded me of the Don’t look up movie where they basically gambled with the humans extinction.
> Chain of thought is now a fundamental way to improve LLM output.
That kinda proves _that LLMs back then were pretty much stochastic parrots indeed_, and the skeptics were right at the time. Today, enthusiasts agree with what they previously said: without CoT, the AI feels underwhelming, repetitive and dumb and it's obvious that something more was needed.
Just search past discussions about it, people were saying the problem would be solved with "larger models" (just repeating marketing stuff) and were oblivious to the possibility of other kinds of innovations.
> The fundamental challenge in AI for the next 20 years is avoiding extinction.
That is a low level sick burn on whoever believes AI will be economically viable short-term. And I have to agree.
They are cool new tools use them where you can but there is a ton of research still left to do. Just lols at the hubris silicon valley will make something so smart it extincts humankind. It'll happen from the lack of water and heated planet first :)
The stocastic parrot argument is still debated but more nuanced than before. Although the original author still stands by the statement. Evidence of internal planning per model. Anthropic Attribution Graphs Research with some rhyming did support it but gemma didn't.
The idea of "understanding" is still up for debate as well. Sure, when models are directly trained on data there is representation. Othello-GPT Studies was one way to support but that was during training so some interal representation was created. Out of distribution task will collapse to confabulation. Apple's GSM-Symbolic Research seems to support that.
Chain of thought is a helpful tool but is untrustworthy at best. Anthropic themselves have showed this https://www.anthropic.com/research/reasoning-models-dont-say...
personally, as someone building on top of gen AI for a living, i finally bit the bullet on building using LLMs. it did reduce friction in things i don't like doing and did not explore as much. by acting as a catalyst when i needed to finally address them, it helped me get going and eventually become proficient in the core tech itself.
outside of work, however, i find people around me use the services much more than i do. sometimes it felt like the "big data is like teenage sex"[1], but some aspects were quite genuine. got better appreciation after trying them to better understand other people's perspective and to design better.
with "slop" as word of the year and people wondering if a random clip is AI, now more than ever the effects in general life seems apparent. it is not as sexy as "i will lose my job soon", but the effects are here and now. while the next year will be even more interesting, i can't wait for the bubble to burst.
It is easy to see that LLMs exclusively parrot by asking them about current political topics [1], because they cannot plagiarize settled history from Wikipedia and Britannica.
But of course there also is the equivalence between LLMs and Markov chains. As far as I can see, it does not rely on absurd equivalences like encoding all possible output states in an infinite Markov chain:
https://arxiv.org/abs/2410.02724
Then there is stochastic parrot research:
https://arxiv.org/abs/2502.08946
"The stochastic parrot phenomenon is present in LLMs, as they fail on our grid task but can describe and recognize the same concepts well in natural language."
As said above, this is obvious to anyone who has interacted with LLMs. Most researchers know what is expected of them if they want to get funding and will not research the obvious too deeply.
[1] They have Internet access of course.
If Antirez has never gotten an LLM to perform an absolutely embarrassing mistake, he must be very lucky or we should stop listening to him.
Programmers' resistance has not weakened. Since the ORCL drop of 40% anti-LLM opinions are censored and downvoted here. Many people have given up, and we always get articles from the same LLM influencers.
So nice to see people who think about this seriously converge on this. Yes. Creating something smarter than you was always going to be a sketchy prospect.
All of the folks insisting it just couldn't happen or ... well, there have just been so many objections. The goalposts have walked from one side of the field to the other, and then left the stadium, went on a trip to Europe, got lost in a beautiful little village in Norway, and decided to move there.
All this time though, the prospect of instantiating a something smarter than you (and yes, it will be smarter than you even if it's at human level because of electronic speeds...) This whole idea is just cursed and we should not do the thing.