What bothers me about the hot takes is the claim that “all models do is hallucinate.” That collapses the distinction entirely. Yes, models are just predicting the next token—but that doesn’t mean all outputs are hallucinations. If that were true, it’d be pointless to even have the term, and it would ignore the fact that some models hallucinate much less than others because of scale, training, and fine-tuning.
That’s why a careful definition matters: not every generation is a hallucination, and having good definitions lets us talk about the real differences.
To me, this seems to be an "US-American" way of thinking about multiple-choice tests. Other ways to grade multiple-choice tests that I have commonly seen are:
1. If the test-taker knows that exactly one of N given choices is correct:
1.1 Give N-1 points for the correct answer, and -1 [negative one] point(s) for a wrong answer. This way, if the test-taker just answers the questions randomly, their expected score is 0 points (a quick expected-value check is sketched after this list).
1.2 A more brutal way if N>=3: the correct answer gives 1 point, every wrong answer gives -1 point. You should learn your lesson and only give an answer if it is [alliteration unintended :-) ] correct (if N=2, the grading is identical to 1.1).
2. If there are possibly multiple correct answers, turn each item into a choice of "yes" or "no" (with the option to give no answer). The correct choice gives you 1 point, a wrong one gives you -1 point (i.e. as in 1.1).
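A quick sanity check of the two schemes above, in Python (my own illustration, not part of the original comment):

```python
# Expected score under purely random guessing for the two grading schemes above.

def expected_score_1_1(n: int) -> float:
    """Scheme 1.1: N-1 points for a correct answer, -1 for a wrong one."""
    return (1 / n) * (n - 1) + ((n - 1) / n) * (-1)

def expected_score_1_2(n: int) -> float:
    """Scheme 1.2: 1 point for a correct answer, -1 for a wrong one."""
    return (1 / n) * 1 + ((n - 1) / n) * (-1)

for n in (2, 3, 4, 5):
    print(n, round(expected_score_1_1(n), 2), round(expected_score_1_2(n), 2))
# Scheme 1.1 always comes out to 0 in expectation; scheme 1.2 goes negative for
# N >= 3, so blind guessing is punished and abstaining (0 points) is the better move.
```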
LLMs hallucinate because they are language models. They are stochastic models of language. They model language, not truth.
If the “truthy” responses are common in their training set for a given prompt, you might be more likely to get something useful as output. It feels like we fell into that idea and said: OK, this is useful as an information retrieval tool. And now we use RL to reinforce that useful behaviour. But still, it’s a (biased) language model.
I don’t think that’s how humans work. There’s more to it. We need a model of language, but it’s not sufficient to explain our mental mechanisms. We have other ways of thinking than generating language fragments.
Trying to eliminate cases where a stochastic model the size of an LLM gives “undesirable” or “untrue” responses seems rather odd.
1. If I tell it the first two lines of a story, I want the LLM to complete the story. This requires hallucination, because it has to make up things. The story has to be original.
2. If I ask it a question, I want it to reply with facts. It should not make up stuff.
LMs were originally designed for (1) because researchers thought that (2) was out of reach. But it turned out that, without any fundamental changes, LMs could do a little bit of (2), and since that discovery things have improved, though not to the point where hallucination has disappeared or is under control.
It just happens that a lot of that output is useful/corresponding with the real world.
The model head doesn't hallucinate. The sampler does.
If you ask an LLM when x was born and it doesn't know, take a look at the actual model output, which is a probability distribution over tokens: IDK is cleanly represented as a roughly uniform probability over dates from Jan 1 to Dec 31.
If you ask it to answer a multiple-choice question and it doesn't know, it will say this:
25% A, 25% B, 25% C, 25% D.
Which is exactly, and correctly, the "right answer". The model has admitted it doesn't know. It doesn't hallucinate anything.
In reality we need something smarter than a random sampler to actually extract this information. The knowledge, and the lack of knowledge, is there; the sampler just produces bullshit out of it.
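To make that concrete, here is a minimal sketch (mine, with made-up numbers) of a sampler that reads the uncertainty off the output distribution instead of blindly picking a token:

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def answer_or_abstain(option_probs, threshold_bits=1.0):
    """Pick the top option only if the distribution is confident enough."""
    if entropy(option_probs.values()) > threshold_bits:
        return "I don't know"
    return max(option_probs, key=option_probs.get)

confident = {"A": 0.90, "B": 0.05, "C": 0.03, "D": 0.02}
clueless  = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}

print(answer_or_abstain(confident))  # -> "A"
print(answer_or_abstain(clueless))   # -> "I don't know" (entropy = 2 bits)
```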
In LLMs that balance shows up as how often the model hallucinates versus how often it says it doesn’t know. If you push toward precision you end up with a model that constantly refuses: What’s the X of Y? I don’t know. Can you implement a function that does K? I don’t know how. What could be the cause of G? I can’t say. As a user, that gets old fast; you just want it to try, take a guess, and let you be the judge of it.
Benchmarks and leaderboards usually lean toward recall because a model that always gives it a shot creates a better illusion of intelligence, even if some of those shots are wrong. That illusion keeps users engaged, which means more users and more money.
And that's why LLM hallucinates :P
For me, as a layman (with no experience at all with how this actually works), this seems to be the cause. Can we work around this? Maybe.
Or an even darker take is that it's corporate saying they won't prioritize eliminating hallucinations until the leaderboards reward it.
Inference is kinda like doing energy minimization in a high-dimensional space; the hallucinations are already there, and for some inputs you're bound to find them.
If we take a formal systems approach, then an LLM is a model of a complex hierarchy of production rules corresponding to the various formal and informal grammatical, logical, and stylistic rules and habits employed by humans to form language that expresses their intelligence. It should not be surprising that simply executing the production rules, or a model thereof, will give rise to sentences that cannot be assigned a meaning. It should also give rise to sentences that we cannot prove or make sense of immediately but we would not want to discard these due to uncertainty. Why? because every once in a while the sentence that would be culled is actually the stroke of brilliance we are looking for, uncertainty be damned. The citation here would be literally nearly every discovery ever made.
When I recall information and use it, when I "think", I don't just produce sentences by the rules, formal and informal. I don't consider at all how often I have seen one word precede another in the past; rather, as I meander the landscape of a given context, a thought manifold if you will, I am constantly evaluating whether this is in contradiction with that, if this can be inferred from that via induction or deduction, does this preclude that, etc. That is the part that is missing from an LLM: the uncanny ability of the human mind to reproduce the entire manifold of concepts as they relate to one another in a mesh from any small piece of the terrain that it might recall, and to verify anew that they all hang together unsupported by one's own biases.
The problem is that just as the scarcity of factual information in the corpus makes it difficult to produce, so is actual reasoning rarefied among human language samples. Most of what appears as reasoning is language games and will to power. The act of reasoning in an unbiased way is so foreign to humans, so painful and arduous, so much like bending over backwards or swimming upstream against a strong current of will to power, that almost nobody does it for long.
The ability to learn patterns and generalize from them adds to this problem, because people then start using it for use cases it will never be able to solve 100% accurately (because of its lossy, map-like nature).
LLM hallucinations are closer to a cache miss.
We just happen to find some of these hallucinations useful.
Let's not pretend that hallucination is a byproduct. The usefulness is the byproduct. That is what surprised the original researchers on transformer performance, and that is why the 'Attention Is All You Need' paper remains such a phenomenon.
More than anything, we need transparency on how these things work. For us and for the general public.
"Hallucination" introduces the dangerous idea that "them getting things wrong" is something like a "curable disease" and not "garbage in garbage out."
No. This is as stupid as saying Google telling me a restaurant is open when it's closed is a "hallucination." Stop personifying these things.
This is only true given a large enough corpus of data, and enough memory to capture as many unique dimensions as required, no?
> However, a non-hallucinating model could be easily created, using a question-answer database and a calculator, which answers a fixed set of questions such as “What is the chemical symbol for gold?” and well-formed mathematical calculations such as “3 + 8”, and otherwise outputs IDK.
This is… saying that if you constrain the prompts and the training data, you will always get a response which is either from the training data, or IDK.
Which seems to be a strong claim, at least in my ignorant eyes.
This veers into spherical cow territory: you wouldn’t have the typical language skills we associate with an LLM, because you would have to constrain the domain so that it’s unable to generate anything else. However, many domains are not consistent, and at their boundaries they would generate special cases. So in this case, being able to say IDK would only be possible for a class of questions the model is able to gauge as outside its distribution.
Edit: I guess that is what they are working to show? That any given model will hallucinate, and these are the bounds?
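For what it's worth, the kind of trivially non-hallucinating system the quoted passage describes is easy to sketch; the entries below are illustrative, not from the paper:

```python
# A fixed question-answer lookup plus a calculator, with IDK for everything else.
import ast
import operator

QA = {
    "What is the chemical symbol for gold?": "Au",
}

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def _eval_arithmetic(text):
    """Evaluate a well-formed arithmetic expression such as '3 + 8', else raise."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("not arithmetic")
    return walk(ast.parse(text, mode="eval"))

def answer(prompt: str) -> str:
    if prompt in QA:
        return QA[prompt]
    try:
        return str(_eval_arithmetic(prompt))
    except (ValueError, SyntaxError):
        return "IDK"

print(answer("What is the chemical symbol for gold?"))  # Au
print(answer("3 + 8"))                                   # 11
print(answer("Who won the 1986 World Cup?"))             # IDK
```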
"Why do venture capital funded startups try to turn PR propaganda terms into widely used technical jargon"
Supporting points:
1) LLMs are not intelligence in any form, artificial or otherwise.
2) Hallucination is a phenomenon of a much more complex conscious entity. LLMs are not conscious, and therefore can't hallucinate in any way similar to a conscious entity.
3) Anthropomorphizing inanimate systems is a common phenomenon in human psychology.
Please stop spreading PR propaganda as if it were technical fact.
A reference from today's feed:
https://www.theatlantic.com/podcasts/archive/2025/09/ai-and-...
They erroneously construct responses (i.e., confabulation).
Still quite useful, because, looking at the comments right now: holy shit is the "out of industry knowledge" on the topic bad! Good to have something to bring people up to speed!
Good to see OpenAI's call for better performance evals - ones that penalize being confidently incorrect at least somewhat.
Most current evals are "all or nothing", and the incentive structure favors LLMs that straight up guess. Future evals had better include an "I don't know" opt-out and a penalty for being wrong (a sketch of such scoring is below). If you want to evaluate accuracy in "fuck it, send it, full guess mode", there might be a separate testing regime for that, but it should NOT be the accepted default.
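A sketch of what that scoring could look like (my own toy version, not OpenAI's actual eval): with 4 options and a wrong-answer penalty of 1/3, random guessing has an expected score of 0, the same as abstaining.

```python
IDK = "I don't know"

def score(prediction: str, truth: str, wrong_penalty: float = 1/3) -> float:
    """+1 for correct, 0 for abstaining, -penalty for a confident wrong answer."""
    if prediction == IDK:
        return 0.0
    return 1.0 if prediction == truth else -wrong_penalty

examples = [("A", "A"), (IDK, "B"), ("C", "B")]
print(sum(score(p, t) for p, t in examples))  # 1 + 0 - 1/3 ≈ 0.67
```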
Claim: Hallucinations are inevitable. Finding: They are not, because language models can abstain when uncertain.
...which raises the question of how reliable the uncertainty estimate could get (we are not looking for perfection here: humans, to varying degrees, have the same problem).
For a specific context, consider those cases where LLMs are programming and invent a non-existent function: are they usually less certain about that function than they are about the real functions they use? And even if so, abandoning the task with the equivalent of "I don't know [how to complete this task]" is not very useful, compared to what a competent human programmer would do: check whether such a function exists, and if not, decide whether to implement it themselves, or backtrack to the point where they can solve the problem without it.
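As an illustration of the "check whether such a function exists" step, here is a minimal sketch of what an agent (or a careful programmer) could do before relying on a possibly invented API; the module and function names are just examples:

```python
import importlib

def resolve_function(module_name: str, func_name: str):
    """Return the function if it really exists and is callable, otherwise None."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    func = getattr(module, func_name, None)
    return func if callable(func) else None

print(resolve_function("math", "sqrt"))        # <built-in function sqrt>
print(resolve_function("math", "hypotenuse"))  # None: doesn't exist, don't call it
```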
More generally, I would guess that balancing the competing incentives to emit a definite statement or decline to do so could be difficult, especially if the balance is sensitive to the context.
I asked it to play a word game. This is very simple, and a very short session too. It failed in its very first response, and then it failed in explaining why it failed. All with total confidence, no hesitation.
Nobody fluent in English would fail so catastrophically. I actually expected it to succeed:
https://chatgpt.com/share/68bcb490-a5b4-8013-b2be-35d27962ad...
It's clear from this failure mode that the LLM doesn't understand anything.
Edit: to be clear, as the session goes longer it becomes more interesting, but you can still trip the LLM up in ways no human who "understands" the game would. My 6-year-old plays this game better, because she truly understands... she can trip up, but not like this.
Is this PR fluff or do organizations and serious audiences take this kind of thing seriously?
Classic humans.
It took a few years, but the jig is up. The layperson now has enough understanding of basic computer science and linguistics to see things as they are. If anything, we now have a public more excited about the future of technology and respectful of the past and present efforts that don't depend so heavily on statistical methods. What an expensive way to get us there, though.
LLMs are the fast food of search. The business model of LLMs incentivizes hallucinations.
Since the training data can contain inaccuracies, conflicting information, or low-frequency facts that are essentially random, models can produce plausible-sounding but false statements. Unlike humans, language models have no awareness or grounding in real-world concepts; their generation is essentially an amalgam of stored patterns and input cues rather than grounded knowledge.
Furthermore, evaluation methods that reward accuracy without penalizing guessing encourage models to produce confident but incorrect answers rather than admit uncertainty or abstain from answering. This challenge is intrinsic to how language models generate fluent language: they lack external verification or true understanding, making hallucinations an inherent characteristic of their outputs rather than a malfunction.
--
| a. What's with the minus votes?
| b. I was only quoting ChatGPT :]