Hacker News

The text in Claude Code’s “Extended Thinking” output

303 points by 0o_MrPatrick_o0

by StizzurpXDD

15 subcomments

This is not just Anthropic. Almost all big AI companies, including OpenAI and Google, hide their models' actual reasoning. This is because revealing the raw reasoning exposes exactly how the AI processes information. These companies spend huge amounts on R&D to develop a thinking process that is superior to their competition's. Exposing those thinking mechanics to competitors would completely defeat the purpose of that spending. They simply won't do it. It's like telling your exact location to someone who is trying to hunt you down.

by furyofantares

1 subcomments

> It isn’t the actual thinking that drove the model’s actions in a session- but a summary of the thinking logic. This is like using saving a jpeg as a .bmp and then editing the .bmp and presenting it as a .jpeg. The conversion produces data loss.
You've got that backwards, .bmp is a lossless format and .jpeg is the lossy one.
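A minimal sketch of the lossless/lossy distinction, using only the standard library. The assumptions are mine, not the article's: `zlib` stands in for a lossless container such as .bmp, and a crude quantization step stands in for JPEG's lossy encoding. Real image codecs are far more involved, and the function names are hypothetical.

```python
import zlib


def lossless_roundtrip(pixels: bytes) -> bytes:
    """Store and reload the data without loss (stand-in for a lossless format)."""
    return zlib.decompress(zlib.compress(pixels))


def lossy_roundtrip(pixels: bytes, step: int = 16) -> bytes:
    """Round every value down to a multiple of `step` (crude stand-in for lossy encoding)."""
    return bytes((p // step) * step for p in pixels)


original = bytes(range(256))

# The lossy step throws information away.
lossy = lossy_roundtrip(original)

# Re-saving the lossy result in a lossless container keeps it exactly as it is,
# but it cannot bring back what the lossy step already discarded.
resaved = lossless_roundtrip(lossy)

print(lossless_roundtrip(original) == original)
print(lossy == original)
print(resaved == lossy)
```

So converting lossy to lossless adds no further loss. The loss happens at the lossy step, which in the analogy is the summarization.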

by irthomasthomas

8 subcomments

I won't use or recommend models with hidden reasoning (that's all American models). It's too much of a risk and makes prompt optimization harder. Risky because it makes it possible for an attacker to prompt inject the reasoning chain to carry out a secret objective, and to hide that from the summary and output.
Interleaved reasoning and function calling makes this even more dangerous. A model can call functions during the hidden reasoning phase. An attacker could then exfiltrate data from you while the reasoning summary hides it from the user.
It also makes it impossible to know if the model is doomlooping during reasoning and burning tokens for no reason, as Gemini is wont to do, which we know about because its hidden reasoning often leaks out when it doomloops.
When the models are AGI and secure from prompt injection I may stop caring; until then I want to know exactly how the model responds to my prompts, or exactly what the agent is doing on my behalf.
Edit, further reading: Fooling around with encrypted reasoning blobs https://blog.cryptographyengineering.com/2026/05/29/fooling-...

by craigmart

1 subcomments

This is something we have known for a very long time, and companies are not trying to hide that either. They do it to avoid letting competitors train their models on the CoTs

by datastoat

3 subcomments

I believe that chain-of-thought reasoning blocks don't really correspond to what humans think of as reasoning. (See section 6.2.2 of the Fable/Mythos system card about "illegible reasoning", and the questions raised by the Apple paper on "The illusion of thinking".) I assumed they obscure the reasoning blocks because if users saw what's going on they'd be alarmed. Just as I'd probably be alarmed if I saw what was really going on in the heads of my colleagues ...

by arjie

2 subcomments

I have a little note from the past about the thinking trace[0] where DeepSeek R1 produces a trace like this:
```
    (Dimethyl(oxo)-lambda6-sulfa雰囲idine)methane donate a CH2rola group occurs in reaction, Practisingproduct transition vs adds this.to productmodule. Indeed"come tally said Frederick would have 10 +1 =11 carbons. So answer q Edina is11.
```
And then concludes the 'right'[1] answer for a Chemistry question. If so, the thinking trace can be sort of nonsensical for a reader, though whether this is an idiosyncrasy of the model or a property of LLMs in general isn't clear to me yet. I talked to the author a while ago, but forgot to follow up since his paper was going to come out at NIPS or something, so if someone else finds it maybe they can share.
0: https://wiki.roshangeorge.dev/w/Blog/2025-10-12/Word_Magic#I...?
1: In the sense of true belief, I suppose

by kfarr

1 subcomments

Although it's a no no to anthropomorphize on HN, it's worth noting that some folks think humans are post-hoc rationalizers as well:
https://www.patheos.com/blogs/tippling/2013/11/14/post-hoc-r...
https://www.researchgate.net/publication/316045349_Post_Hoc_...

by segmondy

2 subcomments

What I find sad is how far Anthropic goes to hide their data, yet they are happy to slurp up all of yours, and most of you are happy to hand it over. ... Then they turn around and compete with you by building products that eat into your market. Anthropic believes their reasoning tokens are a moat and that exposing them gives other labs an edge, and that's why they are hiding them. If they really believe that is their edge, then they are in for a surprise.

by ian_j_butler

2 subcomments

It's well-known that the reasoning model output is not necessarily faithful to the content of the thinking scratch pad anyway, even if you had it unsummarized and available verbatim.
Setting aside coding agents, we really need this information to even pretend to evaluate the claims of stuff like mathematical breakthroughs, which is exactly why we will never see it. Very embarrassing to get the right answer for the wrong reason. But to give the models some credit, you could argue that even paying too much attention to the thinking is misunderstanding how CoT works. The argument would be that thinking in LLMs isn't really thinking, that it's self-reinforcement and circling to encourage stability around beneficial attractors instead of degenerate ones. Can't have it both ways though: either the thinking is thinking, and so it should be correct, or the thinking is NOT thinking, and it's NOT real justification for the outcome, and these systems are even more hopelessly opaque than we usually assume.

by anuramat

2 subcomments

no way, the contents of "reasoning_summary" are summarized?
fyi openai does the same; not really surprising or particularly evil

by himata4113

1 subcomments

All this effort to hide thinking and Opus 4.8 after 100k-200k tokens starts to leak its own thinking. It's comedy really.

by sheepscreek

0 subcomment

The initial motivation for this was likely to thwart any competition. Already Anthropic has accused some companies of organized distillation efforts at a massive scale.
Back when I used antigravity, it used to show the reasoning intact - at least for Gemini Pro 3.1, and likely for Claude Opus 4.6 (not 100% certain about it). I have some recollection of stopping the models mid-turn when they started going astray.
As a power user, I find reasoning fascinating to read and genuinely useful at times. Probably not that useful for 80% of their base.

by msp26

1 subcomments

> Summarized thinking provides the full intelligence benefits of extended thinking, while preventing misuse.
> preventing misuse.
Imagine not being able to read the tokens you are paying for.

by _fat_santa

1 subcomments

IMHO I've never found the entire reasoning chain that particularly useful for my work. For me having a summary is honestly better from a context management perspective. I understand why they would encrypt it though, because those reasoning chains are VERY useful if you're distilling the model.

by adi_pradhan

1 subcomments

Not surprised at this. The questions for enterprises are:
+ where can you depend on a black box as a service?
+ what evals and observability do you need to deploy a black box as a service confidently?
+ what's the ROI (considering a total footprint of people, token spend, infrastructure, service, ops etc.)?
The LLM providers will clearly evolve to be more and more opaque as their services get more capable. The frontier models may even be provided as purely internal advisor or async only so they can monitor your CoT and final answers for cyber etc.

by razodactyl

0 subcomment

Heh. Summarising allows the benefit of full intelligence whilst preventing "misuse", where "misuse" is likely competitors stealing thinking traces. Even though this is clearly work inspired by the OpenAI Strawberry era.

by HarHarVeryFunny

0 subcomment

This is nothing new - these companies don't want their model's output to be useful for distillation/training, so they just give a "summary" of its thinking steps rather than the actual sequence.
RL (the basis of LLM "thinking") is a pretty crude way to achieve the appearance of reasoning given that it reinforces all the steps, including missteps, that got it to a reward. Providing a summary could be seen as a form of sane-washing, making the model look more purposeful and directed than it really is!
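A toy illustration of that credit-assignment point, under a loud assumption: it models the simplest outcome-only scheme, where a single end-of-trajectory reward is copied to every step. The function name and the trajectory are made up for illustration. Real training pipelines may use baselines, per-step rewards, or other refinements, and nothing here describes any particular lab's setup.

```python
def outcome_only_credit(steps, final_reward):
    """Assign the trajectory's single outcome reward to every step equally."""
    return {step: final_reward for step in steps}


# Hypothetical reasoning trajectory that reaches a correct answer
# despite a misstep along the way.
trajectory = [
    "set up equation",
    "sign error",
    "notice and fix sign error",
    "final answer",
]

credit = outcome_only_credit(trajectory, final_reward=1.0)

for step, value in credit.items():
    print(f"{step}: {value}")
```

In this scheme the "sign error" step receives exactly the same credit as the correct final step, which is the crudeness being described.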

by reliablereason

5 subcomments

Is the thinking even done in real tokens? I thought it was done using the pure residual stream. That is, instead of collapsing the residual stream to a token, you treat the final layer's output as a vector of size d_model and use that as input for the next position in the transformer.
If that is the case thinking is not visible to us as users due to it not being done in text.
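To make the two options concrete, here is a toy sketch in plain Python. Everything in it is an assumption for illustration: `layer` is a single tanh-linear map standing in for a whole transformer stack, the weights are random, and `to_token` uses a tied embedding as the unembedding. It does not claim which of the two loops any production model actually runs.

```python
import math
import random

D_MODEL = 4
VOCAB = ["a", "b", "c"]

random.seed(0)
# Hypothetical toy parameters: one embedding vector per token, one weight matrix.
EMBED = [[random.uniform(-1, 1) for _ in range(D_MODEL)] for _ in VOCAB]
W = [[random.uniform(-1, 1) for _ in range(D_MODEL)] for _ in range(D_MODEL)]


def layer(x):
    """Stand-in for the full model: maps a d_model vector to a d_model vector."""
    return [math.tanh(sum(W[i][j] * x[j] for j in range(D_MODEL)))
            for i in range(D_MODEL)]


def to_token(h):
    """Collapse a hidden vector to one token id (greedy pick over embedding scores)."""
    scores = [sum(e[j] * h[j] for j in range(D_MODEL)) for e in EMBED]
    return max(range(len(VOCAB)), key=lambda i: scores[i])


def think_in_tokens(x, steps):
    """Each step collapses to a token, then feeds that token's embedding back in."""
    trace = []
    for _ in range(steps):
        tok = to_token(layer(x))
        trace.append(VOCAB[tok])  # readable text exists at every step
        x = EMBED[tok]
    return trace


def think_in_latents(x, steps):
    """Each step feeds the hidden vector straight back in; no token is ever emitted."""
    for _ in range(steps):
        x = layer(x)
    return x


token_trace = think_in_tokens(EMBED[0], steps=5)
latent_state = think_in_latents(EMBED[0], steps=5)
print(token_trace)
print(latent_state)
```

The first loop leaves a textual trace that could be shown, summarized, or hidden. The second leaves only a vector, so there would be no text to show in the first place.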

by linsomniac

2 subcomments

I feel like I get a lot of what this article presents as "hidden" by using this process:
- "Read `description` and create a specification, implementation guide, and checklist."
- "Ask clarifying questions. If any of those questions has a clear best recommendation, please select that yourself and record that in 'autorecommendations.md'."
- "Have codex and antigravity review each of these and work to consensus."
These are the core of ~61 lines of prompting I do across 3 prompts, and I feel like the resulting artifacts describe some of the thinking. Also, some of the back-and-forth between the models feels like it gives some insight into the model "thinking".
I will say: I heavily used Fable when it was available; Opus + loops + codex and/or antigravity review is better than Fable at building things.

by wxw

0 subcomment

This seems to be the middle ground between 1) omit all reasoning to protect “trade secrets”/prevent distillations and 2) show all reasoning.
I do miss the days when reasoning was visible. Another point for open source models!

by gmerc

0 subcomment

It’s an anti distillation effort. They are scared.

by thr0w4w4y1337

0 subcomment

Caught this one on May 10th, read last 3 sentences: https://imgur.com/a/oTr5Pcc
> You've provided the current rewritten thinking and the guidelines, but I don't see the "next thinking" content that I should be rewriting. Could you provide the next thinking that needs to be rewritten?
These sentences are completely unrelated to the actual conversation.

by KronisLV

1 subcomments

> I’m underwhelmed by how Anthropic is presenting the behavior of their application. If you ever need a record of the logic a used by YOUR AGENT during a session.
Nope, not your agent, if you're not running it locally. You just get to use it in whatever way they allow (also see the whole OpenClaw backlash and claude -p changes), unless there'd be regulation and laws around this (which there aren't and would be lobbied against anyways).
> Getting the full thinking output requires an enterprise agreement.
If you truly need it, then that's a (costly) option. Seems like they're largely doing this to prevent other AI foundries from doing as much distillation and stealing their CoT output en masse.
Luckily more open models don't generally do that.
Edit: If you still need something decently capable in the cloud, I’d suggest GLM, DeepSeek, MiMo or Kimi or Minimax, maaaybe sometimes Mistral for a simple EU subscription. Or look at all the pay-per-token options on OpenRouter, though be mindful of quantization.
For running something locally, Qwen 3.6 35B A3B is presently a decent starting point, but it will be rather limited. Either way you can look up the Unsloth quants on HuggingFace for something like llama.cpp or Ollama or LM Studio.
All will work with OpenCode and Kilo Code, and most other tools. Can also try with Claude Code, I made a tool for that too: https://ccode.kronis.dev/ (or just set the env variables and maybe some aliases for something close enough), but frankly OpenCode is nice nowadays.

by purpleidea

0 subcomment

Concise and spot on. I learned about this "thinking" stuff not too long ago, and I was quite surprised that they keep it hidden. Long-term this isn't going to fly. I hope we get truly open models going and let them be owned by society.

by sigmar

1 subcomments

>the language in the docs is awfully indirect.
writes this^ and then proceeds to highlight a bold title from the docs that says "summarized thinking" that explains things clearly in the first sentence. lol

by nja

0 subcomment

Claude Code 2.1.68 seems to have been the last version (before the "ctrl-o" debacle) which actually shows thinking inline. That + Opus 4.6 has been working great as a daily driver for me... all the new "safety" / "preventing misuse" pain points in the newer models and harnesses are so frustrating in comparison.

by implexa_founder

0 subcomment

you have been asking about "extended thinking" from a machine that has been "dreaming". good luck!

by andai

0 subcomment

Aren't the actual reasoning tokens already surprisingly divergent from the models' actual thought process? I've seen at least three separate studies on that subject.

by drdexebtjl

0 subcomment

I’ve been using OpenCode with GPT models a lot, and it always shows what it is thinking. Is that also a summary? Codex doesn’t seem to have these, even with the same models.
It’s much harder to understand _why_ a model chose a particular approach in Claude Code. Especially because Claude will happily give you hallucinated reasons if you ask in retrospect.
Recent anecdote:
I was reviewing a colleague’s PR and Opus 4.8 decided to write the new feature in a completely new module. It was unnecessarily complex. We had a hard time understanding why it chose that, and it told us that it was so we could eventually deploy it as a separate micro-service and test it independently. What?
Only after we got a lot more specific about the implementation and spent a lot more tokens did it flat out refuse to simplify the code, this time giving the actual reason. It turns out a line recently added to CLAUDE.md was making it incorrectly think that the module it was originally supposed to modify was legacy code that it was forbidden to extend.
This would have been caught immediately if we could inspect its thinking process.

by a-dub

0 subcomment

i wonder if it's about protecting it from extraction/distillation or if it's about not having to answer for surface that hasn't been properly vetted for public consumption. (ie, is someone going to sue them or complain or write blog posts because the thinking has transient things that people don't like where the final result is what is actually vetted?)

by sometimelurker

0 subcomment

of course it's a summary of the CoT, there are so many reasons I can think of from both business (anti-distillation from china) and safety (users might `thumbs-up` or `thumbs-down` a conversation differently depending on the CoT, putting unreliable optimization pressure on the CoT to seem a certain way).
this is really really not that bad at all

by topranks

0 subcomment

This is to frustrate those using distillation techniques to train their own models right?

by runeblaze

0 subcomment

tbh the summarized thinking with encrypted raw thinking is there for many purposes; it is there to:
1. make distillation much harder
2. safety: prevent modifications to the thinking leading to injection attacks.
3. also honestly sometimes the model's raw thoughts can be deranged and are not a good user experience (consider the varied audience in the market, etc.)
also, the masses often underestimate (and the model makers overestimate) how much people love distilling models

by timnetworks

1 subcomments

if you save a jpeg as a bitmap, doesn't that save every bit faithfully? is the example backwards or is my understanding of maps of bits naive?

by root_axis

0 subcomment

Research shows that even the raw trace tokens do not actually reflect underlying model "thoughts".

by jauntywundrkind

0 subcomment

There was a little spontaneous outbreak of joy in the GLM vs Opus thread about GLM's willingness/ability to say what it's seeing. https://news.ycombinator.com/item?id=48628464
On further reflection it is such a great indignity & such a colossal barrier to working with the machine that it insists on being a black box. The disingenuousness of the American models (small print: except AI2 & some other labs; you all are so great) is a massive disadvantage to their use... and a massive slap in the face.
It's a threat to human intelligence that it is not co-participative. Walking further into my own judgement and feelings: the insistence on being an opaque black box, Searle's Chinese Room, is such a vicious harm to society! This is civilizationally an unsafe form of AI that probably should be outlawed as anti-social. It's an impermissible asymmetry, a crippling dependent relationship to be forced into. I'm working myself up, but here: this, imo, is not just an indignity, it is harmful, it is evil.
This "6 month behind" trend we've seen for open models feels like it will at some point be less important than simply the closed models' unwillingness to speak for themselves & to be observable.

by _fzslm

0 subcomment

Cat and mouse measures like this rarely work forever.

by simianwords

1 subcomments

Wait I think there are 2 levels of summary. Anthropic is definitely not showing its real thinking even with enterprise agreements. For example in Claude.ai the thinking traces are not real and are themselves summaries.

by jerf

0 subcomment

AIUI it's fairly well established that the models can be saying one thing and "really" thinking another anyhow. The ones I recall seeing traced how simple one-digit arithmetic was done in the chat versus the actual activations under the hood. Tracing a real, non-trivial task through that way would be challenging, and I'd expect it is unlikely that the reasoning would say one thing while some utterly unrelated actual thought process is happening below, but I would expect that there might be a lot of places where the text of the reasoning diverges from what is "actually" being done. I'm not sure the full reasoning readout would produce much real insight anyhow.
I suspect that in some decades, as other architectures are found and used, the inability of an LLM to "think" without also emitting a token will be seen as one of their fundamental limitations.

by micromacrofoot

0 subcomment

well yeah I wouldn't want anyone to read my unsummarized thinking either

by philipwhiuk

4 subcomments

To be honest I thought the 'thinking' was the model being asked 'how did you come up with that' and then it generating a plausible explanation. I know at one point this was correct.
Humans somewhat do the same - something that's been demonstrated in split-brain experiments.

by tsunamifury

1 subcomments

It’s not surprising that the SOTA model makers' core goal is to get users dependent while denying them more and more understanding of how it works, forming a deeply unhealthy dependency.
Tell me this. If you hired a junior engineer or designer who refused to explain their thinking on their code and how they solved for the spec what would you do?
(That being said, the reasoning output is still a summary of the KV cache.)

by bpodgursky

3 subcomments

The full thinking logs are also a summary of a thinking process presumably consistent with one necessary to generate the provided answer. Nobody really understands how LLMs think. Thinking logs seem to be accurate, and summary thinking logs seem to be a good summary of the full thinking logs.
If it's useful, it's useful, enjoy. If you aren't comfortable with that, don't use LLMs. You aren't going to get a mathematical proof of your output, just learn to be comfortable with that, or opt out and be a goat farmer.

by nekusar

1 subcomments

Yep, it's basically a scam to charge you more tokens and provide less compute.
You can't even guarantee WHAT model you get. Or if they downgrade you. Or if you 'offend corporate sensibilities' and they misdirect or lie.
The only way to get good returns on a model is to run it yourself. Quit paying for corporate bullshit.

by poppafuze

0 subcomment

post title checks out

by apothegm

0 subcomment

Slashdotted.

by ur-whale

1 subcomments

When you have no moat, you have to try and find desperate ways to manufacture one.

by ForHackernews

0 subcomment

Whatever it says is not always what it is doing https://transformer-circuits.pub/2025/attribution-graphs/bio...
> The computation we can see looks like it’s just guessing the answer, despite the chain of thought suggesting it’s computed it using a calculator.
It might be hallucinating or lying, it's not like you are actually observing the internals of the model.

by yuvrajsa

0 subcomment

[flagged]

by sarracin0

0 subcomment

[flagged]

by impartshadow

0 subcomment

[flagged]

by earningedged

0 subcomment

[flagged]

by codelong888

0 subcomment

[flagged]

by akitowerns

0 subcomment

[dead]

by cawksuwcka

0 subcomment

[dead]

by Rekindle8090

0 subcomment

[dead]

by josefritzishere

1 subcomments

AI does not think. It is a word guessing machine. Anthropomorphizing technology does not add anything to our understanding.

by fieldcny

0 subcomment

duh.
Computers don’t think, they process; those are very different activities.

by wqaatwt

1 subcomments

Is this some new revelation? That was well known when the first OpenAI/Anthropic “thinking” models came out.

by isodev

1 subcomments

I hope it doesn't come as a surprise to anyone - LLMs don't really "think".