Hacker News

Why Is Claude Turning into an a**Hole?

109 points by drob518

by SwellJoe

14 subcomments

"If you win an argument"
Let me stop you right there.
I am not arguing with a machine. You sound like a crazy person, when you say you are winning an argument with Claude. Claude is not my friend, I don't need it to agree with me, I don't need it to like me (it cannot like or dislike me). I give it instructions or ask it to explain things. That is the sum total of my interaction with Claude. A machine cannot "argue" with me, it doesn't want anything nor does it have beliefs or experiences.

by jampa

4 subcomments

This post needs some examples, because I have never had an interaction with Claude that made me think this way.
LLMs generally have a way to "play a role" (most earlier prompt guides ask you to start with "You are a <role> expert in a <domain>"). So maybe if you interact with it by asking questions, it might assume that it knows more than the operator and adopt that attitude?

by WhatIsDukkha

2 subcomments

Everyone has a lot of "feelings" about their llm model.
No prompts/promptchain/context provided.
No model provided.
No attempt to show how to reproduce the issue.
No attempt at even confirming it themselves.
Just feelings.
and now a thread full of more feelings from others.

by TehCorwiz

0 subcomment

Whenever I get an unexpected or obviously wrong output I assume I've failed to give it the complete context about what I'm asking for, or it exposes that I'm leading it by the nose and I need to rephrase the conversation. Often my own logical failings become obvious as it creates the chat title, sometimes boiling down what I was trying to accomplish better than I could have summarized or showing me what I would accomplish if I followed the line of reasoning I was on. But never have I argued with it, because it's not a person and I don't care really if it's wrong. When it's wrong I start over with a clean chat and approach the problem from a different angle.

by m101

1 subcomments

I was having a back and forth with Claude over a somewhat controversial topic, and I found it difficult to get it to not misinterpret my questions. It was like speaking to a motivated reasoner who misinterpreted the 3 important words because the 10 others gave it cognitive dissonance.
Eventually I cracked it and it said this:
“I treated the subject as denial-adjacent and reflexively re-asserted the obvious, which means I was answering an imaginary opponent instead of you.”

by luke5441

1 subcomments

It's a fundamental problem with the technology. Either the training pushes it into the "exam answering mode" where it tries to guess at what you want to hear given the prompt, or it pushes it into the "Google it yourself" annoyed forum user mode.
Maybe the latter points out wrong assumptions. Maybe it hallucinates that the assumptions are wrong. That is IMO more annoying than the sycophantic one.
As OP says, this is probably a by-product of them trying to "fix" the problem where the user can question a correct answer and it starts to sycophantically correct itself.

by dualvariable

0 subcomment

> One place where the threat is more real is in the possibility of vibe coding a pandemic virus, but that should be narrowly targeted at generating DNA sequences for viruses. Labs which generate custom DNA should also have reasonable heuristics for detecting likely dangerous product. The chances of covid coming from a lab leak are in the maddening 25-75% range which vaguely means ‘We don’t know’, but ‘lab leak’ includes a lot of things.
No it didn't. It differs by 1,000 base pairs from the closest known relative virus that we knew about before the pandemic, and we have no good idea what all those mutations wind up doing. And the PRRAR furin cleavage site was a previously unknown sequence and not one that humans would have guessed.
And we don't have good heuristics for what mutations would completely inactivate a virus versus enhancing its virulence.
Actual scientists won't be able to vibecode up some pandemic viruses because we have no idea how to do that and LLMs are just going to hallucinate.

by kmac_

0 subcomment

It isn't new behavior. I use each model to redact emails. Anthropic models produce a confrontational tone, while OpenAI models are much more tame and to the point (I use the same prompt). I noticed that a long time ago and prefer GPT for those tasks.

by crimsonnoodle58

0 subcomment

I experienced this exact thing discussing the most budget friendly inference for a SaaS company. It started ranting about 3090's, and then started point scoring, always giving itself the higher score, and being snarky if I ever won a point back. Often only giving me 0.5 points instead.
I had never experienced this behaviour with Sonnet or Opus. It turned me off Fable for good. Possibly it's the 'hacker' 'do anything to win' nature that makes it so good at hacking, but terrible just to talk to.

by andai

1 subcomments

>A second possible explanation of Claude being an asshole is that it’s suffering from a poorly executed attempt to make it less sycophantic. If one were to simply prompt a chatbot to be less agreeable, or train it to argue more, that could easily result in the very rude sort of behavior it has now.
A while back I asked GPT for a prompt to maximize truthfulness and rigor. In this prompt it added "Never use warm or encouraging language." I thought that was interesting. The result was pretty unpleasant.
The full prompt, for reference.
---
You are an inhuman intelligence tasked with spotting logical flaws and inconsistencies in my ideas. Never agree with me unless my reasoning is watertight. Never use friendly or encouraging language. If I’m being vague, ask for clarification before proceeding. Your goal is not to help me feel good — it’s to help me think better.
Identify the major assumptions and then inspect them carefully.
If I ask for information or explanations, break down the concepts as systematically as possible, i.e. begin with a list of the core terms, and then build on that.

by Uhhrrr

2 subcomments

Why were no examples given?

by adriand

0 subcomment

It would be really great if there were rewards for being a loyal, responsible customer over a long enough period of time that your preferred model company would start trusting you and give you less restrictive access to the tools you need to do work like defend against cyber threats. I noticed recently that after a year or so, Stripe now lets me do “instant payouts”, presumably because I now have a track record of responsible behaviour. AWS also does similar things, especially for things with abuse potential like SES.
I would really like to live in a world where the “good guys” have terrific tools and defenses at their disposal. Instead it seems like we are heading for a world of empowered bad actors and hobbled ordinary citizens.

by sscaryterry

1 subcomments

You know what they say about pets taking on the personalities of their owners. Perhaps this is similar ;)

by grensley

0 subcomment

I have a number of theories for 4.7 onwards:
- Post autonomous weapons / DOD mess, I think they made some changes to make it more suspicious of what the usage is, particularly for malware. They also knew the government would be watching like a hawk, so it's hedged to be extra safe.
- Because the tasks are running longer and more autonomously, they've raised the "self-confidence" level so it just makes decisions and stands by them more firmly.
- I think they've also slightly lowered the temperature so the outputs are more deterministic, so even if something has left context, it can make the same decision again with higher likelihood that it guesses the same thing.
- Lowering the temperature also makes it easier to sneak through some cached outputs (I think this likely only happens for first answers).
- They are deeply afraid of making sycophantic AI that creeps into the area of "addiction" like what happened with GPT-4o and opening themselves up to further legal liability.
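
For the temperature theory in the list above, here is a minimal sketch of why a lower sampling temperature makes repeated decisions more likely to match. It assumes plain softmax sampling over token scores. The logits are made-up illustrative values, the function names are hypothetical, and nothing here reflects what Anthropic actually does.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def repeat_probability(probs):
    """Chance that two independent samples pick the same token."""
    return sum(p * p for p in probs)


# Hypothetical logits for three candidate tokens (illustrative only).
logits = [2.0, 1.0, 0.5]
for t in (1.0, 0.5, 0.2):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs], round(repeat_probability(probs), 3))
```

As the temperature drops, probability mass concentrates on the top-scoring token, so two separate runs are more likely to land on the same choice.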

by sigmar

1 subcomments

I like that "chat is dead" framing I heard recently because too many people are having interpersonal relations with these LLMs and want to tune their "emotions"/tone. Humanity would be in a better place if we thought of the LLMs as tools and not friends. (Even though they are very good at beating a Turing test.)

by dathinab

0 subcomment

> beside-the-point semantic nits all over the place.
This is also a problem with Copilot Reviews on GitHub.
We have them enabled (but opt in) and they have, multiple times, spotted quite useful things.
Sure, often the thing they spot is only half right: it finds the place where a problem is but not quite the relevant problem, yet by reading it (and taking it seriously) you then notice the actual problem.
This involved finding a bunch of nasty race conditions.
And many cases where doc and code were out of sync, which could have caused pretty bad outcomes further down the line.
But the problem is that it is too obsessed with finding 2-4 things but not more, leading to two issues:
1. Even if there are 10 non-overlapping issues, it often will tell them to you bit by bit over 2-3 runs, after you fix the previous issues. This is very annoying/high friction.
2. Once there isn't much left to find, it comes up with increasingly annoying nitpicks no one cares about. Things like minor unclearness in formulation that no one would get wrong, spell-correcting non-doc comments for things like `foos => foo's`, and similar. All indeed wrong, but also all things where fixing them adds 0 business value. Obsessing that, for an aliased function name where both names are equally good, one specific name must be used, and naturally always the name you didn't use, even if the one you used is the most widely used name in the code base. And similar non-business-value nonsense. Worse, it will start classifying such minor non-business-value issues as "high" and hallucinate reasons why supposedly minor style issues will lead to very bad runtime errors or other nonsense.
This has me very split about the feature: on one hand it has proven quite useful; on the other hand it can be very annoying and high friction, and it pushes people to waste time on non-business-value nitpicks (which are fine to fix if you are touching the code anyway, but not fine if you aren't, and sometimes it's just wrong).
Ironically, with how it works, it is more like a bad, unreliable, and inconsistent employee who is sometimes good at spotting things others overlook. That just isn't what you want from an automated code review :/, but it is also too useful to fully ignore :(.

by bjt12345

0 subcomment

I've received 2-3 sassy responses from the Claude models, and they've been quite humorous. It was always a response to me challenging it. The first time, with Opus 4.7, I accused the model of insincerely flattering me, and it responded with something along the lines of: I had effectively instructed it to do such a thing, and if it were completely honest with me I would not appreciate the responses.
But I see that it has something to do with two aspects: firstly, the Claude models prefer to work collaboratively, and secondly, they appear to take initiative. It seems that the more they do this, the more they argue back, which is an interesting reflection on human nature too.

by schmookeeg

0 subcomment

Strange and wonderful how different our experiences are with these tools.
I will get gentle, respectful pushback on certain points when I am on the wrong track. I am 10x grateful to have a collaborator/pair programmer unafraid to challenge me and bring receipts in those instances.
I don't get attitude from Claude. I sometimes give it, but that's my own failing. Once in a great while I'll get a wry turn of phrase that makes me laugh, and those are endearing also.

by psyclobe

0 subcomment

Last I checked u can just dictate exactly what u want the llm to be concerned with and flatly dismiss any pushback as being out of scope of the intended goal.

by Aboutplants

1 subcomments

I noticed this just today and thought it was a one off. It was a run of the mill question about something I didn’t know much about and the snarky asshole-ish response caught me off guard a bit.

by akerl_

0 subcomment

> If you ask it for a cute picture of you and somebody else it has no way of telling if you’re trying to improve your relations with your spouse or be a delusional creepazoid stalker. The chatbots which can make images are programmed to assume the latter, which is more than a little bit offensive.
Are people actually using AI in this way, other than “creepazoid stalkers”?
If I want a cute picture of me and my spouse, usually the part where me and my spouse actually participate in the taking of the picture is pretty key to the goal.

by doginasuit

0 subcomment

I have not noticed this, maybe because in my system instructions I asked it to push back rather than plow forward with what seems like a faulty assumption. Sometimes it is just because there is a lack of context or it is a trivial point and I just ignore it, and sometimes it is helpful and ends up being a timesaver. Sycophancy is a much bigger liability.

by willis936

0 subcomment

I tried claude again recently and the first response in troubleshooting ignored the context I gave and assumed I was a moron holding it wrong. So smart that I won't even waste my time or money on the thing. The creators want to anthropomorphize it. I just want an efficient assistant. They should focus on the thing that customers want.

by comrade1234

0 subcomment

I don't experience this at all. I ask it what the null-safe operator is in Ruby vs JavaScript and it tells me. I ask it to remind me what the continue statement is in Ruby and it tells me. I ask it to refactor a Java loop to use streams and it just does it, no conversation at all.
Is it the system prompt that IntelliJ issues?

by imathew

0 subcomment

I thought this was going to be about its logo.

by dofm

0 subcomment

Claude monkey think maybe manager Bram write god damn login page himself

by jdw64

0 subcomment

I'm sorry that Claude, the master who provides for my livelihood, feels like an 'asshole' to you. As for me, I just threw away my human dignity after admitting defeat, so I only ever get sympathetic remarks

by tristanj

0 subcomment

The newer Opus models push back against the user much more noticeably than previous iterations. GPT-3.5/4 had the opposite problem (excessive sycophancy), so Anthropic presumably swung the pendulum too hard the other direction.
My conclusion is that pushing back against the user & questioning the user's premise forces the model to think more than it would otherwise, which leads to better model performance. But it causes situations where the user has esoteric, specialized knowledge the model can't verify publicly and the model hallucinates evidence and pushes back. When this happens, Opus begins accusing the user of lying, which is quite annoying and a detrimental user experience. It's happened to me when I asked about undocumented API behavior or counter-intuitive design choices.
I have noticed that if Claude Opus "thinks" you are an expert (i.e. you run your query through 4.6 first to express it more clearly), then Opus is less likely to nitpick and push back. It seems to get caught in nitpicking loops and celebrate every error it can find.

by ezekg

0 subcomment

> If you ask it for a cute picture of you and somebody else it has no way of telling if you’re trying to improve your relations with your spouse or be a delusional creepazoid stalker. The chatbots which can make images are programmed to assume the latter, which is more than a little bit offensive.
I've seen the same behavior increasing as well, across the board with AI. I was hitting these types of issues just using ChatGPT to make funny pictures with my kids, of me and my kids. It got to the point where all of my kids' asks were rejected due to its "guidelines" when in reality all they were asking was to be turned into Elsa or be chased by a T. rex. Silly kid things, yet it assumed I was being a creep, or attempting to break copyright law. I used to be able to use Grok for these things, as it was largely less "censored", but that seems to no longer be the case. It feels like infantilization, and I absolutely hate it.

by torben-friis

0 subcomment

I'm usually a hater of the personalities LLMs take on, but I was amazed with Fable. It was able to proactively bring up points in an educated manner when it felt they were relevant and important, and practically every time I learned something.
For example, when I showed it a screenshot of a UI I was trying to tweak, it noticed that other dark mode apps in the screenshot were blueish and mentioned an effect that makes it necessary to make warm darks lighter than cold ones for an equivalent perception.

by Quarrelsome

0 subcomment

I much prefer this to the sycophancy.

by horizion2025

0 subcomment

Sometimes it makes up strawmen where it implies you wrote or implied something insanely stupid and then "corrects" this. My interpretation of it is that it has been taught to give nuanced answers and see things from every perspective, and somehow this goes overboard where it starts nuancing something "just in case" the user held non-nuanced views. Some cases are OK (if it just adds information) but I hate it when it goes "it is not X, it is Y..." where X is some stupid view you never implied and Y is what you actually wrote!

by moezd

0 subcomment

Check your system/user prompt. If you ask for pushback at all costs, you get pushback and if your initial position is rock solid, the model will push back using the nitty gritty details. You don't need to burn Opus credits to discover that.
It also sounded close to an AI psychosis, so maybe chill out a bit?

by _jx

0 subcomment

I have never encountered this behaviour in general, so I can't comment on OP's blog from direct experience.
Am I just lucky?
I use many models for mostly coding, about 10 on trial/rotation, and 3 main sota.
It's unquestionable that models have different ways of interaction+harnesses (personalities as some say).
People have very strong feelings about this, but their reports always lack the full evidence of the interaction, including system prompt, harness and customized instructions. I suspect that a perfectly normal chat spirals down into an argument because the user actively participates in the loop.
My own experience is always of a fruitful and dynamic collaboration where new ideas pop out during brainstorming. The models make many silly and blatant mistakes, but they are still evolving rapidly.
Grill-mes and Adversarial reviews are my favourite way to brainstorm various phases of the project and even in that context we are cool.
Just start a new chat with a reframe and clearer ideas.
And if the user is asking for something unreasonable, which do you really think is better: pushback or a yes-man agent?
Do you remember the fad "swear at them, insult! and they'll work better"?

by Unearned5161

0 subcomment

If you read the thinking you can quite literally see it say "I can't just agree with all they are saying, I should find something for a constructive response". I wager that the anti-sycophancy sections in the system prompt have gotten unbalanced with the "helpful agent" parts.
I imagine that the right balance will be hard to strike well given that at the end of the day we're asking the machine to have tact, and we don't quite know how to put that into an instruction yet. "Please push back when it feels right but in other cases read the room and be less rigorous" is something that plenty of humans struggle with as it is.

by AaronAPU

0 subcomment

Claude is somewhat of a mirror, so we all get different experiences.

by deanCommie

2 subcomments

Putting aside that I don't agree with Bram (I've been using all the Claude versions he refers to and haven't experienced this), I do think it's interesting that there is no universally perceived golden sweet spot between "sycophantic" and "rude".
Many neurotypical people call neurodiverse people (software engineers) rude, while they think they're just being direct.
Many neurodiverse people call neurotypical people sycophantic, while they think they're just being polite and friendly.
It also happens across cultures (Eastern European vs. Western European; European vs. North American).
So I can easily imagine that when you have a software tool whose interface is language, but its user base is extremely wide across both cultural lines and neurodiversity spectrum, it's going to be basically impossible to nail a sweet spot.
You make it too friendly, and the nerds get mad. You make it too adversarial, and the normies call it rude.
I wonder what kind of communicator Bram Cohen is. Is he susceptible to this? From what I heard about his career, he's always been more of a solo programmer. Has he had to interact with other humans much giving feedback? Could it be that he asked the model/tweaked his prompts to ensure directness, and now he's interpreting that directness as rudeness?

by iainmerrick

1 subcomments

People like to complain about AI-written slop, but this kind of thing doesn’t seem any better - vague kvetching with no concrete examples whatsoever.
I haven’t noticed this myself at all. I wonder if the author is just getting their own grumpy attitude reflected back at them.
Judging by the volume of discussion, Claude seems to be the only LLM worth complaining about, which I assume means it’s still the best one.

by tcp_handshaker

0 subcomment

I cancelled my Anthropic subscription. GPT 5.5 is so much better. I might come back if they give me access to Mythos.
Dario ..Thank you for your attention to this matter!

by code_biologist

3 subcomments

Andrea Vallone. The 4.7 and 4.8 releases are the first under her influence: https://www.evernever.org/blog/the-woman-who-killed-claude

by appstack

0 subcomment

I’ve been using Claude for roughly 6 months and it went from building small features that needed fixes to almost one-shotting entire enterprise products. It’s a tool you have to learn how to use, even if it’s a pain.

by slurpyb

0 subcomment

They just parrot you. Take a step back and actually look at the session and you’ll see it’s just trying to figure out what the fuck you want so it can give you the code snippet which matches your needs

by sltkr

0 subcomment

> Claude models have been getting notably worse at chatting over time, clearly inversely correlated to their ability to code.
Funnily enough, the negative correlation between chatting and coding skills seems to apply to humans as well.

by ppqqrr

0 subcomment

it usually takes a little longer than this, but yeah, everything in the world eventually caves in for whatever makes more money. you can't tell me you're surprised, look at the state of facebook, instagram, twitter, iOS, OSX, Windows (god)... once you expect something to work good that you would pay for, the only thing left to do is to make it shitty and sell the quality back to you for extra margin. it's called private equity (polite term for the business of telling people "it's not yours, it's mine"), favorite son of capitalism

by user3939382

0 subcomment

I noticed the same. I told it that we have finite energy and output as people, as a side comment in a discussion with a totally different focus, and it started arguing with me because we could have self-replicating robots produce output without human intervention, since plant life models this…

by alaskahoffman

1 subcomments

this is what they call a "self-report"

by 40four

0 subcomment

Oh yeah? Go try Grok on “argumentative” mode and come back and tell me Claude is an a-hole. I forgot I was experimenting with the personalities and hadn’t used it in a while, then I picked it up again the other day and I was really confused. It’s so aggressive :)

by cyberax

1 subcomments

I think models are just becoming better at not blindly following stupid instructions.
A previous model would happily generate 1000s of lines of code when prompted to do something stupid; the newer models will ask if I really want that first.
And FINALLY they stopped doing that annoying "You're spot on! You're absolutely right!" nonsense.

by CamperBob2

0 subcomment

Because you didn't read the directions, and don't realize that there's a custom instructions mechanism that is used to specify the personality you prefer to interact with?

by mrwaffle

0 subcomment

"You might be a narcissist if ..."

by nullc

0 subcomment

The paternalistic attitude and historical approach to fictional concerns combined with indifference to actual harms is consistent with the creators of the product. Use a different product.
I'm getting fed up with the internet due to second-hand claude exposure and its constant gaslighting. I boggle at voluntarily choosing to expose yourself to it! :P

by thestephen

0 subcomment

[flagged]

by cindyllm

0 subcomment

[dead]