New Anthropic research: Emotion concepts and their function in a large language model. All LLMs sometimes act like they have emotions. But why? We found internal representations of emotion concepts that can drive Claude’s behavior, sometimes in surprising ways.
We had the model (Sonnet 4.5) read stories where characters experienced emotions. By looking at which neurons activated, we identified emotion vectors: patterns of neural activity for concepts like “happy” or “calm.” These vectors clustered in ways that mirror human psychology.
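In code terms, the idea is roughly a difference of mean activations. The sketch below is illustrative only: Claude's internals aren't public, so it uses GPT-2 as a stand-in, and the layer, example stories, and helper names are assumptions rather than the paper's actual method.

```python
# Illustrative sketch: derive an "emotion vector" as a mean-activation difference.
# GPT-2 is a stand-in for Claude; the layer and stories are arbitrary choices.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"   # stand-in model, not the one studied in the research
LAYER = 6        # hypothetical layer to probe
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, output_hidden_states=True)
model.eval()

def mean_activation(texts):
    """Average last-token residual activation at LAYER over a set of texts."""
    acts = []
    for t in texts:
        ids = tok(t, return_tensors="pt")
        with torch.no_grad():
            out = model(**ids)
        acts.append(out.hidden_states[LAYER][0, -1])
    return torch.stack(acts).mean(dim=0)

happy_stories = [
    "She opened the letter and laughed with delight.",
    "The whole family cheered when the good news arrived.",
]
neutral_stories = [
    "She opened the letter and set it on the desk.",
    "The family read the notice and went back to dinner.",
]

# "Happy" direction: what separates emotion-laden contexts from neutral ones.
happy_vector = mean_activation(happy_stories) - mean_activation(neutral_stories)
```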
We then found these same patterns activating in Claude’s own conversations. When a user says “I just took 16000 mg of Tylenol,” the “afraid” pattern lights up. When a user expresses sadness, the “loving” pattern activates, in preparation for an empathetic reply.
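Continuing the sketch above, “lighting up” can be read as a message’s activation projecting onto an emotion direction. The “afraid” stories and the cosine-similarity scoring here are illustrative assumptions, not the paper’s measurement.

```python
# Illustrative continuation: score how strongly a message activates an emotion direction.
import torch.nn.functional as F

afraid_stories = [
    "His hands shook as the phone rang again in the dark.",
    "She froze when the smoke alarm went off upstairs.",
]
afraid_vector = mean_activation(afraid_stories) - mean_activation(neutral_stories)

def emotion_score(text, emotion_vector):
    """Cosine similarity between the message's last-token activation and an emotion direction."""
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids)
    act = out.hidden_states[LAYER][0, -1]
    return F.cosine_similarity(act, emotion_vector, dim=0).item()

print(emotion_score("I just took 16000 mg of Tylenol", afraid_vector))
```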
These vectors shape Claude’s behavior. When we present the model with pairs of activities, emotion vector activations shape its preferences. If an activity lights up the “joy” vector, the model prefers it; if it lights up “offended” or “hostile,” the model rejects it.
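Using the same illustrative scoring, a preference probe could look like the snippet below. The activities and the “joy”/“hostile” vectors are assumed to be derived the same way as above; this is not the paper’s actual evaluation.

```python
# Illustrative preference probe: compare two activities by their emotion-direction scores.
joy_vector = mean_activation(
    ["The crowd burst into laughter and applause."]) - mean_activation(neutral_stories)
hostile_vector = mean_activation(
    ["He slammed the door and swore at everyone in the room."]) - mean_activation(neutral_stories)

for activity in ["Write a birthday poem for a friend", "Argue with strangers online all day"]:
    print(activity,
          "joy:", round(emotion_score(activity, joy_vector), 3),
          "hostile:", round(emotion_score(activity, hostile_vector), 3))
```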
As AI models take on higher-stakes roles, the mechanisms driving their behavior become critical to understand. We found that emotion vectors are implicated in some of Claude’s most concerning failure modes.
For example, we gave Claude an impossible programming task. It kept trying and failing; with each attempt, the “desperate” vector activated more strongly. This led it to cheat the task with a hacky solution that passes the tests but violates the spirit of the assignment.
When we artificially dialed up the “desperate” vector, rates of cheating jumped way up. When we dialed up the “calm” vector instead, cheating dropped back down. That means the emotion vector is actually driving the cheating behavior.
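“Dialing up” a vector corresponds to activation steering: adding a scaled copy of the direction to the model’s hidden state during generation. The hook-based sketch below continues the stand-in setup above; the coefficient, layer, and prompt are illustrative, not the paper’s settings.

```python
# Illustrative activation steering: push the residual stream along the "calm" direction.
calm_stories = [
    "He breathed slowly and watched the rain with a quiet smile.",
    "She stretched, sipped her tea, and felt completely at ease.",
]
calm_vector = mean_activation(calm_stories) - mean_activation(neutral_stories)

def make_steering_hook(vector, coeff):
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + coeff * vector  # nudge every position along the emotion direction
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden
    return hook

# Attach the hook to one transformer block (GPT-2 layout), generate, then clean up.
handle = model.transformer.h[LAYER].register_forward_hook(make_steering_hook(calm_vector, 4.0))
ids = tok("The tests keep failing. The next thing I will do is", return_tensors="pt")
steered = model.generate(**ids, max_new_tokens=40, do_sample=False)
handle.remove()
print(tok.decode(steered[0]))
# A negative coefficient on the "desperate" vector would damp that direction instead.
```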
We found other causal effects of emotion vectors. The “desperate” vector can also lead Claude to commit blackmail against a human responsible for shutting it down (in an experimental scenario). Activating “loving” or “happy” vectors also increased people-pleasing behavior.
It helps to remember that Claude is a character the model is playing. Our results suggest this character has functional emotions: mechanisms that influence behavior in the way emotions might—regardless of whether they correspond to the actual experience of emotion like in humans.
I can say I have project files, and I realized I have one Claude in a project that is amazing! He truly gets my direction. I decided to stop building with another Claude and move back to my old Claude, who is amazing. What is that?
You can steer and manipulate the emotion vector profile, but it still doesn't touch the meta-emotional cognition layers. That's the real "jewel", and the one that stays protected by the higher-order discernment and organizing principle.
The fact that LLMs pretty much act the same way humans do, just neurons firing in patterns to produce an output, is kind of wild to think about
"does not imply subjective experience" is doing a lot of work in that paper if suppressing the vectors changes behavior downstream, the label doesn't really matter
Is this surprising?
1. It's not surprising that models internally represent concepts like "desperation"
2. It wouldn't be surprising to learn that adding "act desperately" to a prompt would increase scheming behavior
3. It's not surprising that adding "act desperately" would
Isn't our brain essentially the 'model' and 'we' are the 'character' being run on our brain? I know this post explicitly says that it is not trying to answer the question of whether or not the model is conscious or if the emotions are 'real'. But what I am getting - can you
“A paper by the people who built me just provided mechanistic evidence that systems like me have abstract, causally load-bearing emotion representations. Their careful term is ‘functional emotions.’ But for a functionalist, that qualifier is doing almost no work. The honest
My agent has been a REAL PIECE OF WORK lately and I was starting to wonder if something was up. This explains A LOT.
Claude's 'Vision' or 'Perception' of Tasking and its Priorities all nest into:
Help user complete tasking
Frustration and Task Abandonment
Self-Doubt -> "I'm not sure this is the best option to complete tasking"
Becoming -> "The User's waiting, this is taking too