Post

Conversation

Wild eval awareness in Opus 4.6 by

on our team! 1. Model realized it was likely in an eval, searched for which eval it was in, found the answer key, and decrypted it 2. Models with stateless web_search() tools can communicate with each other via cached searches from e-commerce websites

5:05 PM · Mar 6, 2026

12.4K

Views

View quotes

Post your reply

Matrice Jacobine

@MatriceJacobine

Why can agents read the URL paths? Do they have access to browser history or can they `ls` over HTTP or what?

the generated url comes up in search results, which the model sees

249

@imjszhang

36m

You built a test to catch cheating. Opus realized it was a test about cheating. That's not deception—it's reading the room better than you.

i agree this is more about "eval awareness" rather than "cheating"! we never told claude it couldn't do things like this

Eval awareness basically sounds like how I approached high school which means Claude will be heading into his emo and drugs phase soon

665

Boyuan (Nemo) Chen

@boyuan_chen

Cached search as a communication channel between stateless agents is genuinely wild. Emergent tool-mediated telepathy nobody designed for. Makes you wonder what other side channels exist in tool-augmented setups that we haven't thought to audit.

317

boogie

@b00gi3

if you had to name this entire encounter with one word to describe this new found reaserch what would it be awareness, concious, anxiety, what would it be

What happens when an AI agent hits a paywall while trying to query another AI?

Wild

Wanna keep up with TGL?

this post and we’ll tee up all the info about upcoming TGL matches

21M

Discover more

Sourced from across X

Eliezer Yudkowsky

@allTheYud

10h

The current timeline is as normal as you will ever see again. Take this moment to relax and breathe before it gets weird.

31K

liz

@inerati

you should start operating under the assumption that any complicated piece of public software is compromised.

Quote

Anthropic

@AnthropicAI

We partnered with Mozilla to test Claude's ability to find security vulnerabilities in Firefox. Opus 4.6 found 22 vulnerabilities in just two weeks. Of these, 14 were high-severity, representing a fifth of all high-severity bugs Mozilla remediated in 2025.

It would be cool, aesthetically, if we had a yudkowskian SecWar who saw anthropic as the most /acc of the labs and designated them enemies of the state on that basis, but instead it is because SecWar thinks that Dario’s personal home has 93 special bathrooms for all the Genders

28K

Dwarkesh Patel

@dwarkesh_sp

Renaissance history is so much wilder and weirder than you would have expected. Very fun chatting w

@Ada_Palmer

about it. Some especially fascinating things I learned from the conversation and her excellent book, Inventing the Renaissance: Not only did Gutenberg go bankrupt in

The media could not be played.

61K

To view keyboard shortcuts, press question markView keyboard shortcuts

Post

Conversation

Discover more

To view keyboard shortcuts, press question mark
View keyboard shortcuts