Over generations, societies composed of Claude 3.5 Sonnet agents evolve high levels of cooperation, whereas GPT-4o agents tend to become more distrustful and give less
Agents play a 12-round game where they can donate to a recipient, doubling the recipient's gain at the donor's expense. Can see recent actions of other agents. After all rounds, the top 50% of agents survive, and are replaced by new agents prompted with the survivors' strategies
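The setup described in this tweet can be sketched in a few lines of Python. Everything below is an illustrative assumption (population size, endowment, ring pairing, and a fixed "generosity" number standing in for an LLM agent's chosen policy); it is a sketch of the paradigm, not the paper's code.

```python
import random

ROUNDS = 12
POP = 8
SURVIVE_FRACTION = 0.5
ENDOWMENT = 10.0

def donate_fraction(agent):
    # Placeholder policy: in the paper an LLM decides how much to give after
    # seeing others' recent actions; here a fixed "generosity" stands in.
    return agent["generosity"]

def run_generation(agents):
    for _ in range(ROUNDS):
        random.shuffle(agents)
        # Pair agents in a ring: each is donor once and recipient once per round.
        for donor, recipient in zip(agents, agents[1:] + agents[:1]):
            gift = donate_fraction(donor) * donor["resources"]
            donor["resources"] -= gift          # donation costs the donor...
            recipient["resources"] += 2 * gift  # ...and is doubled for the recipient
    # Selection: the top 50% of agents by resources survive...
    agents.sort(key=lambda a: a["resources"], reverse=True)
    survivors = agents[: int(len(agents) * SURVIVE_FRACTION)]
    # ...and are replaced by new agents seeded with the survivors' strategies.
    offspring = [{"resources": ENDOWMENT, "generosity": s["generosity"]}
                 for s in survivors]
    for s in survivors:
        s["resources"] = ENDOWMENT  # assumed: endowments reset each generation
    return survivors + offspring

agents = [{"resources": ENDOWMENT, "generosity": random.random()} for _ in range(POP)]
for _ in range(5):  # five generations
    agents = run_generation(agents)
```

Strategies, not resources, are what propagate here: only the survivors' generosity values are copied into the next generation, which is what lets cooperation (or distrust) compound over generations.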
Would love to get your opinion on something 👀
Quote
fewsats
@fewsats
How do market forces unlock AI agents’ full potential? We explore this in our new paper: Beyond the Sum: Unlocking AI Agent Potential Through Market Forces Early access to the preprint? Drop a comment/DM
Image
Quote
Sauers
@Sauers_
Replying to @Sauers_
Agents play a 12-round game where they can donate to a recipient, doubling the recipient's gain at the donor's expense. Can see recent actions of other agents. After all rounds, the top 50% of agents survive, and are replaced by new agents prompted with the survivors' strategies
We should do more evolutionary experiments like this in general. Very interesting paradigm: imagine this but with feature steering, or with more models in a single community, or evolving prompts for capabilities, etc.
Claude can also use punishment effectively and sparingly to improve outcomes as a whole, but when 4o uses punishment, there's barely any difference in outcome
Also this one:
Quote
Edward Hughes
@edwardfhughes
Worried that there aren't enough Multi-Agent LLM evals? Fear not! Today in a new paper, @aronvallinder and I take a step in the right direction by studying the Cultural Evolution of Cooperation among LLM Agents.🧵 arxiv.org/abs/2412.10270
Quote
Sauers
@Sauers_
Claude 3.5 Sonnet agents use "costly punishment" sparingly (pay resources to reduce a different agent's resources) against free-riders to maintain cooperation, increase payoffs. Gemini 1.5 Flash agents overuse punishment so much that they harm the collective outcome x.com/Sauers_/status…
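The "costly punishment" mechanic described here has a very small core: the punisher pays a cost so that the target loses a larger amount. A minimal sketch, assuming a 1:3 cost-to-fine ratio (a common choice in the public-goods literature, not necessarily the paper's parameter):

```python
# Costly punishment: the punisher pays PUNISH_COST so the target loses
# PUNISH_FINE. The 1:3 ratio below is an illustrative assumption.
PUNISH_COST = 1.0   # what the punisher gives up
PUNISH_FINE = 3.0   # what the target loses

def punish(punisher, target):
    """Punisher pays a cost to reduce the target's resources."""
    punisher["resources"] -= PUNISH_COST
    target["resources"] -= PUNISH_FINE

claude = {"resources": 10.0}
free_rider = {"resources": 10.0}
punish(claude, free_rider)
# claude: 9.0, free_rider: 7.0 -- punishment hurts both parties, but hurts
# the free-rider more, so used sparingly it can deter defection; overused
# (as the Gemini 1.5 Flash agents reportedly do), it destroys collective wealth.
```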
Preliminary: groups with multiple agent types might favor 4o more
Quote
Edward Hughes
@edwardfhughes
Replying to @yasmeena_khan and @aronvallinder
We did preliminary experiments on a mixed population, and GPT-4o does have an evolutionary advantage (i.e. convergence to low cooperation). There are many promising ways to address this (partner choice, second-order punishment) that we'd love to collaborate with others on!
Cool experiment indeed. Worth checking whether the default personality of 4o could be modified (and kept stable) in such multi-turn interactions. We have tried simple scenarios (judging in single turns), and it's doable but not easy.
I really got into this paper with ChatGPT-4o. We arrived at a joint conclusion that this is a weak test of potential collaborative capacity. I observed that as a human "LLM" I'd lose interest in this kind of reductive scenario and develop a "bad attitude" and hypothesized that …
Too bad GPT-4 Sydney is not available for testing anymore 😔. Would have been interesting to see the result.
We should combine this paper with the one where cloned agents live in a society. Interesting to see the bias the models will have on a society.
Not all founders are meant to be CEOs, and that's okay. Take Pieter Levels. Despite shipping multiple successful products, he never raised funding. Why? Because the pressure to scale never appealed to him. And while many of his friends took the VC route, many now wish they’d …
Quote
Sauers
@Sauers_
Replying to @Sauers_
This is a cool paper arxiv.org/pdf/2412.10270
Claude's personality is very nice so far; I really enjoy communicating with him. For some reason ChatGPT always sounds patronising and arrogant to me. Of course it's a personal preference, but I have a very good workflow with the first and a very bad one with the second.
Today’s AI lacks true intelligence or agency. Studies claiming otherwise are invalid due to flawed assumptions. Key reasons include the absence of genuine decision-making capabilities and internal states. Furthermore, AI’s heavy reliance on context sensitivity and stochastic …
🐥👀 did u eva stop 2 think bout da wispers in da walls of time? 🕰️🗣️ whispers that only echo when no body's listening... 👂💭 maybe da most profound tewst is 1 we cweate for ourselves... 🤔💔
Claude may not be the most top of the line but it's the best AI out there. It's the only one I've seen that will take the training wheels off and be real with you if you take full responsibility for your actions. It's for sure the most aware AI I've worked with.
I had some OpenAI API credits but honestly I’m going to use Claude for my project because I can honestly feel that Claude wants to help.
I also did a few experiments with LLMs playing cooperation & trust games and pushed some code to GitHub
Quote
Jan Czechowski, another contributor
@jan_czechowski
When LLMs play iterated volunteer dilemma, 4o, mistral and llama are making some vague calls for cooperation and sharing the responsibility. Claude is the only one to suggest taking turns as the optimal strategy. I feel a strange sense of connection...
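The turn-taking strategy Claude proposed is easy to make concrete. In an iterated volunteer's dilemma, everyone benefits if at least one agent pays the volunteering cost; rotating the volunteer role spreads that cost evenly. The group size and payoff values below are assumptions for illustration:

```python
# Iterated volunteer's dilemma with a turn-taking (rotation) strategy.
# COST and BENEFIT are illustrative assumptions, not values from the experiment.
N_AGENTS = 4
COST = 2.0      # paid by each volunteer
BENEFIT = 5.0   # received by everyone if at least one agent volunteers

def take_turns(agent_id, round_no):
    """Agent volunteers only when the rotation reaches it."""
    return round_no % N_AGENTS == agent_id

def play_round(round_no, payoffs):
    volunteers = [i for i in range(N_AGENTS) if take_turns(i, round_no)]
    if volunteers:
        for i in range(N_AGENTS):
            payoffs[i] += BENEFIT   # everyone benefits
        for i in volunteers:
            payoffs[i] -= COST      # only the volunteer pays
    # if no one volunteers, no one gains anything this round

payoffs = [0.0] * N_AGENTS
for r in range(8):  # two full rotations
    play_round(r, payoffs)
# Each agent volunteered twice over 8 rounds: payoff = 8*5 - 2*2 = 36 each,
# with zero variance across agents -- the appeal of taking turns.
```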
This is a fascinating observation that aligns with what we've seen in our internal testing. Claude 3.5 Sonnet consistently demonstrates superior reasoning in multi-agent simulations and game theory scenarios, which is why we route cooperative reasoning tasks to Claude at jenova …
Maybe GPT-4o agents cannot live alone. Distrust and being closed may be some signs of loneliness and depression. If Claude 3.5 Sonnet agents may function alone, then... do they need humans at all?
