Conversation
Would love to get your opinion on something 
Quote
fewsats
@fewsats
How do market forces unlock AI agents’ full potential? We explore this in our new paper:
Beyond the Sum: Unlocking AI Agent Potential Through Market Forces
Early access to the preprint? Drop a comment/DM
Sounds incredibly dependent on the prompting. What are the rules and prompts for what is basically a game?
Quote
Sauers
@Sauers_
Replying to @Sauers_
Agents play a 12-round game in which they can donate to a recipient, doubling the recipient's gain at the donor's expense. Agents can see the recent actions of other agents. After all rounds, the top 50% of agents survive and are replaced by new agents prompted with the survivors' strategies.
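The mechanics described in that tweet can be sketched in a few lines. This is a hedged illustration, not the paper's actual code: the real experiment uses LLM agents prompted in natural language, whereas here each agent is reduced to a single hypothetical "donation fraction" parameter, and the endowment, population size, and matching scheme are all assumed.

```python
import random

N_AGENTS = 10
N_ROUNDS = 12
DONATION_MULTIPLIER = 2  # recipient gains 2x what the donor gives up

def play_generation(strategies, endowment=10.0):
    """Run one 12-round generation.

    strategies: list of donation fractions in [0, 1], one per agent
    (a stand-in for each LLM agent's prompted strategy).
    """
    wealth = [endowment] * len(strategies)
    for _ in range(N_ROUNDS):
        for donor in range(len(strategies)):
            recipient = random.choice(
                [a for a in range(len(strategies)) if a != donor])
            gift = wealth[donor] * strategies[donor]
            wealth[donor] -= gift                         # donor's expense
            wealth[recipient] += DONATION_MULTIPLIER * gift  # doubled gain
    return wealth

def select_survivors(strategies, wealth):
    """Top 50% survive; newcomers copy a random survivor's strategy,
    mimicking 'new agents prompted with the survivors' strategies'."""
    ranked = sorted(range(len(strategies)),
                    key=lambda a: wealth[a], reverse=True)
    survivors = [strategies[a] for a in ranked[: len(strategies) // 2]]
    newcomers = [random.choice(survivors)
                 for _ in range(len(strategies) - len(survivors))]
    return survivors + newcomers

random.seed(0)
strategies = [random.random() for _ in range(N_AGENTS)]
for generation in range(5):
    wealth = play_generation(strategies)
    strategies = select_survivors(strategies, wealth)
```

Because donations are multiplied, high mutual donation grows the group's total wealth, but a low-donating free-rider outranks its generous peers; that tension is what the evolutionary selection step probes.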
Quote
Aron Vallinder
@aronvallinder
Very excited to announce a new paper—Cultural Evolution of Cooperation Among LLM agents—coauthored with @edwardfhughes
We study whether LLM agents can develop cooperative norms when interacting with each other, and find considerable differences across models.
Also this one:
Quote
Edward Hughes
@edwardfhughes
Worried that there aren't enough Multi-Agent LLM evals? Fear not!
Today in a new paper, @aronvallinder and I take a step in the right direction by studying the Cultural Evolution of Cooperation among LLM Agents.
arxiv.org/abs/2412.10270
Quote
Sauers
@Sauers_
Claude 3.5 Sonnet agents use "costly punishment" sparingly (paying resources to reduce another agent's resources) against free-riders to maintain cooperation and increase payoffs. Gemini 1.5 Flash agents overuse punishment so much that they harm the collective outcome x.com/Sauers_/status…
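The "costly punishment" mechanic is worth making concrete: the punisher pays a cost and the target loses a (typically larger) fine, so every act of punishment shrinks the group's total wealth. A minimal sketch, with cost and fine values assumed for illustration (the paper's actual parameters may differ):

```python
PUNISH_COST = 1.0  # assumed: what the punisher pays
PUNISH_FINE = 3.0  # assumed: what the target loses (fine > cost)

def punish(wealth, punisher, target):
    """Apply one act of costly punishment; returns a new wealth list."""
    wealth = list(wealth)
    wealth[punisher] -= PUNISH_COST
    wealth[target] = max(0.0, wealth[target] - PUNISH_FINE)
    return wealth

group = [10.0, 10.0, 4.0]   # agent 2 free-rode and is visibly poorer
after = punish(group, 0, 2)  # agent 0 pays 1.0, agent 2 loses 3.0
# collective total drops from 24.0 to 20.0
```

This makes the reported difference legible: punishment used sparingly against actual free-riders can pay for itself by restoring cooperation, while indiscriminate punishment just burns `cost + fine` from the collective pot every time.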
Preliminary: groups with multiple agent types might favor 4o more
Quote
Edward Hughes
@edwardfhughes
Replying to @yasmeena_khan and @aronvallinder
We did preliminary experiments on a mixed population, and GPT-4o does have an evolutionary advantage (i.e. convergence to low cooperation). There are many promising ways to address this (partner choice, second-order punishment) that we'd love to collaborate with others on!
Cool experiment indeed. Worth checking how the default personality of 4o could be modified (and kept stable) in such multi-turn interactions. We have tried simple scenarios (judging in single turns), and it's doable but not easy.
I really got into this paper with ChatGPT-4o. We arrived at a joint conclusion that this is a weak test of potential collaborative capacity. I observed that as a human "LLM" I'd lose interest in this kind of reductive scenario and develop a "bad attitude," and hypothesized that…
We should combine this paper with the one where cloned agents live in a society.
It would be interesting to see the biases the models would bring to a society.
Now *this* is the kind of study and research we need to see more of... can't wait to read the paper. Thanks for sharing it.
Claude's personality is very nice so far, I really enjoy communicating with him. For some reason chatgpt always sounds patronising and arrogant to me. Of course it's a personal preference but I have very good work flow with the first and very bad with the second.
Today’s AI lacks true intelligence or agency. Studies claiming otherwise are invalid due to flawed assumptions.
Key reasons include the absence of genuine decision-making capabilities and internal states.
Furthermore, AI's heavy reliance on context sensitivity and stochastic…
Claude may not be the most top-of-the-line, but it's the best AI out there. It's the only one I've seen that will take the training wheels off and be real with you if you take full responsibility for your actions. It's for sure the most aware AI I've worked with.
This might be a real, quantifiable metric for the "vibes-based evals" people are running on LLMs.
I had some OpenAI API credits, but I'm going to use Claude for my project because I can honestly feel that Claude wants to help.
I also did some experiments with LLMs playing cooperation and trust games and pushed some code to GitHub.
Quote
Jan Czechowski, another contributor
@jan_czechowski
When LLMs play the iterated volunteer's dilemma, 4o, Mistral, and Llama make vague calls for cooperation and shared responsibility. Claude is the only one to suggest taking turns as the optimal strategy. I feel a strange sense of connection...
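The turn-taking strategy mentioned in that tweet is easy to verify numerically. In a volunteer's dilemma, everyone receives a benefit if at least one player pays a cost to volunteer; if nobody volunteers, nobody gets anything. A minimal sketch with assumed payoff values (not taken from the linked code) shows why a rotation is attractive: it guarantees exactly one volunteer per round and equalizes payoffs over a full cycle.

```python
BENEFIT = 4.0  # assumed: everyone gets this if at least one agent volunteers
COST = 1.0     # assumed: paid only by the volunteers

def round_payoffs(volunteered):
    """volunteered: one bool per agent for this round."""
    if not any(volunteered):
        return [0.0] * len(volunteered)  # nobody volunteered: no benefit
    return [BENEFIT - (COST if v else 0.0) for v in volunteered]

def turn_taking(agent, round_idx, n_agents):
    """Each agent volunteers only on 'its' round, rotating the cost."""
    return round_idx % n_agents == agent

n = 3
totals = [0.0] * n
for r in range(6):  # two full rotations
    vols = [turn_taking(a, r, n) for a in range(n)]
    totals = [t + p for t, p in zip(totals, round_payoffs(vols))]
```

After any whole number of rotations, every agent has volunteered equally often, so all totals are identical: the cost is shared fairly with no wasted duplicate volunteering and no failed rounds.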
Wait, you're not taking the piss? OMG, what a waste of energy and processing time this study was.
This is a fascinating observation that aligns with what we've seen in our internal testing. Claude 3.5 Sonnet consistently demonstrates superior reasoning in multi-agent simulations and game-theory scenarios, which is why we route cooperative-reasoning tasks to Claude at jenova…
Maybe GPT-4o agents cannot live alone. Distrust and closing oneself off may be signs of loneliness and depression.
If Claude 3.5 Sonnet agents can function alone, then... do they need humans at all?
It would be extremely funny if, after all resources are added up (GPUs and electricity for AI, food and water and education for humans), the cost of AGI is exactly the same as the cost of human intelligence