The ability of o3 to agentically use tools in sequence in its chain-of-thought in the chat interface remains a huge differentiator among AIs. I am not sure o3 is "smarter" than Gemini 2.5 or Claude 4 (both build better websites than o3, for example), but you can see how tool use makes a difference because o3 is better at mixing searches, code use, and image generation, and at revising plans: "Come up with 20 clever ideas for marketing slogans for a new mail-order cheese shop. Develop criteria and select the best one. Then build a financial and marketing plan for the shop, revising as needed and analyzing competition. Then generate an appropriate logo using image generator and build a website for the shop as a mockup, making sure to carry 5-10 cheeses that fit the marketing plan."

I am sure that both Claude and Gemini will gain this ability at some point, since they are both good at tool use, and I am not sure that it matters much when not using the chat interface, but for chat users it does make a difference for now.
So Claude does do this. I haven’t managed to make it happen yet. My queries aren’t hard enough, I guess.
Quote
Alex Albert
@alexalbert__
Replying to @emollick and @peakcooper
With extended thinking on, it will automatically do it if the request is complex enough. It's not technically in the chain of thought, but it interleaves chain of thought between tool calls. x.com/alexalbert__/s
Isn’t the new feature of the Claude 4 family that they can now use interleaved thinking in tool call chains?
100%
Quote
khaled
@eltokh7
Replying to @TwannsWorld and @fchollet
2.5 is smarter but o3 is more .. savvy
Someone please open-source this. The agent frameworks that exist today are decent, but you would probably have to create your own entire agent framework to get something at o3's level.
It's interesting because Gemini had tools before GPT (as I recall). I know GenKit had tool access before OpenAI introduced it for developers but it didn't work with structured input/output, which I think was Gemini's early advantage. I can't help but suspect Google's AI
I’ve always had luck having ViTs process images of UIs made in applications like Figma and then replicate them in code. And now that Figma Make is a thing, the entire end-to-end pipeline can be covered by LLMs and ViTs with some minor HitL tweaking to make
My experience with o3 has been quite disappointing. It’s useful, but also exceptionally stupid.