With 4o Image Generation, it's time to revisit Scott Alexander's bet against Vitor. Although Scott declared victory back in Sep. 2022, some commenters (myself included) felt this was dubious. For each prompt, I'll generate exactly four pictures and post all of them.🧵
Image
David Watson 🥑

4o Image Generation doesn't just pass this test, it obliterates it. We can quibble with some of the choices (is that really how a raven would hold a key? Why do the fox lips look weird in the 1st pic?) but you can't really doubt that Scott won this bet.
For the record, here are Imagen's results from 2022. Not only do these look much worse and have poor prompt adherence, but the content filter was so sensitive you couldn't even generate humans. How far we've come!
Image
Out of curiosity, I also ran these through Midjourney 6.1. They're not bad, and it's nice how varied the images are, but they clearly don't adhere to the prompts as well.
Image
Image
Image
Image
Bonus: "a red sphere on a blue cube, with a yellow pyramid on the right, all on top of a green table" It nails it in one shot!
Image
Image
My main complaint is that the basketball isn't red. It's orange, like most basketballs. This is a recurring problem with AI art: if it 'knows' something that contradicts the prompt, it often violates the prompt.
I'm not actually convinced it passes the test with that stained glass image. It doesn't look or behave like actual stained glass at all. What it DOES look like is an emulation of the lame Photoshop filter for stained glass...
Fair criticism. Maybe Midjourney wins here. You can really see the texture and depth of the stained glass there, even though it doesn't adhere to the prompts as well. Then again, maybe we could fix this with better prompting.
Have you confirmed that these prompt texts are what the image generation tool is directly receiving? Remember: ChatGPT filters your prompts and rewords them. It's possible to bypass this filtering with a simple jailbreak.
I have not! I don't think it's possible anymore to see that (I know DALL-E used to reword prompts, not sure if that's the case here). Additionally, when they announced this, the images had a "best of 2" or "best of 8" note, indicating it generated a few and picked the best. I
I actually saw a few other people testing this (plus 'glass of wine filled to the brim' and 'clock face displaying a specific time'). It's able to do all of them more successfully than previous image models, but it still struggles sometimes. Took two attempts for this one.
Image
Image
Not sure if I agreed that 2024 was AGI, but between Gemini 2.5, Grok 3 and 4.5 following instructions feels like we have multiple instances of a great theory of mind and the ingredients of super intelligence. It will have to be settled in hindsight when AGI happened.
Got them on the first try also
Quote
Lech Mazur
@LechMazur
The new GPT‑4o image generation gets all five of these prompts correct on the first try in my test! The best prompt adherence yet - we'll need harder tests. x.com/LechMazur/stat…
Image
Image
Image
Image
Really interesting to look back on these. I remember seeing it around that time, and remember wondering whether there’s much chance of getting it in the timeframe. And experimenting with SD1.5 😅
Does it need the Plus subscription? The Studio Ghibli thing doesn't work for me; it says "the subject is against content policy" when I was just trying it out with my pfp.
I'm super impressed, but I am finding that it struggles to depict molecules correctly still. Tried multiple times to get it to depict a caffeine molecule and there was always something wrong with it. Interesting!
There is still some work to be done, I think. I had a group photo of my volleyball team with 8 people, but ChatGPT 4o refused to generate more than 6 people. I was trying to do the latest Studio Ghibli style thing but wanted a Haikyuu aesthetic.
Try “wine glass filled to the brim”. None of the image models I know of so far can do it.
