"A stained glass picture of a woman in a library with a raven on her shoulder with a key in its mouth"
"An oil painting of a man in a factory looking at a cat wearing a top hat"
"A digital art picture of a child riding a llama with a bell on its tail through a desert"
4o Image Generation doesn't just pass this test, it obliterates it. We can quibble with some of the choices (is that really how a raven would hold a key? Why do the fox lips look weird in the 1st pic?) but you can't really doubt that Scott won this bet.
For the record, here's Imagen's results from 2022. Not only do these look much worse and have poor prompt adherence, the content filter was so sensitive you couldn't even generate humans. How far we've come!
Out of curiosity, I also ran these through Midjourney 6.1. They're not bad, and it's nice how varied the images are, but they clearly don't adhere to the prompts as well.
Bonus: "a red sphere on a blue cube, with a yellow pyramid on the right, all on top of a green table"
It nails it in one shot!
My main complaint is that the basketball isn't red. It's orange, like most basketballs. This is a recurring problem with AI art: if it 'knows' something that contradicts the prompt, it often violates the prompt.
I'm not actually convinced it passes the test with that stained glass image.
It doesn't really look or behave like actual stained glass at all.
What it DOES look like is emulating the lame Photoshop filter for stained glass...
Fair criticism. Maybe Midjourney wins here. You can really see the texture and depth of the stained glass there, even though it doesn't adhere to the prompts as well.
Then again, maybe we could fix this with better prompting.
I have not! I don't think it's possible anymore to see that (I know DALL-E used to reword prompts, not sure if that's the case here).
Additionally, when they announced this, the images had a "best of 2" or "best of 8" note, indicating it generated a few and picked the best. I
try 'a horse riding an astronaut', that was a famous failure when dalle was released too
I actually saw a few other people testing this (plus 'glass of wine filled to the brim' and 'clock face displaying a specific time').
It's able to do all of them more successfully than previous image models, but it still struggles sometimes. Took two attempts for this one.
Now ask it to generate a 13 of hearts.
(an example I was given by someone who's not fond of generative ai. Grok fails miserably.)
Not sure if I agreed that 2024 was AGI, but between Gemini 2.5, Grok 3 and 4.5, the instruction following feels like we have multiple instances of a great theory of mind and the ingredients of superintelligence.
It will have to be settled in hindsight when AGI happened
Got them on the first try also
Really interesting to look back on these. I remember seeing it around that time, and remember wondering whether there’s much chance of getting it in the timeframe. And experimenting with SD1.5 
IMO we arguably were beyond the threshold 6 months ago
Does it need the Plus subscription? The studio ghibli thing doesn't work for me, says "the subject is against content policy" when I was just trying it out with my pfp
In fairness, I'm moderately confident that retracted the claim of victory after some pushback.
I wonder if the teams are testing these specific prompts. You know they have a list of test prompts they are optimizing for.
I'm super impressed, but I am finding that it struggles to depict molecules correctly still. Tried multiple times to get it to depict a caffeine molecule and there was always something wrong with it. Interesting!
the most striking thing about this is how precise the timeline was – we're within 2 months of the 3 year mark
There is still some work to be done, I think. I had a group photo of my volleyball team with 8 people, but ChatGPT 4o refused to generate more than 6 people. I was trying to do the latest Studio Ghibli style thing but wanted a Haikyuu aesthetic.
"A computer scientist looking at a blackboard where a proof of whether or not P=NP is written. Realistic, high definition."
Try “wine glass filled to the brim”. None of the image models I know of so far can do it