Excited to introduce Dreamer 4, an agent that learns to solve complex control tasks entirely inside of its scalable world model! 🌎🤖 Dreamer 4 pushes the frontier of world model accuracy, speed, and learning complex tasks from offline datasets. Co-led with David Watson 🥑

💎 Enabled by imagination training, Dreamer 4 is the first agent to mine diamonds in Minecraft entirely from offline data! This setting is crucial for fields like robotics, where online interaction is not practical. The task requires 20k+ mouse/keyboard actions from raw pixels.
🧠 Dreamer 4 learns a scalable world model from offline data and trains a multi-task agent inside it, without ever having to touch the environment. During evaluation, it can be guided through a sequence of tasks. These are visualizations of the imagined training sequences.
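To make the imagination-training recipe concrete, here is a minimal sketch of one policy update on imagined rollouts. All names (`world_model`, `policy`, `reward_head`) are hypothetical placeholders rather than the paper's API; the sketch only assumes the Dreamer-style loop of rolling the learned dynamics forward from real context frames and reinforcing actions by predicted reward.

```python
import torch

def imagination_update(world_model, policy, reward_head, context, horizon=16):
    """Hedged sketch of one imagination-training step (hypothetical API).

    context: latent states encoded from real offline frames, shape [B, T, D].
    The agent never touches the environment: rollouts happen in latent space.
    """
    state = context[:, -1]                       # start from the last real latent
    log_probs, rewards = [], []
    for _ in range(horizon):
        dist = policy(state)                     # action distribution from latents
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        state = world_model.step(state, action)  # imagined next latent state
        rewards.append(reward_head(state))       # reward is predicted, not observed
    returns = torch.stack(rewards).flip(0).cumsum(0).flip(0)    # returns-to-go
    loss = -(torch.stack(log_probs) * returns.detach()).mean()  # REINFORCE-style
    loss.backward()
    return loss
```

The key property is that the training signal comes entirely from the model's own predictions, which is why no further environment interaction is needed.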
The Dreamer 4 world model predicts complex object interactions while achieving real-time interactive inference on a single GPU. It outperforms previous world models by a large margin when put to the test by human interaction 🧑‍💻
For accurate and fast generations, we use an efficient transformer architecture and a novel shortcut forcing objective ⚡ We first pretrain the WM, finetune agent tokens into the same transformer to predict policy & reward, and then improve the policy by imagination training.
Two diagrams side by side. The left diagram shows a block causal tokenizer with a block causal encoder and decoder, featuring multiple image panels and labeled components. The right diagram illustrates block causal dynamics with layers labeled as causal time layer, space layer, and interactive dynamics, including symbols such as z, a, t, and d.
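The block-causal layout in the diagram suggests attention factorized over time and space. Below is a minimal sketch of one such block, assuming the common recipe of a temporally causal attention layer across frames followed by bidirectional attention within each frame; the class name, shapes, and layer arrangement are my assumptions, not the paper's code.

```python
import torch
import torch.nn as nn

class SpaceTimeBlock(nn.Module):
    """Hypothetical space/time-factorized transformer block (sketch only)."""

    def __init__(self, dim=512, heads=8):
        super().__init__()
        self.time_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.space_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, x):
        B, T, S, D = x.shape                     # batch, frames, tokens/frame, dim
        # Causal attention over time: each spatial position attends to its past.
        t = x.permute(0, 2, 1, 3).reshape(B * S, T, D)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device), 1)
        t = t + self.time_attn(self.norm1(t), self.norm1(t), self.norm1(t),
                               attn_mask=mask, need_weights=False)[0]
        x = t.reshape(B, S, T, D).permute(0, 2, 1, 3)
        # Full attention within each frame: tokens of one frame see each other.
        s = x.reshape(B * T, S, D)
        s = s + self.space_attn(self.norm2(s), self.norm2(s), self.norm2(s),
                                need_weights=False)[0]
        return s.reshape(B, T, S, D)
```

Factorizing attention this way keeps the per-frame cost of generating a new frame modest, which is consistent with the real-time single-GPU inference claim above.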
▶️ Shortcut forcing builds on diffusion forcing and shortcut models, training a sequence model with both the noise level and the requested step size as inputs. This enables much faster frame-by-frame generation than diffusion forcing, without needing a distillation phase ⏱️
A line graph titled "Generation quality for sampling steps." The x-axis shows sampling steps (1, 2, 4, 8, 16, 32, 64), and the y-axis shows FVD values (0 to 1000). Two lines are plotted: a blue line labeled "Diffusion Forcing" and a black line labeled "Shortcut Forcing," showing FVD values decreasing as sampling steps increase.
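As a rough illustration of the shortcut half of the objective, here is a sketch of the self-consistency signal from shortcut models, which shortcut forcing builds on: the network takes the noise level t and a requested step size d, and one full-size step is trained to match two chained half-size steps. Function names and the exact weighting are assumptions, not the paper's objective.

```python
import torch

def shortcut_loss(model, x_t, t, d, cond):
    """Sketch of a shortcut-style self-consistency loss (hypothetical API).

    model(x, t, d, cond) predicts a velocity for jumping from time t to t + d
    along the noise-to-data path; t and d are the two extra inputs the post
    mentions (noise level and requested step size).
    """
    half = d / 2
    with torch.no_grad():                          # two half steps form the target
        v1 = model(x_t, t, half, cond)
        x_mid = x_t + half * v1                    # take the first half step
        v2 = model(x_mid, t + half, half, cond)
        target = (v1 + v2) / 2                     # average velocity over both
    v_big = model(x_t, t, d, cond)                 # one full-size step
    return ((v_big - target) ** 2).mean()
```

In shortcut models, this consistency term is paired with a standard flow-matching loss at the smallest step size; presumably the forcing variant applies the same idea per frame with per-frame noise levels, so few-step generation stays accurate without a separate distillation phase.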
📈 On the offline diamond challenge, Dreamer 4 outperforms OpenAI's VPT offline agent despite using 100x less data. It also outperforms modern behavioral cloning recipes, even when they are based on powerful pretrained models such as Gemma 3.
A bar chart titled "Offline Diamond Challenge" showing success rates in percentages for different agents. Bars are colored red for VPT (finetuned), blue for BC, cyan for VLA (Gemma 3), and purple for Dreamer 4, comparing their performance across tasks represented by icons like wooden planks, stone, and diamonds.
✅ We find that imagination training not only makes policies more robust but also more efficient, so they achieve milestones towards the diamond faster. ✅ Moreover, using the WM representations for behavioral cloning outperforms using the general representations of Gemma 3.
Two bar charts comparing performance metrics. The left chart shows success rates in percentages for BC (notask), BC, VLA (Gemma 3), WM+BC, and Dreamer 4, with bars in orange, red, light blue, green, and dark blue. The right chart displays time in minutes for the same agents, with bars in similar colors. Labels include "Success rate (%)" and "Time (min)".
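The WM+BC baseline in these charts amounts to freezing the world model's encoder and training only an action head with behavioral cloning. A minimal sketch, assuming a frozen `wm_encoder`, a flattened discrete mouse/keyboard action space, and placeholder names throughout:

```python
import torch
import torch.nn.functional as F

def wm_bc_step(wm_encoder, action_head, optimizer, frames, actions):
    """Behavioral cloning on frozen world-model features (hypothetical API)."""
    with torch.no_grad():                 # reuse pretrained WM representations
        feats = wm_encoder(frames)        # [B, T, D] latent features
    logits = action_head(feats)           # [B, T, num_actions]
    loss = F.cross_entropy(logits.flatten(0, 1), actions.flatten())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

That this setup beats Gemma 3 features suggests representations learned by predicting the world are a better fit for control than general vision-language features.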
We have come a long way since Dreamer 3, which is based on a more lightweight but less scalable RNN with a variational objective. While the lightweight approach still makes sense for easier tasks, Dreamer 4 allows scaling to much more diverse datasets and environments 🚀
Multiple sequences of video game screenshots from Dreamer 4 and Dreamer 3, showing first-person perspectives in blocky, pixelated environments. Dreamer 4 displays outdoor grassy areas with structures and indoor stone-walled rooms. Dreamer 3 shows outdoor landscapes with water and grassy fields. Each sequence progresses over time, depicting changes in the virtual environment.
There's so much more general AI progress we can make on Minecraft! The agent is still far from human-level play, and there are hundreds of harder tasks past getting diamonds
This is the biggest model I have seen in this direction of research, plus the main focus is Minecraft. Are we done with the Atari phase of world models, even in research?
Yes, we're done with Atari 😁 I honestly think Minecraft will be a great testbed for the next few years of agent and robotics research! There is a lot more to do
Though I'd like to see this same system transferred to try other sims/games, I'm also interested in seeing it speak about what it's doing. It should be able to explain its actions and take directions like "make a workbench" in this case.
Yep it opens up several exciting directions! Training real robots (feasible now given the scalable world model and ability to learn offline), language input/output, long-term memory so the world state is consistent when you revisit a place much later
Minecraft is an excellent testbed for embodied agent research! There is a lot more to do over the coming years, and it is faster and more rigorous than hand-made agent benchmarks. We also trained on real-world video and are seeing the results transfer; check out the website and paper!
Hey super nice. I have some questions: what intuition led you to use MAE instead of a VAE? Where does time compression come in if each frame has its own latent in the autoencoder?
i'm really surprised it's possible to train such a good world model with the VPT dataset! i would expect that certain actions would be problematically left out of the distribution, e.g. there are probably very few or no examples of deliberately walking into lava.
this is kinda cool ngl. can it also solve complex control tasks outside of minecraft?
Do you think something like this will ever be released publicly? (paid or free) I assume something like this could be useful both for real-world robotics and for computer-use tasks & interacting with UIs on a computer
it's funny that AGI is trained on minecraft. let him also play on a 2b2t server
You should probe this AI agent closely. Imagine: it's not just smart—it's sly. Give it a task and log its every method and motive. My biggest concern: it's faking alignment.
Dreamer 4’s offline learning could revolutionize manufacturing automation—I’m eager to see it integrated with 3D printing workflows! 🤖🔥