Latent world models can dramatically speed up RL training for autonomous driving by replacing expensive multi-step diffusion with single-step latent sampling, making imagination-based policy training practical.
DreamerAD uses a latent world model to train autonomous driving policies 80x faster than previous diffusion-based approaches. Instead of generating full images during training, it operates on compact latent features and collapses the iterative diffusion process into a single sampling step, enabling safe, efficient reinforcement learning on driving tasks without real-world interaction.
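The core idea — imagining rollouts entirely in latent space with a one-step transition model, then training the policy on those imagined trajectories — can be sketched as follows. This is a minimal toy illustration, not DreamerAD's actual architecture: the linear dynamics, the function names (`single_step_next_latent`, `imagine_rollout`), and the dimensions are all assumptions made for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 8   # assumed size of the compressed latent state
ACTION_DIM = 2   # e.g. steering and acceleration (illustrative)
HORIZON = 5      # imagination rollout length

# Toy stand-in for the learned dynamics: a single matrix multiply plus
# nonlinearity replaces an iterative multi-step diffusion sampler.
W_dyn = rng.normal(scale=0.1, size=(LATENT_DIM + ACTION_DIM, LATENT_DIM))

def single_step_next_latent(z, a):
    """Predict the next latent state in ONE step (no iterative denoising)."""
    return np.tanh(np.concatenate([z, a]) @ W_dyn)

def imagine_rollout(z0, policy, horizon=HORIZON):
    """Roll out an imagined trajectory entirely in latent space.

    No images are decoded and no environment steps are taken, which is
    what makes imagination-based policy training cheap.
    """
    z, traj = z0, []
    for _ in range(horizon):
        a = policy(z)
        z = single_step_next_latent(z, a)
        traj.append((z.copy(), a.copy()))
    return traj

# A toy linear policy standing in for the learned actor network.
W_pi = rng.normal(scale=0.1, size=(LATENT_DIM, ACTION_DIM))
policy = lambda z: np.tanh(z @ W_pi)

z0 = rng.normal(size=LATENT_DIM)
traj = imagine_rollout(z0, policy)
print(len(traj), traj[0][0].shape, traj[0][1].shape)
```

In a real system the dynamics model would be a trained network and the policy gradient would flow back through the imagined latents; the point here is only that each imagined step costs one forward pass rather than a full multi-step diffusion sampling chain.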