PhysMoDPO: Physically-Plausible Humanoid Motion with Preference Optimization

Yangsong Zhang, Anujith Muraleedharan, Rikhat Akizhanov, Abdul Ahad Butt, Gül Varol et al. | March 13, 2026 | arXiv

Key Takeaway

By optimizing diffusion models with physics-aware rewards during training, you can generate robot motions that are both realistic and executable on real hardware without post-hoc corrections.

Summary

This paper uses preference optimization to make generated humanoid robot motions physically plausible. Instead of manually tuning physics penalty terms, the method integrates a physics-based controller directly into training, so the motion model learns to generate movements that still work well once they are converted to real robot commands.
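The idea of turning controller feedback into a training signal can be illustrated with a small sketch. It is not the paper's implementation: the summary does not state the exact objective, so the standard DPO loss on log-probabilities is assumed here (a diffusion model would typically substitute a denoising-error term). The names `build_preference_pair`, `tracking_errors`, and `dpo_loss` are hypothetical, and `tracking_errors` stands in for whatever score a physics-based controller would assign to each candidate motion in simulation.

```python
import math


def build_preference_pair(candidates, tracking_errors):
    """Return (preferred, rejected) candidates; lower tracking error is preferred.

    `tracking_errors` is a hypothetical physics score, e.g. how far a simulated
    controller deviates when it tries to follow each generated motion.
    """
    if len(candidates) != len(tracking_errors) or len(candidates) < 2:
        raise ValueError("need at least two candidates, one score per candidate")
    order = sorted(range(len(candidates)), key=lambda i: tracking_errors[i])
    return candidates[order[0]], candidates[order[-1]]


def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one (winner, loser) pair.

    logp_* are log-probabilities under the model being trained, ref_logp_*
    under a frozen reference model. beta is an assumed temperature.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log(sigmoid(margin)) computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # Three placeholder motions with made-up tracking errors.
    winner, loser = build_preference_pair(["a", "b", "c"], [0.3, 0.1, 0.9])
    print(winner, loser)
    # The loss drops below log(2) once the model favors the winner
    # more strongly than the reference model does.
    print(dpo_loss(-1.0, -2.0, -1.5, -1.5))
```

In this sketch the physics score is used only to rank candidates, so no hand-weighted penalty term appears in the loss itself, which mirrors the contrast the summary draws with manual penalty tuning.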


Key Terms

diffusion-models direct-preference-optimization whole-body-controller physics-informed-neural-networks zero-shot-generalization