For robotics and animation applications, reconstructing a cluttered scene requires not only recovering each individual 3D object but also ensuring that the objects interact in a physically plausible way. This work provides both a benchmark dataset and a method aimed at that goal.
This paper tackles 3D scene reconstruction from single images by introducing MessyKitchens, a dataset of cluttered real-world kitchen scenes with precise object shapes, poses, and contact information.