Multimodal models suffer from severe confidence miscalibration; training them to be honest about uncertainty and using that uncertainty to trigger verification steps significantly improves both accuracy and reliability.
This paper identifies a core weakness of multimodal AI models: they are overconfident and do not reliably know when they are wrong. The authors propose a training method that uses image-noise pairs and confidence-based rewards to correct this, along with a test-time strategy that uses the model's confidence to decide when to double-check an answer. Results show an 8.8% accuracy improvement across benchmarks.
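To make the mechanism concrete, here is a minimal Python sketch of the two ideas: a confidence-based reward that pays the model for calibrated honesty, and a test-time gate that triggers a verification pass only when confidence falls below a threshold. The `model.predict` interface, the threshold value, and the exact reward shaping are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch; `model.predict`, the threshold, and the reward
# shaping are illustrative assumptions, not the paper's actual method.

CONFIDENCE_THRESHOLD = 0.7  # assumed cutoff for triggering verification


def confidence_reward(correct: bool, confidence: float) -> float:
    """One plausible confidence-based reward for calibration training.

    Confident correct answers earn the most, confident wrong answers
    are penalized the most, and low-confidence answers sit in between,
    so the model is paid for honesty about its own uncertainty.
    """
    return confidence if correct else -confidence


def answer_with_verification(model, image, question):
    """Test-time strategy: let the model's own confidence decide whether
    to spend extra compute on a second, verification-style pass."""
    # Assumed API: returns an (answer, confidence) pair.
    answer, confidence = model.predict(image, question)
    if confidence >= CONFIDENCE_THRESHOLD:
        return answer  # confident enough: commit to the first answer
    # Low confidence: double-check by asking the model to verify itself.
    verified_answer, _ = model.predict(
        image, f"Verify or correct this answer to '{question}': {answer}"
    )
    return verified_answer
```

The design point this sketch illustrates is that verification cost is paid only on inputs the model itself flags as uncertain, which is what connects the calibration training to the accuracy and reliability gains.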