This paper studies how well reasoning language models can estimate their own uncertainty by sampling multiple responses and analyzing confidence signals. For deploying reasoning models safely, combining verbalized confidence with self-consistency yields the best uncertainty estimates at minimal computational cost, though effectiveness varies significantly across domains such as mathematics versus the humanities.
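As a rough illustration of how these two signals can be combined (a minimal sketch, not the paper's actual method), the snippet below samples are assumed to be already collected as (answer, stated confidence) pairs; the function name `combined_confidence` and the simple averaging rule are hypothetical choices for exposition.

```python
from collections import Counter
from statistics import mean

def combined_confidence(samples):
    """Estimate an answer and its confidence from N sampled responses.

    Each sample is an (answer, verbalized_confidence) pair, where the
    verbalized confidence is the probability the model stated for its own
    answer (e.g. parsed from a line like "Confidence: 0.8").
    """
    answers = [ans for ans, _ in samples]
    majority_answer, majority_count = Counter(answers).most_common(1)[0]

    # Self-consistency signal: fraction of samples agreeing with the majority answer.
    agreement = majority_count / len(samples)

    # Verbalized signal: mean stated confidence among samples giving the majority answer.
    stated = mean(conf for ans, conf in samples if ans == majority_answer)

    # One simple way to fuse the two signals: average them.
    # (Any real system might weight or calibrate them differently.)
    return majority_answer, (agreement + stated) / 2


if __name__ == "__main__":
    # Five hypothetical sampled responses to the same question.
    samples = [("42", 0.9), ("42", 0.8), ("42", 0.85), ("41", 0.6), ("42", 0.7)]
    answer, confidence = combined_confidence(samples)
    print(answer, round(confidence, 3))  # -> 42 0.806
```

The design intuition is that agreement across samples captures how stable the model's reasoning is, while verbalized confidence captures what the model itself believes; combining them can compensate for cases where either signal alone is miscalibrated.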