A model trained to score the quality of each intermediate step in a solution, rather than only checking whether the final answer is correct.
Multi-step reasoning, logic puzzles, mathematical problem-solving
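
To make the contrast between step-level scoring and a final-answer check concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: `toy_step_scorer` is a hypothetical rule-based stand-in for a trained model (a real one would return a learned score, not 0 or 1), the steps are assumed to be simple `a op b = c` arithmetic strings, and taking the minimum step score as the solution score is only one possible aggregation choice.

```python
from typing import Callable, List


def toy_step_scorer(step: str) -> float:
    """Hypothetical stand-in for a trained step scorer.

    Returns 1.0 if an arithmetic step of the form "a op b = c" is correct
    and 0.0 otherwise. A trained model would output a learned score instead.
    """
    lhs, rhs = step.split("=")
    a, op, b = lhs.split()
    ops = {
        "+": lambda x, y: x + y,
        "-": lambda x, y: x - y,
        "*": lambda x, y: x * y,
    }
    return 1.0 if ops[op](int(a), int(b)) == int(rhs) else 0.0


def score_steps(steps: List[str], scorer: Callable[[str], float]) -> List[float]:
    """Score every intermediate step on its own."""
    return [scorer(step) for step in steps]


def solution_score(step_scores: List[float]) -> float:
    """Aggregate with min, so one bad step sinks the solution (assumed choice)."""
    return min(step_scores)


def outcome_only_check(steps: List[str], expected: int) -> bool:
    """Look only at the final result, ignoring how it was reached."""
    return int(steps[-1].split("=")[1]) == expected


# A solution that reaches the expected answer (20) through a wrong first step.
flawed = ["2 + 3 = 6", "6 - 1 = 5", "5 * 4 = 20"]

step_scores = score_steps(flawed, toy_step_scorer)
print(step_scores)                      # [0.0, 1.0, 1.0]
print(solution_score(step_scores))      # 0.0
print(outcome_only_check(flawed, 20))   # True
```

In this toy case the final-answer check accepts the solution, while the per-step scores point at the first step as the faulty one.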