Reward Model
Category: techniques
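A learned function that assigns a score to an action or outcome, predicting how good it is. The score serves as the optimization signal for policy improvement, for example in reinforcement learning from human feedback (RLHF), where the reward model is typically trained on human preference comparisons.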
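To make the definition concrete, here is a minimal sketch of a reward model fit to pairwise preferences with a Bradley-Terry style loss, which is one common way such models are trained. Everything in it is an illustrative assumption and not something this entry specifies: the linear scoring function stands in for a neural network, the two-feature toy data is synthetic, and the names `reward` and `train_reward_model` are hypothetical.

```python
import math
import random


def reward(weights, features):
    """Linear reward model: score = w . x (a stand-in for a neural network)."""
    return sum(w * x for w, x in zip(weights, features))


def train_reward_model(pairs, dim, lr=0.1, epochs=200):
    """Fit weights on (preferred, rejected) feature pairs.

    Minimizes -log sigmoid(r(preferred) - r(rejected)) with plain SGD.
    """
    weights = [0.0] * dim
    for _ in range(epochs):
        for preferred, rejected in pairs:
            margin = reward(weights, preferred) - reward(weights, rejected)
            # Gradient of the loss w.r.t. w is
            # -(1 - sigmoid(margin)) * (preferred - rejected).
            grad_scale = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            for i in range(dim):
                weights[i] += lr * grad_scale * (preferred[i] - rejected[i])
    return weights


# Toy data (hypothetical): the labeler always prefers the item whose
# feature 0 is larger; feature 1 is irrelevant noise.
random.seed(0)
pairs = []
for _ in range(50):
    a = [random.random(), random.random()]
    b = [random.random(), random.random()]
    pairs.append((a, b) if a[0] > b[0] else (b, a))

weights = train_reward_model(pairs, dim=2)

# Use the learned reward to rank two candidate outcomes.
candidates = [[0.9, 0.1], [0.2, 0.8]]
best = max(candidates, key=lambda c: reward(weights, c))
```

In this sketch the learned weight on feature 0 grows with every update, so the model should come to favor candidates with a larger first feature. A policy would then be improved by steering it toward outputs that this reward function scores highly.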