Offline Reinforcement Learning — Glossary — ThinkLLM