Long-Horizon Evaluation — Glossary — ThinkLLM