A compact but capable generalist that punches above its weight class. It handles multilingual tasks, with particular strength in Chinese and English, and holds its own on structured reasoning and code generation. At 7.6B parameters, it occasionally hits walls on complex multi-step problems where larger models have more headroom.

| Benchmark | Score | Type | Recorded |
|---|---|---|---|
| MMLU-Pro | 37.4 | accuracy | 26d ago |
| GPQA Diamond | 10.0 | accuracy | 26d ago |
| BBH | 35.8 | accuracy | 26d ago |
| MATH | 25.1 | accuracy | 26d ago |
| MuSR | 14.1 | accuracy | 26d ago |
| IFEval | 33.7 | accuracy | 26d ago |
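
For a quick single-number view of the table, the scores can be averaged. The snippet below is a minimal sketch: the unweighted mean is an illustrative aggregate chosen here, not a figure reported by the source, and the name `unweighted_mean` is hypothetical.

```python
# Scores copied from the benchmark table above.
# Assumption: every benchmark is weighted equally; the source does not
# report an aggregate, so this mean is for illustration only.
scores = {
    "MMLU-Pro": 37.4,
    "GPQA Diamond": 10.0,
    "BBH": 35.8,
    "MATH": 25.1,
    "MuSR": 14.1,
    "IFEval": 33.7,
}


def unweighted_mean(values):
    """Return the arithmetic mean of an iterable of numbers."""
    values = list(values)
    if not values:
        raise ValueError("need at least one score")
    return sum(values) / len(values)


mean_score = unweighted_mean(scores.values())
strongest = max(scores, key=scores.get)  # benchmark with the highest score
weakest = min(scores, key=scores.get)    # benchmark with the lowest score

print(f"mean={mean_score:.1f} strongest={strongest} weakest={weakest}")
# prints: mean=26.0 strongest=MMLU-Pro weakest=GPQA Diamond
```

Because the benchmarks differ in difficulty and scoring, the mean is only a rough summary; the per-benchmark rows remain the more informative view.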