Compact and efficient, this model punches above its weight for a 3B parameter model — handling summarization, classification, and straightforward Q&A with reasonable coherence. It moves fast and runs on modest hardware, making it practical for edge deployments or resource-constrained environments. Complex reasoning and nuanced instruction-following are where it starts to show its size limitations.
| Benchmark | Score | Type | Recorded |
|---|---|---|---|
| MATH | 1.9 | accuracy | 26d ago |
| GPQA Diamond | 2.3 | accuracy | 26d ago |
| MuSR | 3.8 | accuracy | 26d ago |
| IFEval | 13.4 | accuracy | 26d ago |
| MMLU-Pro | 16.5 | accuracy | 26d ago |
| BBH | 14.2 | accuracy | 26d ago |