Phi-2 punches well above its weight class for a 2.8B-parameter model, showing surprisingly strong reasoning and coding ability relative to its size. It was trained with a heavy focus on "textbook-quality" data, which gives it a clean, structured way of explaining concepts. However, its small size means it can struggle with complex multi-step tasks, and it has limited world knowledge compared to larger models.
| Benchmark | Score (% accuracy) |
|---|---|
| IFEval | 27.4 |
| MATH | 2.9 |
| MuSR | 13.8 |
| MMLU-Pro | 18.1 |
| BBH | 28.0 |
| GPQA Diamond | 2.9 |