Phi 3 medium 128k instruct

Name: Phi 3 medium 128k instruct
Author: Microsoft

by Microsoft | Phi 3

Open Weight: Model weights are publicly available and can be downloaded and self-hosted.

Released May 2024 | 128K context (≈ 96,000 words) | 14.0B params

Phi 3 Medium punches above its weight class for a 14B model, reflecting Microsoft's research focus on training efficiency over raw scale. It handles long-context tasks with a 128k token window, making it comfortable with lengthy documents or extended conversations. The trade-off is that it can occasionally struggle with complex multi-step reasoning where larger models have a clear edge.
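
The 96,000-word figure quoted for the 128K context implies a ratio of about 0.75 words per token (96,000 / 128,000). The sketch below applies that ratio to estimate whether a document fits in the window. The ratio is an assumption inferred from this page's own numbers, not a property of the model's tokenizer, and the function names are hypothetical; real token counts vary with language and content.

```python
import math

# Assumption: ~0.75 words per token, the ratio implied by the page's own
# figures (96,000 words / 128,000 tokens). Actual tokenizers will differ.
WORDS_PER_TOKEN = 0.75

# Assumption: "128K" is read here as 128,000 tokens.
CONTEXT_TOKENS = 128_000


def tokens_to_words(tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Estimate how many words fit in a given number of tokens."""
    return round(tokens * words_per_token)


def fits_in_context(
    word_count: int,
    context_tokens: int = CONTEXT_TOKENS,
    words_per_token: float = WORDS_PER_TOKEN,
) -> bool:
    """Rough check: does a text of word_count words fit in the window?"""
    estimated_tokens = math.ceil(word_count / words_per_token)
    return estimated_tokens <= context_tokens


if __name__ == "__main__":
    print(tokens_to_words(CONTEXT_TOKENS))  # 96000
    print(fits_in_context(100_000))         # False
```

This is only a budgeting estimate. For an exact count, tokenize the text with the model's own tokenizer and leave room for the prompt template and the generated reply.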

Capabilities

Capability scores are AI-generated based on model documentation, benchmarks, and technical specifications.

Long Context: Exceptional

Factual Knowledge

Benchmark Scores

Benchmark	Score	Type	Recorded
MATH	19.2	accuracy	4mo ago
GPQA Diamond	11.5	accuracy	4mo ago
MuSR	11.4	accuracy	4mo ago
MMLU-Pro	41.2	accuracy	4mo ago
IFEval	60.4	accuracy	4mo ago
BBH	48.5	accuracy	4mo ago

Glossary

Long-Context: The ability of a model to process and understand very long sequences of text while maintaining coherence across distant parts of the input.

Multi-Step Reasoning: The ability to break down complex problems into smaller steps and solve them sequentially, rather than jumping directly to an answer.

Reasoning: The model's ability to work through multi-step logical problems and provide justified answers rather than just pattern-matching.

Token: A small unit of text (a word, subword, or punctuation mark) that a language model breaks input into for processing.

Training Efficiency: The ability to achieve strong model performance while using fewer computational resources, less data, or less time during the training process.
