Llama 3.3 70B
MetaLLMs
86.5
Performance
★ 4.4
Rating
310
Reviews
Text GenerationReasoningOpen WeightMedium
About
Meta's efficient 70B model that matches Llama 3.1 405B performance on many tasks at a fraction of the compute cost.
Strengths
Matches Llama 3.1 405B quality on many benchmarks despite being ~6x smaller. Excellent for self-hosting on consumer-grade GPUs. Strong instruction following and coding. Llama license for broad commercial use.
Specifications
- Context window
- 128,000
- Parameters
- 70B
Speed & Latency
- 80
- tokens/sec
- 300ms
- time to first token
Available On
HuggingFaceTogether AIFireworks AIGroqOllama
Features
function callingstreamingsystem messages
Performance Trend
Benchmark score trends over time for the top 5 benchmarks.
Loading history...
Benchmarks
Scores from various benchmark tests; higher is better.
| Test | Score | Percentile | Source |
|---|---|---|---|
| BigBench Hard | 81.0 | p92 | seed |
| DROP | 81.5 | p91 | seed |
| GSM8K | 90.5 | p95 | seed |
| HumanEval | 84.5 | p94 |