LiveCodeBench
LiveCodeBench evaluates code generation on competitive programming problems released after each model's training cutoff.
About this test
- What it measures: Algorithmic coding ability on fresh, unseen problems, which limits data contamination.
- How it was administered: Problems from Codeforces, LeetCode, and AtCoder posted after the model's training cutoff; scored as pass@1 on hidden test cases.
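To make the scoring concrete, here is a minimal sketch of how a pass@1 score over post-cutoff problems could be computed. It uses the commonly cited unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), which reduces to c/n for k = 1. The `Problem` record, its field names, and `benchmark_score` are hypothetical and chosen for illustration; the source does not say how the benchmark's own harness is implemented.

```python
from dataclasses import dataclass
from datetime import date
from math import comb


@dataclass
class Problem:
    # Hypothetical record: one benchmark problem and its sampling results.
    problem_id: str
    released: date        # date the problem was posted
    samples_total: int    # n: solutions generated for this problem
    samples_passed: int   # c: solutions that passed every hidden test


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: 1 - C(n - c, k) / C(n, k)."""
    if n - c < k:
        # Fewer than k failing samples, so any k-subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(problems: list[Problem], training_cutoff: date, k: int = 1) -> float:
    """Mean pass@k, as a percentage, over problems released after the cutoff."""
    fresh = [p for p in problems if p.released > training_cutoff]
    if not fresh:
        raise ValueError("no problems released after the training cutoff")
    total = sum(pass_at_k(p.samples_total, p.samples_passed, k) for p in fresh)
    return 100.0 * total / len(fresh)
```

Problems released on or before the cutoff are dropped before averaging, which is the step that keeps potentially memorized problems out of the score.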
Model rankings
Models ranked by score on this benchmark. Higher is better.
| Rank | Model | Provider | Score | Percentile | Tags |
|---|---|---|---|---|---|
| 1 | | Alibaba | 42.8 | p90 | Code Assistant, Open Weight, Medium |