GenEval
GenEval evaluates compositional text-to-image generation across attributes like color, shape, position, and counting.
About this test
- What it measures: compositional image generation accuracy, i.e. the ability to correctly render multiple objects with specified attributes.
- How it was administered: models generate images from compositional prompts, an automated evaluation checks attribute accuracy, and the result is reported as a score from 0 to 100.
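As a rough illustration of how per-prompt checks could roll up into a 0-100 score, here is a minimal sketch. It assumes the score is simply the percentage of prompts whose generated image passed the automated check; the actual benchmark may weight categories differently. The `PromptResult` structure and both function names are hypothetical and are not part of any GenEval tooling.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PromptResult:
    """Outcome of the automated check for one generated image (hypothetical structure)."""
    prompt: str
    category: str  # e.g. "color", "counting", "position"
    passed: bool   # True if every requested attribute was verified in the image


def benchmark_score(results: list[PromptResult]) -> float:
    """Assumed aggregation: percentage of prompts that passed, on a 0-100 scale."""
    if not results:
        raise ValueError("no results to score")
    return 100.0 * sum(r.passed for r in results) / len(results)


def per_category_scores(results: list[PromptResult]) -> dict[str, float]:
    """Break the same pass rate down by prompt category."""
    totals: dict[str, int] = defaultdict(int)
    passes: dict[str, int] = defaultdict(int)
    for r in results:
        totals[r.category] += 1
        passes[r.category] += int(r.passed)
    return {cat: 100.0 * passes[cat] / totals[cat] for cat in totals}


if __name__ == "__main__":
    # Made-up example results, for illustration only.
    demo = [
        PromptResult("a red apple", "color", True),
        PromptResult("a blue car", "color", True),
        PromptResult("three dogs", "counting", True),
        PromptResult("four cups", "counting", False),
    ]
    print(benchmark_score(demo))
    print(per_category_scores(demo))
```

The per-category breakdown is useful because a single headline number can hide weak spots, such as a model that renders colors reliably but miscounts objects.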
Model rankings
Models ranked by score on this benchmark. Higher is better.
| Rank | Model | Provider | Score | Percentile | Tags |
|---|---|---|---|---|---|
| 1 | | Midjourney | 86.5 | p96 | Image Generation, Proprietary |
| 2 | | Black Forest Labs | 85.0 | p94 | Image Generation, Proprietary |
| 3 |