GenEval

Name: GenEval Benchmark Results
Creator: BAUS.AI

GenEval evaluates compositional text-to-image generation across attributes like color, shape, position, and counting.

What it measures: Compositional image generation accuracy, meaning the ability to correctly render multiple objects with their specified attributes.
How it was administered: Models generate images from compositional prompts, an automated evaluator checks whether each specified attribute is rendered correctly, and the result is reported as a score from 0 to 100.
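
To make the scoring step concrete, here is a minimal sketch of how per-prompt checks could be rolled up into a 0-100 score. The page does not say how the checks are aggregated, so the rule used here is an assumption: a prompt counts as correct only if every one of its attribute checks passes, and the score is the percentage of correct prompts. The `PromptResult` structure, the `benchmark_score` function, and the sample prompts are hypothetical and are not taken from the benchmark itself.

```python
from dataclasses import dataclass


@dataclass
class PromptResult:
    """Outcome of the automated checks for one generated image (hypothetical structure)."""
    prompt: str
    checks: dict[str, bool]  # attribute name -> whether that check passed


def benchmark_score(results: list[PromptResult]) -> float:
    """Return the percentage of prompts whose checks all passed, on a 0-100 scale.

    Assumption: all-or-nothing scoring per prompt. The real aggregation may differ.
    """
    if not results:
        raise ValueError("no results to score")
    passed = sum(1 for r in results if all(r.checks.values()))
    return 100.0 * passed / len(results)


# Made-up example data, for illustration only.
results = [
    PromptResult("a red cube left of a blue sphere", {"color": True, "position": True}),
    PromptResult("three green apples", {"color": True, "counting": False}),
    PromptResult("a yellow bus", {"color": True}),
    PromptResult("two dogs above a bench", {"counting": True, "position": True}),
]
print(benchmark_score(results))  # 3 of 4 prompts pass every check
```

Under this all-or-nothing assumption, a single failed attribute (the miscounted apples above) makes the whole prompt count as incorrect, which is one common way to penalize partially correct compositions.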

Model rankings

Models ranked by score on this benchmark. Higher is better.

Rank	Model	Provider	Score	Percentile	Tags
1	Midjourney v6.1	Midjourney	86.5	p96	Image Generation, Proprietary
2	Flux 1.1 Pro	Black Forest Labs	85.0	p94	Image Generation, Proprietary
3