Design Arena
What this benchmark measures
Design Arena is a crowdsourced human-preference benchmark. Top AI models receive identical design and frontend prompts, and users vote head-to-head on the anonymized outputs. The votes produce a Bradley-Terry (Elo) ranking of design taste across categories such as websites, UI components, games, and data visualization.
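To make the ranking step concrete, here is a minimal sketch of how a Bradley-Terry fit can turn head-to-head votes into Elo-style ratings. It is not Design Arena's actual pipeline: the function names `fit_bradley_terry` and `to_elo_scale`, the anchor rating of 1000, the 400-point scale, and the sample votes are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def fit_bradley_terry(votes, iters=200):
    """Estimate Bradley-Terry strengths from (winner, loser) pairs.

    Uses the classic minorization-maximization update. Assumes every
    model has at least one win and that the comparison graph is connected.
    """
    models = sorted({m for pair in votes for m in pair})
    wins = defaultdict(int)    # total wins per model
    games = defaultdict(int)   # number of matchups per unordered pair
    for winner, loser in votes:
        wins[winner] += 1
        games[frozenset((winner, loser))] += 1

    strength = {m: 1.0 for m in models}
    for _ in range(iters):
        updated = {}
        for i in models:
            denom = 0.0
            for j in models:
                if i == j:
                    continue
                n = games.get(frozenset((i, j)), 0)
                if n:
                    denom += n / (strength[i] + strength[j])
            updated[i] = wins[i] / denom
        # Strengths are only defined up to a constant factor, so rescale
        # them to average 1 after each pass.
        scale = len(models) / sum(updated.values())
        strength = {m: s * scale for m, s in updated.items()}
    return strength


def to_elo_scale(strength, anchor=1000.0):
    """Map strengths to an Elo-like scale (anchor and 400 are assumptions)."""
    return {m: anchor + 400.0 * math.log10(s) for m, s in strength.items()}


# Hypothetical votes: each tuple is (preferred model, other model).
votes = (
    [("A", "B")] * 3 + [("B", "A")] * 1
    + [("B", "C")] * 2 + [("C", "B")] * 1
    + [("A", "C")] * 2 + [("C", "A")] * 1
)
ratings = to_elo_scale(fit_bradley_terry(votes))
for model, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{model}: {rating:.0f}")
```

Because the fit uses all votes at once, the result does not depend on the order in which votes arrived, which is one reason leaderboards of this kind often prefer a Bradley-Terry fit over sequential Elo updates.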
Rows on this page are sourced from public benchmark artifacts, leaderboard exports, or source-linked model reports. Each row keeps the benchmark version, the source model name, and any available run details attached to its score.
The metric shown here is Elo. Interpret it only within Design Arena; it is not comparable across benchmarks and should not be read as part of a site-wide ranking.
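One way to read an Elo score is through the gap between two models rather than the absolute number. The sketch below assumes the conventional 400-point logistic Elo scale, which this page does not confirm for Design Arena; the function name `expected_win_probability` is hypothetical.

```python
def expected_win_probability(rating_a, rating_b, scale=400.0):
    """Probability that model A is preferred over model B in one vote.

    Assumes the standard logistic Elo formula with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-(rating_a - rating_b) / scale))


# Only the difference matters: shifting both ratings by the same
# amount leaves the predicted preference rate unchanged.
p_low = expected_win_probability(1100, 1000)
p_high = expected_win_probability(1600, 1500)
print(round(p_low, 3), round(p_high, 3))
```

Since the scale is anchored only by the pool of models and prompts being compared, a given Elo value carries no meaning outside the benchmark that produced it.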