LMArena
A public chat-preference evaluation surface with source-defined preference ratings and model comparisons.
Chat preference · source-defined rating · Higher is better
| Model | Lab | Score | Source model | Status | Date |
|---|---|---|---|---|---|
| Claude Opus 4.6 thinking | Anthropic | 1499 | claude-opus-4-6-thinking | Official | May 27, 2026 |
| Claude Opus 4.6 | Anthropic | 1497 | claude-opus-4-6 | Official | May 27, 2026 |
| Claude Opus 4.7 thinking | Anthropic | 1486 | claude-opus-4-7-thinking | Official | May 27, 2026 |
| Gemini 3.5 Flash | Google DeepMind | 1482 | gemini-3.5-flash | Official | May 27, 2026 |
| Gemini 3.1 Pro Preview | Google DeepMind | 1481 | gemini-3.1-pro-preview | Official | May 27, 2026 |
| Claude Opus 4.7 | Anthropic | 1480 | claude-opus-4-7 | Official | May 27, 2026 |
| Gemini 3 Pro | Google DeepMind | 1479 | gemini-3-pro | Official | May 27, 2026 |
| Qwen3.7 Max Preview | Alibaba / Qwen | 1474 | qwen3.7-max-preview | Official | May 27, 2026 |
| Muse Spark | Meta | 1474 | muse-spark | Official | May 27, 2026 |
| GPT-5.4 | OpenAI | 1472 | gpt-5.4-high | Official | May 27, 2026 |
| Qwen3.5 Max Preview | Alibaba / Qwen | 1470 | qwen3.5-max-preview | Official | May 27, 2026 |
| ERNIE 5.1 | Baidu | 1469 | ernie-5.1 | Official | May 27, 2026 |
| GLM-5.1 | Z.ai | 1469 | glm-5.1 | Official | May 27, 2026 |
| GPT-5.5 high | OpenAI | 1468 | gpt-5.5-high | Official | May 27, 2026 |
| Gemini 3 Flash | Google DeepMind | 1466 | gemini-3-flash | Official | May 27, 2026 |
| GPT-5.5 | OpenAI | 1463 | gpt-5.5 | Official | May 27, 2026 |
| Gemini 2.5 Pro | Google DeepMind | 1457 | gemini-2.5-pro | Official | May 27, 2026 |
| Claude Sonnet 4.6 | Anthropic | 1454 | claude-sonnet-4-6 | Official | May 27, 2026 |
| Grok 4.20 beta reasoning | xAI | 1453 | grok-4.20-beta-0309-reasoning | Official | May 27, 2026 |
| DeepSeek V4 Pro | DeepSeek | 1446 | deepseek-v4-pro | Official | May 27, 2026 |
Each row reports the model’s source-defined rating on LMArena.
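This page does not say how the rating scale is defined, so it does not tell you how much a gap between two scores means. As a rough reading aid only, the sketch below assumes the ratings behave like Elo scores on the conventional 400-point logistic scale. That is an assumption, not something stated here, and `expected_preference` is a hypothetical helper name.

```python
def expected_preference(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of head-to-head votes won by model A over model B.

    ASSUMPTION: ratings follow an Elo-style logistic model with a 400-point
    scale. The leaderboard only describes its scores as "source-defined".
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Scores copied from the table: first row (1499) versus last row (1446).
top, last = 1499, 1446
print(f"gap = {top - last} points, implied preference = {expected_preference(top, last):.3f}")
```

Under that assumption, the 53-point spread between the first and last rows corresponds to the top model being preferred in roughly 58% of direct comparisons, so adjacent rows separated by a point or two are close to a coin flip.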