evals.report

LiveCodeBench Pro

A live competitive-programming benchmark that rates LLMs with a Codeforces-style Elo on fresh contest problems.

Coding | Codeforces Elo | Higher is better
| Model | Lab | Score | Source model | Status | Date |
| --- | --- | --- | --- | --- | --- |
| Gemini 3 Deep Think | Google DeepMind | 3298 | Gemini 3 Deep Think | Official | May 31, 2026 |
| Gemini 3.1 Pro Preview | Google DeepMind | 2887 | Gemini 3.1 Pro Preview | Official | May 31, 2026 |
| Gemini 3 Pro | Google DeepMind | 2439 | Gemini 3 Pro Preview | Official | May 31, 2026 |
| GPT-5.2 | OpenAI | 2393 | GPT-5.2-high (2025-12-11) | Official | May 31, 2026 |
| Gemini 3 Flash | Google DeepMind | 2316 | Gemini 3 Flash Preview | Official | May 31, 2026 |
| GPT-5.1 | OpenAI | 2269 | GPT-5.1-high (2025-11-13) | Official | May 31, 2026 |
| GPT-5 high | OpenAI | 2176 | GPT-5-high (2025-08-07) | Official | May 31, 2026 |
| o4-mini (high) | OpenAI | 2092 | o4-mini-high (2025-04-16) | Official | May 31, 2026 |
| Gemini 2.5 Pro | Google DeepMind | 1769 | Gemini 2.5 Pro | Official | May 31, 2026 |
| Qwen3 235B A22B Instruct 2507 | Alibaba / Qwen | 1673 | Qwen3-235B-A22B-Thinking-2507 | Official | May 31, 2026 |
| Claude Sonnet 4.5 | Anthropic | 1412 | Claude 4.5 Sonnet Thinking | Official | May 31, 2026 |
| GPT-OSS-120B | OpenAI | 1299 | GPT OSS 120B | Official | May 31, 2026 |
| DeepSeek R1 | DeepSeek | 1284 | DeepSeek R1 (2025-05-28) | Official | May 31, 2026 |
| DeepSeek V3 0324 | DeepSeek | 1124 | DeepSeek V3 (2025-03-24) | Official | May 31, 2026 |

Each row reports the model’s Codeforces Elo on LiveCodeBench Pro.
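
To help read gaps between scores, here is a minimal sketch of the standard Elo expected-score formula. It assumes the usual logistic curve with a 400-point scale; this page does not describe the benchmark's exact rating procedure, so treat the result as a rough interpretation aid only. The function name `elo_expected_score` is illustrative, not part of any benchmark tooling.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo model.

    Assumption: the conventional 400-point logistic scale. The benchmark's
    own rating computation may differ in its details.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Example using two scores from the leaderboard (GPT-5.2 vs GPT-5.1).
gap_example = elo_expected_score(2393, 2269)
print(f"Expected score for the higher-rated entry: {gap_example:.3f}")
```

Under this model, equal ratings give an expected score of 0.5, and a 400-point lead corresponds to roughly a 10-to-1 advantage.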