evals.report

MMLU-Pro

A more robust and challenging successor to MMLU with over 12,000 reasoning-focused questions across 14 subjects, expanding answer choices from four to ten to better discriminate frontier large language models.

Category: Reasoning
Metric: accuracy (higher is better)
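To show why moving from four to ten answer choices matters for an accuracy metric, here is a minimal sketch that scores a toy answer sheet and compares random-guess baselines. The function names and the toy data are hypothetical, not part of any official MMLU-Pro harness. The baseline assumes uniform guessing over a fixed number of options per question.

```python
from fractions import Fraction


def accuracy(predictions, answers):
    """Return the fraction of questions whose predicted letter matches the key."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("need at least one question")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


def chance_accuracy(num_choices):
    """Expected accuracy of uniform random guessing over num_choices options."""
    return Fraction(1, num_choices)


# Hypothetical toy data (not real benchmark items): letters A-J cover ten options.
answers = ["A", "J", "C", "F"]
predictions = ["A", "B", "C", "F"]

print(accuracy(predictions, answers))   # 3 of 4 correct
print(float(chance_accuracy(4)))        # four-choice guessing baseline
print(float(chance_accuracy(10)))       # ten-choice guessing baseline
```

Under these assumptions the guessing floor drops from 25% with four options to 10% with ten, so a given accuracy score reflects less luck and leaves more room to separate strong models.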
