FrontierMath Tier 4
FrontierMath Tier 4 is Epoch AI's expansion set of 50 exceptionally difficult, original research-level mathematics problems, crafted and vetted by expert mathematicians. A single problem can take a specialist days to solve. The benchmark measures an AI model's advanced mathematical reasoning by exact-answer accuracy: the fraction of problems for which the model's final answer matches the reference answer.
Category: Reasoning | Metric: accuracy | Higher is better
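The accuracy metric named above can be illustrated with a minimal sketch. It is not Epoch AI's grading code: the function names and the whitespace-only normalization are assumptions made for illustration, and a real grader may compare answers in other ways. With 50 problems, each correct answer moves the score by 2 percentage points.

```python
def normalize(answer: str) -> str:
    """Hypothetical normalization: remove all whitespace from the answer."""
    return "".join(answer.split())


def exact_answer_accuracy(predictions: list[str], references: list[str]) -> float:
    """Return the fraction of predictions that exactly match their reference.

    Assumes one final answer per problem, given as a string.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one problem is required")
    correct = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return correct / len(references)


if __name__ == "__main__":
    # Toy data, not real benchmark answers: 1 of 50 answers matches.
    refs = ["1"] + ["2"] * 49
    preds = ["1"] * 50
    print(f"accuracy = {exact_answer_accuracy(preds, refs):.2%}")
```

Because the comparison is exact, an answer that is mathematically equivalent but written differently would count as wrong in this sketch.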
No run guide for this benchmark yet.