evals.report

Grok 4

xAI · Grok. Released Jul 9, 2025.

Benchmark results (5)

| Benchmark | Category | Score | Metric | Status | Date |
| --- | --- | --- | --- | --- | --- |
| FrontierMath | Reasoning | 19.66% | accuracy | Official | May 30, 2026 |
| Berkeley Function Calling Leaderboard | Tool use | 62.97% | accuracy | Official | Apr 12, 2026 |
| GPQA Diamond | Reasoning | 87.0% | accuracy | Official | May 30, 2026 |
| Humanity's Last Exam | Reasoning | 24.52% | accuracy | Official | May 31, 2026 |
| SimpleQA Verified | Other | 47.9% | accuracy | Official | May 30, 2026 |