evals.report

Tasks and evaluation are public; frontier scores are ARC-Prize-verified.

Benchmark: ARC-AGI-2
Repository: Not provided
Dataset: arcprize.org
Metric: accuracy

1. Expected output

Use the official source links for current output format, submission steps, and benchmark-specific result files.
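To make the accuracy metric concrete, here is a minimal sketch. It assumes each prediction is an output grid stored as a nested list of integers and counts an item as correct only on an exact match. The `accuracy` function, the task IDs, and the grids are hypothetical illustrations. The official output format and any attempt rules come from the source links, not from this sketch.

```python
from typing import Dict, List

Grid = List[List[int]]  # assumption: a grid is a nested list of ints


def accuracy(predictions: Dict[str, Grid], expected: Dict[str, Grid]) -> float:
    """Return the fraction of expected items whose predicted grid matches exactly."""
    if not expected:
        raise ValueError("expected must contain at least one item")
    correct = sum(
        1 for task_id, grid in expected.items()
        if predictions.get(task_id) == grid  # missing predictions count as wrong
    )
    return correct / len(expected)


# Hypothetical toy data, not real benchmark tasks.
expected = {"task_a": [[1, 0], [0, 1]], "task_b": [[2, 2]]}
predictions = {"task_a": [[1, 0], [0, 1]], "task_b": [[2, 3]]}
print(accuracy(predictions, expected))  # 0.5
```

Dividing by the number of expected items, rather than the number of predictions, keeps a partial submission from inflating the score.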

2. Submit results

Keep source URL, source model name, benchmark version, harness, and run context attached to any reported score.
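One way to keep those fields attached is to store every score in a record that refuses to exist without them. The sketch below is a hypothetical structure, not a submission format defined by the benchmark. The class name, field names, and all example values are placeholders.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict


@dataclass(frozen=True)
class ReportedScore:
    """A score plus the provenance that should travel with it (hypothetical schema)."""
    benchmark: str
    metric: str
    score: float
    source_url: str
    source_model_name: str
    benchmark_version: str
    harness: str
    run_context: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject records with blank provenance instead of storing a bare number.
        for name in ("source_url", "source_model_name", "benchmark_version", "harness"):
            if not getattr(self, name).strip():
                raise ValueError(f"missing provenance field: {name}")
        if not self.run_context:
            raise ValueError("run_context must not be empty")

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


# Placeholder values only; none of these describe a real run.
record = ReportedScore(
    benchmark="ARC-AGI-2",
    metric="accuracy",
    score=0.0,
    source_url="https://example.com/placeholder",
    source_model_name="example-model",
    benchmark_version="placeholder-version",
    harness="placeholder-harness",
    run_context={"split": "public", "effort": "placeholder"},
)
print(record.to_json())
```

Validating in `__post_init__` means a score without its source cannot be constructed at all, which is stricter than checking at report time.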

Gotchas

Public and semi-private splits differ, so state which split a score comes from, and keep the reported effort/compute as run context.
Do not mix this benchmark's metric with unrelated benchmark metrics.
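Both gotchas can be enforced with a small check before two scores are placed side by side. This is a minimal sketch. The `comparable` helper and the dictionary keys `benchmark`, `metric`, and `split` are assumptions made for illustration, and the second benchmark and metric names are made up.

```python
from typing import Mapping

# Assumption: each score carries these three labels as plain strings.
REQUIRED_KEYS = ("benchmark", "metric", "split")


def comparable(a: Mapping[str, str], b: Mapping[str, str]) -> bool:
    """True only if both scores share benchmark, metric, and split."""
    return all(
        a.get(key) is not None and a.get(key) == b.get(key)
        for key in REQUIRED_KEYS
    )


public_run = {"benchmark": "ARC-AGI-2", "metric": "accuracy", "split": "public"}
semi_private_run = {"benchmark": "ARC-AGI-2", "metric": "accuracy", "split": "semi-private"}
other_metric = {"benchmark": "other-benchmark", "metric": "other-metric", "split": "public"}

print(comparable(public_run, public_run))        # True
print(comparable(public_run, semi_private_run))  # False: splits differ
print(comparable(public_run, other_metric))      # False: different benchmark and metric
```

A missing label is treated as not comparable, so a score with no recorded split never passes the check by accident.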