Question 1

What is Epoch Capabilities Index?

Accepted Answer

The Epoch Capabilities Index is a composite capability index from Epoch AI. It statistically stitches together scores from 40+ benchmarks, using an Item-Response-Theory-style model, into a single saturation-resistant general-capability scale, calibrated so that Claude 3.5 Sonnet scores 130 and GPT-5 scores 150. It is categorized as a reasoning benchmark, and its metric is Index.
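To make the calibration idea concrete, here is a minimal sketch of how two anchor models can pin down a scale. It assumes the mapping from the model's raw ability estimate to the published scale is a straight line, which this answer does not confirm; the function name `make_index_scale` and the raw values 0.5 and 1.5 are made up for illustration and are not Epoch AI's numbers.

```python
def make_index_scale(raw_a, raw_b, index_a=130.0, index_b=150.0):
    """Return a function mapping raw ability estimates onto the index scale.

    Assumption: the mapping is the straight line through the two anchor
    points (raw_a, index_a) and (raw_b, index_b).
    """
    if raw_a == raw_b:
        raise ValueError("anchor models need distinct raw ability estimates")
    slope = (index_b - index_a) / (raw_b - raw_a)

    def to_index(raw):
        return index_a + slope * (raw - raw_a)

    return to_index


# Hypothetical raw ability estimates for the two anchor models.
to_index = make_index_scale(raw_a=0.5, raw_b=1.5)
print(to_index(1.0))  # a model halfway between the anchors lands halfway between 130 and 150
```

Under this assumption, only the two anchors fix the scale; every other model's score follows from where its raw estimate falls relative to them.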

Question 2

What does Index mean on Epoch Capabilities Index?

Accepted Answer

Index is the metric reported for Epoch Capabilities Index; higher is better. Scores are shown only within Epoch Capabilities Index and are never averaged with other benchmarks.

Question 3

What is the top reported Epoch Capabilities Index score?

Accepted Answer

GPT-5.5 Pro has the top reported score on Epoch Capabilities Index: 159.3 (Index).

Question 4

Why do Epoch Capabilities Index scores differ across runs?

Accepted Answer

Results depend on the harness, scaffold, reasoning effort, and prompt setup, so two runs of the same model can produce different scores. evals.report keeps each score with its run context so the differences stay visible.
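One way to picture "a score kept with its run context" is a record that stores the settings next to the value. The sketch below is illustrative only: the `ScoreRecord` type, its field names, and every model name and number in it are hypothetical, not evals.report's actual schema or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreRecord:
    # Hypothetical schema: one row per run, never merged across runs.
    model: str
    benchmark: str
    value: float
    harness: str
    scaffold: str
    reasoning_effort: str
    prompt_setup: str


def runs_for(records, model, benchmark):
    """Return every recorded run for one model on one benchmark, unmerged."""
    return [r for r in records if r.model == model and r.benchmark == benchmark]


# Placeholder records: names, settings, and values are invented.
records = [
    ScoreRecord("model-a", "Epoch Capabilities Index", 101.0, "harness-x", "none", "low", "zero-shot"),
    ScoreRecord("model-a", "Epoch Capabilities Index", 104.0, "harness-x", "none", "high", "zero-shot"),
    ScoreRecord("model-b", "Epoch Capabilities Index", 99.0, "harness-y", "agentic", "high", "few-shot"),
]

for run in runs_for(records, "model-a", "Epoch Capabilities Index"):
    print(run.value, run.reasoning_effort)
```

Because the two "model-a" rows stay separate, a reader can see that the gap between them coincides with a change in reasoning effort rather than a change in the model.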

Question 5

Does evals.report rank models across benchmarks?

Accepted Answer

No. Epoch Capabilities Index scores are shown within their own metric; evals.report never combines benchmarks into a composite ranking or a single "best model".
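As a rough sketch of what "within their own metric" means in practice, the function below orders models on one benchmark at a time and simply ignores rows from any other benchmark. The function name `rank_within`, the tuple layout, and all models and values are hypothetical; this is not evals.report's code.

```python
def rank_within(scores, benchmark):
    """Rank models on a single benchmark, highest score first.

    `scores` is a list of (model, benchmark, value) tuples. Rows from other
    benchmarks are ignored rather than averaged in.
    """
    rows = [(model, value) for model, bench, value in scores if bench == benchmark]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Placeholder data: the two benchmarks use different, incomparable scales.
scores = [
    ("model-a", "Epoch Capabilities Index", 101.0),
    ("model-b", "Epoch Capabilities Index", 104.0),
    ("model-a", "other-benchmark", 0.72),
    ("model-b", "other-benchmark", 0.65),
]

ranking = rank_within(scores, "Epoch Capabilities Index")
print(ranking)
```

Here "model-b" leads on one benchmark and "model-a" on the other, and nothing in the sketch tries to reconcile the two into a single winner.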