Question 1

What is Remote Labor Index?

Accepted Answer

The Remote Labor Index (RLI), from CAIS and Scale Labs, measures how often AI agents can complete real, economically valuable freelance projects (3D & CAD, architecture, graphic design, video, audio, data analysis, web apps, and more) at a quality a paying client would accept. Each of the 240 projects has a real client brief, input files, and a gold-standard deliverable from a paid professional; every AI deliverable is judged by human evaluators. The headline automation rate is the share of projects where the AI's work is judged at least as good as the human's. RLI is an agent benchmark, and automation rate is its scoring metric.

Question 2

What does automation rate mean on Remote Labor Index?

Accepted Answer

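Automation rate is the percentage of Remote Labor Index projects where human evaluators judge the AI's deliverable to be at least as good as the human professional's; higher is better. Scores are shown only within Remote Labor Index and are never averaged with other benchmarks.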
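As a rough illustration of the arithmetic, the sketch below computes an automation rate from per-project human judgments. The `Judgment` record, its field names, and the four sample projects are hypothetical and exist only for this example; they are not the benchmark's actual data format or evaluation code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgment:
    """One human evaluation of one AI deliverable (hypothetical record)."""
    project_id: str
    ai_at_least_as_good: bool  # True if the AI's work matched or beat the human's


def automation_rate(judgments):
    """Return the percentage of projects where the AI deliverable was accepted."""
    if not judgments:
        raise ValueError("at least one judgment is required")
    accepted = sum(1 for j in judgments if j.ai_at_least_as_good)
    return 100.0 * accepted / len(judgments)


# Invented sample: 1 accepted deliverable out of 4 projects.
example = [
    Judgment("p1", True),
    Judgment("p2", False),
    Judgment("p3", False),
    Judgment("p4", False),
]
print(automation_rate(example))  # 25.0
```

Because the rate is a simple share of projects, each project counts equally regardless of its size or price.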

Question 3

What is the top reported Remote Labor Index score?

Accepted Answer

Claude Fable 5 has the top reported score on Remote Labor Index: 16.1% (automation rate).

Question 4

Why do Remote Labor Index scores differ across runs?

Accepted Answer

Harness, scaffold, reasoning effort, and prompt setup change results, so two runs of the same model can differ. evals.report keeps each score with its run context so the differences stay visible.
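One way to picture "a score with its run context" is a record that carries the setup alongside the number, so two runs of the same model stay separate rather than being merged. This is only a sketch: the `RunScore` type, its fields, the model name, and the scores are invented for illustration and do not describe how evals.report actually stores data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunScore:
    """A single reported score plus the setup that produced it (hypothetical)."""
    model: str
    automation_rate: float  # percent
    harness: str
    reasoning_effort: str


def runs_for_model(scores, model):
    """List every run of a model, one row per run context, without averaging."""
    matching = [s for s in scores if s.model == model]
    return sorted(matching, key=lambda s: (s.harness, s.reasoning_effort))


# Invented placeholder values, not real results.
scores = [
    RunScore("model-x", 12.5, "harness-b", "high"),
    RunScore("model-x", 10.0, "harness-a", "low"),
    RunScore("model-y", 8.0, "harness-a", "low"),
]

for run in runs_for_model(scores, "model-x"):
    print(run.harness, run.reasoning_effort, run.automation_rate)
```

Keeping the rows separate means a reader can see which harness or effort setting produced which number.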

Question 5

Does evals.report rank models across benchmarks?

Accepted Answer

No. Remote Labor Index scores are shown within their own metric; evals.report never combines benchmarks into a composite ranking or a single "best model".
