Humanity's Last Exam
High-visibility frontier benchmark with difficult expert questions.
Next, Manual curated, Watchlist, Partial run guide, Page-backed data
Source detail
Score source
Public pages and releases exist, but exact score provenance often lives in benchmark or lab pages.
Run guide
Dataset and eval access is sufficiently public to document, but official run details vary.
How it can be used
Use only after each score row has source verification and retrieved-at metadata.
Caveat
Avoid stale scraped tables without retrieved-at metadata.
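
The usage rule and caveat above can be sketched as a small filter over score rows. This is a minimal illustration, not official tooling: the `ScoreRow` record, its field names, and the 90-day staleness window are all assumptions introduced here and should be adapted to however the scores are actually stored.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class ScoreRow:
    # Hypothetical schema: these field names are illustrative assumptions.
    model: str
    score: float
    source_url: Optional[str] = None
    source_verified: bool = False
    retrieved_at: Optional[datetime] = None  # expected to be timezone-aware (UTC)


def is_usable(
    row: ScoreRow,
    now: datetime,
    max_age: timedelta = timedelta(days=90),  # assumed staleness window
) -> bool:
    """Return True only if the row is source-verified, timestamped, and not stale."""
    if not row.source_url or not row.source_verified:
        return False  # no verified source for this score
    if row.retrieved_at is None:
        return False  # scraped without retrieved-at metadata
    return now - row.retrieved_at <= max_age


def usable_rows(rows: list[ScoreRow], now: datetime) -> list[ScoreRow]:
    """Keep only the rows that pass the verification and freshness checks."""
    return [row for row in rows if is_usable(row, now)]
```

A caller would pass the current time explicitly, for example `usable_rows(rows, datetime.now(timezone.utc))`, so the freshness check stays easy to test.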