evals.report

MultiNRC

MultiNRC is a native (non-translated) multilingual reasoning benchmark of more than 1,000 questions written by native speakers of French, Spanish, and Chinese. The questions span four categories: language-specific linguistic reasoning, wordplay and riddles, cultural and tradition reasoning, and culturally relevant math. LLMs are scored on accuracy.

Category: Reasoning. Metric: accuracy (higher is better).
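
As a rough illustration of how an accuracy score of this kind could be computed, the sketch below aggregates graded answers overall and per category. The record fields (`category`, `correct`), the function name `accuracy_by_category`, and the sample values are all hypothetical and chosen only for illustration; they are not the benchmark's actual schema, grading procedure, or results.

```python
from collections import defaultdict


def accuracy_by_category(results):
    """Return (overall accuracy, per-category accuracy) for graded results.

    Each result is assumed to be a dict with a "category" string and a
    boolean "correct" flag (hypothetical field names).
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for r in results:
        totals[r["category"]] += 1
        correct[r["category"]] += int(r["correct"])

    # Fraction of correct answers within each category.
    per_category = {c: correct[c] / totals[c] for c in totals}
    # Fraction of correct answers across all questions (0.0 if empty).
    overall = sum(correct.values()) / sum(totals.values()) if totals else 0.0
    return overall, per_category


# Hypothetical graded results (illustrative values, not real benchmark data).
sample = [
    {"category": "wordplay", "correct": True},
    {"category": "wordplay", "correct": False},
    {"category": "math", "correct": True},
    {"category": "math", "correct": True},
]
overall, per_category = accuracy_by_category(sample)
```

Reporting the per-category breakdown alongside the overall figure is one possible design choice: it keeps a strong score in one category from hiding a weak one in another.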

No run guide for this benchmark yet.