MathVista
A benchmark of 6,141 examples (evaluated on the 1,000-example testmini split) that measures mathematical reasoning in visual contexts, spanning figure QA, geometry, math word problems, textbook QA, and visual QA, reported as answer accuracy.
Category: Multimodal. Metric: accuracy (higher is better).
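As a rough illustration of what "answer accuracy" means here, the sketch below scores predictions by exact match after light normalization. The function names `normalize` and `accuracy` are hypothetical, and the normalization (trim whitespace, lowercase) is an assumption made for this example; it does not reproduce the benchmark's own answer extraction or scoring pipeline.

```python
from typing import Sequence


def normalize(answer: str) -> str:
    """Trim whitespace and lowercase (an assumed, simplified normalization)."""
    return answer.strip().lower()


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Return the fraction of predictions that match their reference answer."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count exact matches after normalization.
    correct = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return correct / len(references)


if __name__ == "__main__":
    # Toy data for illustration only; not drawn from the benchmark.
    preds = ["A", " 3.5 ", "b"]
    refs = ["a", "3.5", "c"]
    print(f"accuracy = {accuracy(preds, refs):.3f}")
```

On the 1,000-example testmini split, the same ratio would be the number of correct answers divided by 1,000.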
No run guide for this benchmark yet.