Moonshot AI
Models (3)
Kimi K2 Instruct
Kimi · kimi-k2
2025-07-11
2 results
Kimi K2.5
Kimi · kimi-k2.5
2026-01-01
4 results
Kimi K2.6
Kimi · kimi-k2.6
2026-04-01
7 results
Progress by benchmark
Kimi K2 Instruct
Jul 11, 2025
—
Kimi K2.5
Jan 1, 2026
73.8%
Kimi K2.6
Apr 1, 2026
76.7%
Single benchmark only
This view shows SWE-bench Verified (% resolved) only. Other benchmarks use different metrics and are not directly comparable.
Progress matrix
| Model | SWE-bench Verified (% resolved) | GPQA Diamond (accuracy) | LiveCodeBench Pro (Codeforces Elo) | Berkeley Function Calling Leaderboard (accuracy) | LiveBench (score) | Terminal-Bench 2.1 (task success) | SWE-bench Pro (% resolved) | DeepSWE (% resolved) | Humanity's Last Exam (accuracy) | MMMU-Pro (accuracy) | LMArena (source-defined rating) | ARC-AGI-3 (accuracy) | ARC-AGI-2 (accuracy) | FrontierMath (accuracy) | AIME (OTIS Mock) (accuracy) | SimpleQA Verified (accuracy) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Kimi K2 Instruct | — | — | — | 59.06% | — | — | 27.67% | — | — | — | — | — | — | — | — | — |
| Kimi K2.5 | 73.8% | 87.6% | — | — | — | — | — | — | — | — | — | — | — | 27.9% | 92.2% | — |
| Kimi K2.6 | 76.7% | 90.8% | — | — | 72.17% | — | — | 23.89% | 29.9% | — | — | — | — | 38.97% | 96.1% | — |
Scores are not normalised across benchmarks. Each column uses its own metric. Compare columns independently.
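One way to compare columns independently is to compute per-benchmark differences between two models, restricted to the benchmarks both report. The sketch below is illustrative only: the `SCORES` dictionary and the `column_deltas` helper are hypothetical names, not part of this page, and the values are copied by hand from the progress matrix.

```python
# Scores copied from the progress matrix; benchmarks without a result are omitted.
# SCORES and column_deltas are hypothetical names used for illustration only.
SCORES = {
    "Kimi K2 Instruct": {
        "Berkeley Function Calling Leaderboard": 59.06,
        "SWE-bench Pro": 27.67,
    },
    "Kimi K2.5": {
        "SWE-bench Verified": 73.8,
        "GPQA Diamond": 87.6,
        "FrontierMath": 27.9,
        "AIME (OTIS Mock)": 92.2,
    },
    "Kimi K2.6": {
        "SWE-bench Verified": 76.7,
        "GPQA Diamond": 90.8,
        "LiveBench": 72.17,
        "DeepSWE": 23.89,
        "Humanity's Last Exam": 29.9,
        "FrontierMath": 38.97,
        "AIME (OTIS Mock)": 96.1,
    },
}


def column_deltas(older: str, newer: str) -> dict[str, float]:
    """Return {benchmark: newer - older} for benchmarks both models report."""
    shared = SCORES[older].keys() & SCORES[newer].keys()
    return {
        name: round(SCORES[newer][name] - SCORES[older][name], 2)
        for name in sorted(shared)
    }


deltas = column_deltas("Kimi K2.5", "Kimi K2.6")
for benchmark, delta in deltas.items():
    # Each delta is in that benchmark's own unit; do not sum or average them.
    print(f"{benchmark}: {delta:+.2f} points")
```

Each difference stays within a single column, so no cross-benchmark normalisation is implied. Model pairs with no shared benchmark, such as Kimi K2 Instruct and Kimi K2.5, yield an empty result.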