Ready now · Raw JSON · Structured data · Run guide ready · Machine-readable
SWE-bench Verified
Canonical software-engineering agent benchmark already in product scope.
- Category: Coding
- Owner: SWE-bench
- Data path: Official leaderboard rows and per-instance metadata can be shown with scaffold and tool context preserved.