Ready now | HF dataset | Review needed | Run guide ready | Public data
Terminal-Bench 2.1
An important benchmark for command-line agents, with a task registry and results that vary with the agent adapter used.
- Category: Agents
- Owner: Harbor / Laude Institute
- Data path: Use page and HF rows with agent name, model, and task-set version kept separate.
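
One way to read the data-path note is that every result row carries agent name, model, and task-set version as three distinct fields, and scores are only aggregated within rows that match on all three. The sketch below illustrates that idea under stated assumptions: the `ResultRow` type, its field names, and the sample values are hypothetical placeholders, not the actual schema of the page or the HF dataset.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ResultRow:
    # Hypothetical schema: field names are assumptions, not the real HF columns.
    agent: str             # agent (adapter) name, kept apart from the model
    model: str             # underlying model identifier
    task_set_version: str  # e.g. "2.1"; never merged across versions
    task_id: str
    passed: bool


def pass_rates(rows):
    """Return the pass rate for each (agent, model, task_set_version) group."""
    counts = defaultdict(lambda: [0, 0])  # key -> [passed, total]
    for row in rows:
        key = (row.agent, row.model, row.task_set_version)
        counts[key][0] += int(row.passed)
        counts[key][1] += 1
    return {key: passed / total for key, (passed, total) in counts.items()}


# Made-up placeholder rows, only to show the grouping behaviour.
rows = [
    ResultRow("agent-a", "model-x", "2.1", "task-1", True),
    ResultRow("agent-a", "model-x", "2.1", "task-2", False),
    ResultRow("agent-a", "model-x", "2.0", "task-1", True),
    ResultRow("agent-b", "model-x", "2.1", "task-1", False),
]
rates = pass_rates(rows)
```

Because the grouping key is the full triple, the same model run through two different agents, or the same agent on two task-set versions, ends up in separate groups rather than in one blended score.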