evals.report

Codex CLI + GPT-5.5

Agent systems · Agent.

1 result

Benchmark results (1)

Benchmark          | Category | Score | Metric       | Status   | Date
Terminal-Bench 2.1 | Agents   | 83.4% | task success | Verified | May 1, 2026