evals.report

Claude Code + GLM 5.1

Agent systems · Agent.

1 result

Benchmark results (1)

Benchmark          | Category | Score | Metric       | Status   | Date
Terminal-Bench 2.1 | Agents   | 58.7% | task success | Verified | May 2, 2026