evals.report

LiveCodeBench

A holistic, contamination-free benchmark that continuously collects new competitive-programming problems from LeetCode, AtCoder, and Codeforces (released after model training cutoffs) and measures code-generation correctness via Pass@1.

Coding | Pass@1 | Higher is better
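As a rough illustration of the scoring described above, the sketch below filters problems to those released after a model's training cutoff and averages per-problem Pass@1. It uses the commonly cited unbiased pass@k estimator, which reduces to c/n when k = 1. The record layout (`released`, `n`, `c`), the function names, and the sample data are hypothetical and are not taken from LiveCodeBench's own harness.

```python
from datetime import date
from math import comb


def pass_at_k(n: int, c: int, k: int = 1) -> float:
    """Unbiased pass@k estimate for one problem: n samples drawn, c of them correct."""
    if n - c < k:
        # Fewer than k incorrect samples: every size-k draw contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_1(results: list[dict], cutoff: date) -> float:
    """Mean Pass@1 over problems released strictly after the training cutoff."""
    eligible = [r for r in results if r["released"] > cutoff]
    if not eligible:
        raise ValueError("no problems released after the cutoff")
    return sum(pass_at_k(r["n"], r["c"], 1) for r in eligible) / len(eligible)


# Hypothetical per-problem results (assumed layout, illustrative numbers only).
sample_results = [
    {"released": date(2024, 1, 1), "n": 10, "c": 3},   # before cutoff, excluded
    {"released": date(2024, 6, 1), "n": 10, "c": 5},
    {"released": date(2024, 9, 1), "n": 10, "c": 10},
]

score = benchmark_pass_at_1(sample_results, cutoff=date(2024, 3, 1))
print(f"Pass@1 on post-cutoff problems: {score:.2f}")
```

Filtering by release date before averaging is what keeps the score restricted to problems the model could not have seen during training, under the assumption that the stated cutoff is accurate.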

No run guide for this benchmark yet.