evals.report

KernelBench (Hard)

The hardest tier of Stanford's KernelBench: generate correct, high-performance GPU (CUDA) kernels from PyTorch reference operators. The score, fast₁, is the fraction of tasks whose generated kernel is both correct and faster than the PyTorch baseline.

Coding · fast₁ · Higher is better
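As a rough illustration of how a fast₁-style score can be computed, the sketch below counts a task only when its kernel is correct and its speedup over the baseline exceeds a threshold p (p = 1 for fast₁). The `KernelResult` record, its field names, and the sample timings are hypothetical, chosen for illustration; they are not taken from the KernelBench harness.

```python
from dataclasses import dataclass


@dataclass
class KernelResult:
    """Hypothetical per-task record; field names are assumptions."""
    correct: bool        # output matched the PyTorch reference
    baseline_ms: float   # runtime of the PyTorch baseline
    kernel_ms: float     # runtime of the generated kernel


def fast_p(results, p=1.0):
    """Fraction of tasks that are correct AND more than p times faster."""
    if not results:
        return 0.0
    hits = sum(
        1
        for r in results
        if r.correct and r.kernel_ms > 0 and r.baseline_ms / r.kernel_ms > p
    )
    return hits / len(results)


# Made-up example data, not real benchmark results.
results = [
    KernelResult(correct=True, baseline_ms=10.0, kernel_ms=5.0),   # counts
    KernelResult(correct=True, baseline_ms=10.0, kernel_ms=12.0),  # slower
    KernelResult(correct=False, baseline_ms=10.0, kernel_ms=1.0),  # incorrect
    KernelResult(correct=True, baseline_ms=10.0, kernel_ms=8.0),   # counts
]

print(fast_p(results))  # 0.5
```

Note that an incorrect kernel never counts, however fast it is, and the denominator is the full task count rather than only the correct kernels.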

No run guide for this benchmark yet.