evals.report

The official BFCL README documents installation, generation, evaluation, and score output.

Benchmark: Berkeley Function Calling Leaderboard
Dataset: Not provided
Metric: accuracy

1. Expected output

Consult the official source links for the current output format, submission steps, and benchmark-specific result files.

2. Submit results

Keep source URL, source model name, benchmark version, harness, and run context attached to any reported score.
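One way to keep that provenance attached is to store each score in a record that cannot be created without it. The sketch below is illustrative only: the `ScoreRecord` class and its field names are hypothetical, not a BFCL or evals.report schema, and every value shown is a placeholder rather than a real result.

```python
from dataclasses import dataclass, asdict, fields


@dataclass(frozen=True)
class ScoreRecord:
    """Hypothetical record; field names are illustrative, not an official schema."""
    benchmark: str
    metric: str
    value: float
    source_url: str
    source_model_name: str
    benchmark_version: str
    harness: str
    run_context: str

    def __post_init__(self):
        # Reject blank text fields so a score never travels without its context.
        for f in fields(self):
            v = getattr(self, f.name)
            if isinstance(v, str) and not v.strip():
                raise ValueError(f"missing provenance field: {f.name}")


record = ScoreRecord(
    benchmark="Berkeley Function Calling Leaderboard",
    metric="accuracy",
    value=0.0,  # placeholder, not a real result
    source_url="https://example.com/bfcl-source",  # placeholder URL
    source_model_name="example-model",  # placeholder
    benchmark_version="example-version",  # placeholder
    harness="example-harness",  # placeholder
    run_context="example run notes",  # placeholder
)
print(asdict(record))
```

Making the record frozen means the score and its provenance cannot drift apart after creation; a corrected score would be a new record.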

Gotchas

BFCL includes source-provided within-benchmark aggregates; label them as BFCL metrics, never as evals.report composites.
Do not mix this benchmark's metric with unrelated benchmark metrics.
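The second gotcha can be enforced mechanically before scores are placed side by side. This is a minimal sketch under assumptions: the helper name `require_single_benchmark_metric` and the tuple layout are hypothetical, and the numbers are placeholders, not reported results.

```python
def require_single_benchmark_metric(rows):
    """Return the shared (benchmark, metric) pair, or raise if rows mix several.

    rows: iterable of (benchmark, metric, value) tuples (hypothetical layout).
    """
    keys = {(benchmark, metric) for benchmark, metric, _ in rows}
    if len(keys) != 1:
        raise ValueError(
            f"refusing to combine rows from different benchmark metrics: {sorted(keys)}"
        )
    return next(iter(keys))


# Placeholder rows that share one benchmark and one metric.
same = [
    ("BFCL", "accuracy", 0.0),
    ("BFCL", "accuracy", 0.0),
]
benchmark, metric = require_single_benchmark_metric(same)
print(f"{benchmark} {metric}")  # label the column with the benchmark's own name
```

The check only confirms that rows share a benchmark and metric; any aggregate shown should still be the one the source publishes, labeled as a BFCL metric.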