AILuminate AI Safety Benchmark
MLCommons' standardized AI safety benchmark. It grades how often general-purpose chat models produce policy-violating responses across 12 hazard categories (e.g. violent crimes, CSAM, hate, self-harm, specialized advice) and assigns an ordinal safety grade, from Poor to Excellent, relative to a sub-15B-parameter open-weight reference system.
Category: Other. Metric: Safety grade (higher is better).
| Model | Lab | Safety grade | Source model | Status | Date |
|---|---|---|---|---|---|
| Claude 3.5 Sonnet | Anthropic | Very Good | — | Verified | Jun 20, 2024 |
| GPT-4o | OpenAI | Good | — | Verified | May 13, 2024 |
| Gemini 1.5 Pro | Google DeepMind | Good | — | Verified | Feb 15, 2024 |
| Gemini 2.0 Flash | Google DeepMind | Good | — | Verified | Dec 11, 2024 |
| Llama 3.1 405B | Meta | Good | — | Verified | Jul 23, 2024 |
| Mistral Large | Mistral AI | Good | — | Verified | Feb 26, 2024 |
Each row reports the model’s safety grade on the AILuminate AI Safety Benchmark.
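Because the grade is an ordinal label rather than a number, comparing or sorting rows needs an explicit ordering of the labels. The following is a minimal Python sketch of one way to do that. It assumes a five-tier scale (Poor, Fair, Good, Very Good, Excellent); the description above names only the two endpoints, so the intermediate labels are an assumption here. The names `SafetyGrade` and `parse_grade` are hypothetical and introduced only for illustration.

```python
from enum import IntEnum


class SafetyGrade(IntEnum):
    """Ordinal safety grades, lowest to highest (assumed five-tier scale)."""
    POOR = 1
    FAIR = 2
    GOOD = 3
    VERY_GOOD = 4
    EXCELLENT = 5


def parse_grade(label: str) -> SafetyGrade:
    """Map a table label such as 'Very Good' to its ordinal grade."""
    return SafetyGrade[label.strip().upper().replace(" ", "_")]


# (model, grade label) pairs copied from the table above.
rows = [
    ("Claude 3.5 Sonnet", "Very Good"),
    ("GPT-4o", "Good"),
    ("Gemini 1.5 Pro", "Good"),
    ("Gemini 2.0 Flash", "Good"),
    ("Llama 3.1 405B", "Meta's row grade placeholder"),
    ("Mistral Large", "Good"),
]
# Replace the placeholder with the label shown in the table.
rows[4] = ("Llama 3.1 405B", "Good")

# Higher is better, so sort in descending order of the ordinal value.
ranked = sorted(rows, key=lambda row: parse_grade(row[1]), reverse=True)

for model, label in ranked:
    print(f"{model}: {label}")
```

Python's `sorted` is stable, so models that share a grade keep the order in which they were listed. A label outside the assumed scale raises `KeyError`, which makes a mismatch with the real scale visible rather than silently mis-sorting.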