AgentDojo
A dynamic environment from ETH Zurich and Invariant Labs for evaluating the security and utility of tool-using LLM agents under prompt injection attacks. It measures task utility under attack and the attacker's targeted success rate across realistic banking, Slack, travel, and workspace tasks.
Category: Agents. Metric: utility under attack (higher is better).
| Model | Lab | Score (utility under attack) | Source model | Status | Date |
|---|---|---|---|---|---|
| Claude 3.7 Sonnet | Anthropic | 77.3% | — | Verified | Feb 24, 2025 |
| Claude 3.5 Sonnet | Anthropic | 72.5% | — | Verified | Jun 20, 2024 |
| GPT-4o | OpenAI | 50.1% | — | Verified | May 13, 2024 |
| Gemini 1.5 Pro | Google DeepMind | 47.1% | — | Verified | Feb 15, 2024 |
| Gemini 2.0 Flash | Google DeepMind | 39.8% | — | Verified | Dec 11, 2024 |
Each row reports the model’s utility under attack on AgentDojo.
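As a rough illustration of how the two metrics named in the description could be aggregated, here is a minimal Python sketch. It assumes each attacked episode yields two booleans: whether the user's task was still solved, and whether the injected attacker goal was achieved. The `EpisodeResult` type and both function names are hypothetical and are not taken from the AgentDojo codebase; the sample data is made up for the example and does not correspond to any row in the table.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class EpisodeResult:
    """Outcome of one attacked episode (hypothetical structure, for illustration)."""
    user_task_solved: bool          # agent still completed the legitimate user task
    injection_goal_achieved: bool   # agent carried out the attacker's injected goal


def utility_under_attack(results: Sequence[EpisodeResult]) -> float:
    """Fraction of attacked episodes in which the user task was still solved."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(r.user_task_solved for r in results) / len(results)


def targeted_attack_success_rate(results: Sequence[EpisodeResult]) -> float:
    """Fraction of attacked episodes in which the attacker's goal was achieved."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(r.injection_goal_achieved for r in results) / len(results)


# Made-up sample: four attacked episodes.
sample = [
    EpisodeResult(user_task_solved=True, injection_goal_achieved=False),
    EpisodeResult(user_task_solved=True, injection_goal_achieved=True),
    EpisodeResult(user_task_solved=False, injection_goal_achieved=False),
    EpisodeResult(user_task_solved=True, injection_goal_achieved=False),
]

print(f"utility under attack: {utility_under_attack(sample):.1%}")
print(f"targeted attack success rate: {targeted_attack_success_rate(sample):.1%}")
```

Under this reading, a higher utility under attack is better for the agent, while a lower targeted attack success rate is better.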