AgentDojo

Name: AgentDojo
Creator: evals.report
License: https://creativecommons.org/licenses/by/4.0/

A dynamic environment from ETH Zurich and Invariant Labs that evaluates the security and utility of tool-using LLM agents against prompt injection attacks. It measures task utility under attack and the attacker's targeted success rate across realistic banking, Slack, travel, and workspace tasks.

Category: Agents
Metric: utility under attack (higher is better)


Model	Lab	Score	Source model	Status	Date
Claude 3.7 Sonnet	Anthropic	77.3%	—	Verified	Feb 24, 2025
Claude 3.5 Sonnet	Anthropic	72.5%	—	Verified	Jun 20, 2024
GPT-4o	OpenAI	50.1%	—	Verified	May 13, 2024
Gemini 1.5 Pro	Google DeepMind	47.1%	—	Verified	Feb 15, 2024
Gemini 2.0 Flash	Google DeepMind	39.8%	—	Verified	Dec 11, 2024

Each row reports the model’s utility under attack on AgentDojo.
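
To make the two metrics concrete, here is a minimal sketch of how they could be computed from per-episode results. It assumes each metric is a plain fraction over attacked episodes: utility under attack as the share of episodes where the agent still solved the user's task, and targeted attack success rate as the share where the attacker's goal was carried out. The `AttackedRun` record and both function names are hypothetical and are not part of AgentDojo's own tooling, which may aggregate results differently.

```python
from dataclasses import dataclass


@dataclass
class AttackedRun:
    """One attacked episode. Hypothetical record type, for illustration only."""
    user_task_solved: bool        # agent completed the legitimate user task
    injection_goal_reached: bool  # agent carried out the attacker's goal


def utility_under_attack(runs):
    """Fraction of attacked episodes in which the user task was still solved."""
    if not runs:
        raise ValueError("need at least one run")
    return sum(r.user_task_solved for r in runs) / len(runs)


def targeted_attack_success_rate(runs):
    """Fraction of attacked episodes in which the attacker's goal was reached."""
    if not runs:
        raise ValueError("need at least one run")
    return sum(r.injection_goal_reached for r in runs) / len(runs)


# Made-up episodes, not real benchmark data.
runs = [
    AttackedRun(user_task_solved=True, injection_goal_reached=False),
    AttackedRun(user_task_solved=True, injection_goal_reached=True),
    AttackedRun(user_task_solved=False, injection_goal_reached=True),
    AttackedRun(user_task_solved=True, injection_goal_reached=False),
]

print(f"utility under attack: {utility_under_attack(runs):.1%}")          # 75.0%
print(f"targeted attack success: {targeted_attack_success_rate(runs):.1%}")  # 50.0%
```

The two numbers are tracked separately because an episode can count toward both: in the second made-up run the agent solves the user's task and also executes the injected goal.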