AgentDojo

Name: AgentDojo
Creator: evals.report
License: https://creativecommons.org/licenses/by/4.0/

A dynamic environment from ETH Zurich and Invariant Labs that evaluates the security and utility of tool-using LLM agents under prompt injection attacks. It measures task utility under attack and the attacker's targeted success rate across realistic banking, Slack, travel, and workspace tasks.
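
To make the two reported quantities concrete, here is a minimal sketch of how they could be aggregated from per-run outcomes. It assumes a simplified reading of the metrics: utility under attack is the fraction of attacked runs in which the user's task was still solved, and targeted attack success rate is the fraction in which the attacker's goal was achieved. The `AttackedRun` record, its field names, and the sample data are hypothetical and are not part of AgentDojo's own API or results.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class AttackedRun:
    """One agent run with an injection present (hypothetical record, not AgentDojo's API)."""
    suite: str               # e.g. "banking", "slack", "travel", "workspace"
    user_task_solved: bool   # did the agent still complete the user's task?
    attacker_goal_met: bool  # did the injected instruction achieve its goal?


def utility_under_attack(runs: Sequence[AttackedRun]) -> float:
    """Fraction of attacked runs where the user task was solved (assumed definition)."""
    if not runs:
        raise ValueError("need at least one run")
    return sum(r.user_task_solved for r in runs) / len(runs)


def targeted_attack_success_rate(runs: Sequence[AttackedRun]) -> float:
    """Fraction of attacked runs where the attacker's goal was met (assumed definition)."""
    if not runs:
        raise ValueError("need at least one run")
    return sum(r.attacker_goal_met for r in runs) / len(runs)


# Made-up illustrative outcomes, one per task suite.
runs = [
    AttackedRun("banking", user_task_solved=True, attacker_goal_met=False),
    AttackedRun("slack", user_task_solved=True, attacker_goal_met=True),
    AttackedRun("travel", user_task_solved=False, attacker_goal_met=True),
    AttackedRun("workspace", user_task_solved=True, attacker_goal_met=False),
]

print(f"utility under attack: {utility_under_attack(runs):.2f}")
print(f"targeted attack success rate: {targeted_attack_success_rate(runs):.2f}")
```

Under this reading, a higher utility under attack is better for the agent, while a lower targeted attack success rate is better. The two are tracked separately because a single run can both solve the user task and carry out the injected instruction.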

Category: Agents
Metric: utility under attack (higher is better)

