Fugu Ultra

Sakana AI · Fugu. Released Jun 15, 2026.

Fugu Ultra is a model from Sakana AI in the Fugu family, released Jun 15, 2026. evals.report tracks 7 reported Fugu Ultra benchmark scores across SWE-bench Pro, Terminal-Bench 2.1, Humanity's Last Exam, GPQA Diamond, CharXiv, SciCode, and LiveCodeBench. Each score is shown with its benchmark, metric, source status, and date, and scores are never combined into a single ranking.

Benchmark results

Benchmark	Category	Score	Metric	Status	Date
SWE-bench Pro	Coding	73.7%	% resolved	Verified	Jun 15, 2026
Terminal-Bench 2.1	Agents	82.1%	task success	Verified	Jun 15, 2026
Humanity's Last Exam	Reasoning	50.0%	accuracy	Verified	Jun 15, 2026
GPQA Diamond	Reasoning	95.5%	accuracy	Verified	Jun 15, 2026
CharXiv	Multimodal	86.6%	accuracy	Verified	Jun 15, 2026
SciCode	Coding	58.7%	accuracy	Verified	Jun 15, 2026
LiveCodeBench	Coding	93.2%	Pass@1	Verified	Jun 15, 2026

In the wild

Real-world feedback on Fugu Ultra from people using it on actual prompts, praise and criticism alike, each linked to its source. This feedback is qualitative and never scored.

am.will

X · @LLMJunky · Jun 22, 2026

Negative

The game was pretty bad and notably worse than GPT 5.5. … GPT 5.5 by contrast did a pretty good job and required no follow ups.

Context: Asked it to build a Three.js replica of Rocket League via Codex.