Real posts from people using these models on actual prompts and tasks — praise and criticism alike, each linked to its source. Qualitative feedback, never scored or ranked.
i was stuck on a landing page redesign with gpt 5.5 and opus 4.6 since a couple of days. gave a fresh try with opus 4.8 and it one shotted what i was looking for
Task: A landing-page redesign that GPT-5.5 and Opus 4.6 hadn't cracked over several days.
I think in terms of application functionality, quality, and design, Opus won … got stuck twice in a retry loop (had to prompt to self-correct).
Task: Head-to-head vs Sakana Fugu Ultra on a single-file Three.js Crossy Road game.
It is great at writing - i'm using it to this day. It was good in one-shotting front-end. But agentic? … in my memory it was never a catch up in the most important and money making areas
The 5-hour limit has been exceeded, so I have to wait 4 hours. However, it kindly provides guidance … I like this one better because it is user-oriented, offering friendly guidance for beginners and general users.
It discovered 27 bugs that Fable 5 couldn't find and fixed all of them. The code quality is impeccable … it implemented about 70,000 lines of new features, resolved 4 issues, and created 7 PRs.
Task: Introduced Fugu into a repo previously worked on with Claude Fable 5; ~1 hour of use.
The game was pretty bad and notably worse than GPT 5.5. … GPT 5.5 by contrast did a pretty good job and required no follow ups.
Task: Asked it to build a Three.js replica of Rocket League via Codex.
In terms of model speed and performance, Fugu on Opencode won … inverted directional turn, wonky camera, no sfx, not identical to Crossy Road game.
Task: Head-to-head vs Claude Opus 4.8 on a single-file Three.js Crossy Road game.
For my research, Fable felt like a clear step change … I was excited about the GLM 5.2 hype and tried it; sadly it's nowhere close
Task: Evaluating models for research work (alongside Fable and GPT-5.5 Pro).
Opus 4.8 in the last 48hrs is amazing … it's just very sad to go from godlike performance to barely usable some days.
It's way too convenient to make Codex handle GPT5.5Pro work, and it makes my tasks infinitely more productive.
Task: Using GPT-5.5 Pro from the Codex CLI for day-to-day work.
GLM 5.2 ranks unusually high on FrontierSWE (long-horizon agentic engineering) … using it with OpenCode is also not far from the quality of Claude Code or Codex.
Task: Day-to-day agentic coding with GLM-5.2 in OpenCode.
Genuinely impressed, almost shocked, at how good GLM-5.2 … is at coding. This changes things.
Having an open weight model surpass GPT-5.4 and every Gemini model is dope. That said - it's not cheap. Both Opus 4.8 and GPT-5.5 set to "medium" are cheaper and smarter than GLM-5.2
Wow Claude Fable 5 is insane!! It just recreated the 2011 game of the year … The Elder Scrolls V: Skyrim in ONE prompt.
Task: The entire prompt was "make skyrim".
I have been testing DeepSeek-V4-Pro with the Pi coding agent. I am mindblown by how well it works out of the box.
Task: Built an LLM wiki with an agent powered entirely by DeepSeek-V4-Pro.