Claude Opus 4 cracks 4-year-old bug that baffled ex-FAANG engineer for 200+ hours

By Swaleha | Published on May 27, 2025

A senior C++ developer and ex-FAANG engineer spent 200+ hours trying to fix a rare bug introduced by a massive refactor. Claude Opus 4 solved it in a few hours and around 30 prompts, after other AI models he tried made no real progress. The case points to a new leap for AI coding assistants.

New Delhi: A former FAANG Staff Engineer and veteran C++ developer with over 30 years of experience has shared how Anthropic’s new Claude Opus 4 finally solved a persistent bug that had haunted his codebase for four years. Despite his decades of expertise and more than 200 hours spent investigating the issue, it took the AI model just a few hours to crack the case.

The engineer, who goes by the Reddit username ShelZuuz, described the bug as his “white whale”. It emerged after a major refactor affecting around 60,000 lines of code. The change had fixed several issues, but it quietly broke a specific shader edge case. The bug wasn’t obvious and didn’t cause crashes; it only produced subtle, odd behavior under a particular use pattern. He tried to find it many times over the years, but the root cause stayed buried.

What other AI models couldn’t do

According to ShelZuuz, he had already tried GPT-4.1, Gemini 2.5, and Claude 3.7, but none of them made any real progress on the problem. The breakthrough came only when he loaded the old and new codebases into Claude Code running Opus 4 and began prompting it step by step.

Claude Opus 4 eventually found that the bug did not come from a single broken line of logic, but from how the architecture had changed. The old behavior had worked more or less by accident, and the refactored system no longer accounted for that edge case.

A new kind of coding assistant

In ShelZuuz’s case, the fix took around 30 prompts and one model restart. That was all Opus 4 needed to succeed where the other models had failed. For a developer with three decades of experience, it was a humbling but eye-opening moment, and a sign that AI may finally be crossing from helper to problem solver.

Anthropic launched Claude Opus 4 and Claude Sonnet 4 last week as part of its push into agent-like AI assistants for software work. Anthropic describes Claude Opus 4 as its most capable coding model, and it is already being credited with outperforming rivals in code understanding and task planning.

On the SWE-bench benchmark, Opus 4 scored 72.5 percent. According to Anthropic, it can work on a coding task continuously for several hours. Companies such as Rakuten and Cognition have already tested it on long-running refactor and CI debugging jobs.