When Anthropic introduced its new Mythos model in April, it also issued a stern warning to anyone developing software. The model was so powerful at detecting software vulnerabilities, the lab claimed, that it had discovered thousands of high-severity bugs that should have been fixed before they were made public.
Now, security researchers for Mozilla’s Firefox browser are providing a closer look at what this process looked like in practice and what the Mythos powers mean for software security in general.
In a post published on Thursday, Mozilla said that Mythos has discovered numerous high-severity bugs, including some that have been dormant in the code for more than a decade.
This is a significant improvement over what AI security tools were capable of even six months ago. Until now, AI debugging tools have had serious drawbacks, often inundating security teams with low-quality reports and false positives. But Mozilla researchers say the latest generation of tools has turned the corner, particularly now that agent systems can evaluate their work and filter out bad results.
“It is hard to overstate how much this dynamic has changed for us in just a few months,” the researchers wrote. “First, the models have become much more capable. Second, we dramatically improved our techniques for utilizing these models.”
The results are impressive: In April 2026, Firefox shipped 423 bug fixes, compared to just 31 exactly a year earlier. The researchers also released details of 12 bugs, ranging from a pair of unusual sandbox vulnerabilities to a 15-year-old bug in the way the browser parses an HTML element.
“These things are actually suddenly really good,” Brian Grinstead, a distinguished engineer at Mozilla, told TechCrunch. “We see it in our own internal scanning, we see it in our external bug reports, and we see it in all kinds of signals across the industry.”
The fact that the system helped uncover vulnerabilities in Firefox’s “sandbox” system is particularly impressive given how sophisticated an attack exploiting them must be. To find sandbox vulnerabilities, the model has to write a hacked patch for the browser and then attack the most secure part of the software with the newly applied code. Finding and demonstrating the bug is a delicate, multi-step process that requires creativity and great care.
To put this in context, Mozilla’s bug bounty program pays researchers who can find a bug in the Firefox sandbox up to $20,000 — the highest reward available. Despite the top-dollar reward, however, Grinstead says Mythos finds more sandbox issues than human researchers ever did. “We’re getting them,” he told TechCrunch, “but not in the volume we can find with this technique.”
Notably, the Firefox team still doesn’t use AI to fix bugs, despite well-documented progress in AI coding tools. The team asks the AI to code patches for each bug, but the resulting code usually can’t be used as-is and instead serves as a model for a human engineer.
“For the bugs we’re talking about in this post, each one is an engineer writing a patch and an engineer reviewing it,” Grinstead says. “We haven’t found it to be automatable.”
It is not yet clear how the emerging capabilities of artificial intelligence will change the broader balance of power in cyber security. A month after the Mythos preview, most of the bugs discovered have probably not been patched, making it difficult to capture the full scope of their impact. Anthropic has been meticulous about adhering to its rules of responsible disclosure, but it’s possible that bad actors are using similar techniques behind the scenes, even if the models they’re using aren’t as good.
Speaking at a recent event, Anthropic CEO Dario Amodei was optimistic that the new tools would ultimately favor defenders. “If we handle it right, we could be in a better position than when we started because we fixed all these bugs. There are only so many bugs to find,” Amodei said. “So I think there’s a better world on the other side of this.”
Having dealt with the nitty-gritty details, Grinstead has a more measured view: “It’s useful for both attackers and defenders, but the availability of the tool shifts the advantage a little bit to the defense. Realistically, nobody knows the answer to that yet.”
