The same UK security research body ran the same CTF against GPT5.5. GPT5.5 got the same result as Mythos.
Anthropic promised us that Mythos was such an existential threat that it would compromise "every OS and browser on devices across the planet". They've held conferences and meetings with banks and govts across the world, shouting how critical this issue is.
GPT5.5 has been out for a month. Every device on earth has not been breached yet. It's very fair to criticize Anthropic's maximalist posturing when it's becoming exceedingly clear their models are fairly behind OpenAI's in capability.
In my opinion, the original commenter's statement stands, and the UK govt data point only helps support that due to the equal result between Mythos and GPT.
I'd advise reading into the specifics of what happened with Firefox; the TL;DR is that Opus 4.6 (yes, Opus) scanned a reduced-safety version of its code and found a multitude of bugs and 4 high-severity vulns, none of which escaped the sandbox. The Mythos system card test describes running Mythos against the same issues Opus found to see if it could reliably replicate them and chain them together into an attack.
Autonomous loitering munitions with 'AI' (image classification CNNs) are already in service and have been used - most demonstrably by the IDF.
Even during the Nagorno-Karabakh war, Azeri loitering munitions were able to suppress Armenian air defenses by hitting them when they rolled out of concealment. I believe that killchain requires a level of autonomous functionality.
Azerbaijan was buying a lot of weapons from Israel prior to the Nagorno-Karabakh war, so it is very likely that you are talking about the same weapon system in both cases.
However, Russians and Ukrainians are using AI recognition in recon drones, but not yet in FPV drones. There is strong suspicion that long-range one-way attack drones use AI during terminal guidance, but I have not seen it confirmed by either side.