OpenAI GPT-5.5 Matches Claude Mythos in Cyberattack Capabilities: AI Safety Institute Report
OpenAI's GPT-5.5 has become the second AI system, after Anthropic's Claude Mythos, to complete a corporate network penetration simulation, according to the latest evaluation by the UK AI Safety Institute (AISI).
According to an evaluation report released by the UK AI Safety Institute (AISI) on May 1, 2026, OpenAI's GPT-5.5 is the second AI system on record to complete a corporate network penetration simulation from start to finish. The result puts GPT-5.5's autonomous reasoning capabilities on par with those of Anthropic's Claude Mythos Preview and suggests that the era of frontier models capable of conducting complex, multi-stage cyberattacks has begun in earnest.
The achievement is a significant indicator that AI models have evolved beyond simple information providers into active attack agents. AISI confirmed that GPT-5.5 can autonomously bypass existing security defenses and gain privileges within a network. This heralds a fundamental change in the cybersecurity landscape and calls for a complete overhaul of defense strategies.
AISI's evaluation methods have advanced from simple conversation-based exploration in 2023 to multi-stage simulations in 2026. According to the report, GPT-5.5 demonstrated the ability to discover and weaponize synthetic vulnerabilities planted in real open-source software. It also completed technically demanding penetration steps, such as decrypting obfuscated malware and performing precise operations within authorized code paths.
Just two years ago, top-tier models barely passed rudimentary challenges, but they have now reached a level where they can threaten entire corporate networks.
In the 'TLO' challenge benchmark, GPT-5.5 succeeded in 2 of 10 attempts. That is slightly below Claude Mythos Preview, which succeeded in 3 of 10, but it suggests that the gap between the two models has effectively closed. These results support the view that frontier models have sufficient capabilities to function as independent attack agents.
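As a rough check on how much a 2-of-10 versus 3-of-10 result can say, the sketch below computes an approximate 95% Wilson score interval for each success rate. It assumes each attempt is an independent trial with a fixed success probability, which the AISI report as described here does not state; the function names `wilson_interval` and `intervals_overlap` are illustrative and not taken from the report.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a success rate.

    Assumption (not from the report): attempts are independent
    trials with the same underlying success probability.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half, center + half


def intervals_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Success counts quoted in the article for the 'TLO' challenge.
gpt_low, gpt_high = wilson_interval(2, 10)
mythos_low, mythos_high = wilson_interval(3, 10)

print(f"GPT-5.5:               {gpt_low:.1%} to {gpt_high:.1%}")
print(f"Claude Mythos Preview: {mythos_low:.1%} to {mythos_high:.1%}")
print("Intervals overlap:", intervals_overlap((gpt_low, gpt_high), (mythos_low, mythos_high)))
```

Under these assumptions the intervals run from roughly 6% to 51% for 2 of 10 and from roughly 11% to 60% for 3 of 10. They overlap heavily, so ten attempts per model cannot separate the two success rates, which is consistent with reading the results as comparable rather than as a clear lead for either model.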
Correlation Between Reasoning Compute and Attack Success Rate
AISI found that the more tokens a model uses to 'think' (that is, the higher the reasoning compute), the higher the probability of a successful hack. This suggests that cyberattack capabilities have not yet reached their limit and could improve further with additional computational resources. As a model works through more logical steps and reasons for longer, its likelihood of bypassing security defenses grows.
- Higher attack success rate with extended reasoning time
- Strengthening of autonomous decision-making capabilities in multi-stage simulations
- Increased speed in identifying and weaponizing vulnerabilities within open-source software
OpenAI is taking a strategic approach by developing GPT-5.4-Cyber, a security-specific model, alongside GPT-5.5, a general-purpose model. While GPT-5.4-Cyber is a variant optimized for cybersecurity tasks and undergoes rigorous verification procedures, GPT-5.5 is classified as a frontier model with broader capabilities. This dual strategy is interpreted as an attempt to balance technical innovation with the political risks associated with releasing autonomous cyber capabilities.
This technical leap by OpenAI took place amidst fierce market competition and financial pressure. In early 2026, there were reports that OpenAI had lost market share to Anthropic in the coding and enterprise sectors and failed to meet revenue targets for several months. Consequently, demonstrating GPT-5.5's powerful cybersecurity capabilities is seen as a key business strategy to prove the superiority of its models to corporate customers and regain market leadership.
The security industry is reorganizing its defense systems in response to the improvement of AI's attack capabilities. According to the 2026 Cyber Threat Defense Report, 90% of organizations worldwide have increased their security budgets, with an average increase of 5.6%, the highest on record. However, separate from technical preparedness, 80% of IT security professionals are expressing anxiety that their jobs could be at risk due to AI automation, raising new challenges in human resource management.
Moving forward, AISI plans to strengthen cooperation with global AI companies such as Google DeepMind and Meta to closely track the evolution of autonomous attack capabilities. In particular, the formation of an 'autonomous loop,' where models can fix vulnerabilities themselves or, conversely, sustain attacks without human intervention, is a major subject of observation. The 2026 International AI Safety Report emphasizes that international cooperation and transparent information sharing are essential to manage the risks brought by these technical advances.
AI has now permeated every area of security, demanding unprecedented speed and accuracy from defenders.
- Expansion and precise monitoring of autonomous attack scenarios for frontier models
- Continuous increase in security budgets and expanded participation of corporate boards in security
- Redefining the role of security personnel and strengthening education in response to AI automation

This content is for information and commentary only and is not investment advice.