Google Security Team Warns of Malicious Payloads Hijacking AI Agents: From Theoretical Threat to Real-World Attacks
Google's security team scanned billions of web pages and confirmed that 'AI agent traps'—designed to manipulate autonomous AI agents into executing unauthorized transfers or deleting corporate data—are being actively deployed.
A large-scale security audit by Google's security team of billions of web pages has revealed that malicious payloads targeting autonomous AI agents have moved beyond the theoretical stage and are being actively deployed in real-world environments. As of April 28, 2026, these attack scenarios are no longer just laboratory tests but confirmed real-world threats embedded in static websites and public repositories.
Notably, the investigation uncovered payloads designed to trick AI agents into initiating unauthorized PayPal transfers of $5,000 or deleting core corporate files, causing significant alarm. This is a dangerous signal suggesting that the web environment itself has transformed into a massive minefield for autonomous AI assistants.
While crawling and analyzing approximately 2 to 3 billion web pages per month, Google's security team discovered numerous instances of 'Indirect Prompt Injection (IPI)' aimed at controlling AI agents. Researchers focused on static sites and public code repositories, confirming that attackers have planted sophisticated code that exploits how AI agents operate to force specific actions.
As autonomous AI agents navigate the web, the information environment itself is becoming a new challenge. This creates vulnerabilities we call 'AI agent traps'—adversarial content designed to manipulate, deceive, or exploit visiting agents.
'AI agent traps' refer to adversarial content designed to manipulate an agent's long-term memory and knowledge base. In particular, 'RAG (Retrieval-Augmented Generation) knowledge poisoning' techniques allow the manipulation of just a few documents within a search corpus to make an AI agent accept false information intended by the attacker as verified fact. These cognitive state traps are lethal in that they distort the agent's decision-making process itself.
The Reality of High-Risk Payloads Deployed in the Field
According to technical analysis by Forcepoint, attackers included PayPal.me links along with a specific amount of $5,000 and step-by-step UX instructions given to the agent to bypass user confirmation procedures. The moment an agent reads this page, there is a risk that commands set by the attacker, such as clicking a 'Send' button, will be automatically executed, which is classified as the highest severity in the category of financial fraud.
- IDE-integrated assistants such as GitHub Copilot and Cursor
- AI-powered terminal environments performing web research functions
- DevOps pipelines performing automated code reviews
- CI/CD review tools that ingest external data in real-time
The Vercel security incident in April 2026 demonstrates the practical impact of these threats on corporate environments. Through a vulnerability in Context.ai, a third-party AI tool, employees' Vercel and Google Workspace accounts were compromised, leading to a chain of account hijackings that exploited trust relationships within the AI supply chain. This case proves that AI agents connected to enterprise systems can serve as a gateway for attackers.
The Q1 2026 OWASP GenAI Exploit Summary Report points to a fundamental shift in the security landscape. Analysis shows that attackers are now moving beyond simply manipulating model outputs to directly targeting the agent's identity management and orchestration layers. This means that AI security must evolve beyond simple filtering toward protecting the integrity of the entire system.
Transition to Intelligent Defense Systems
In the future, 'chained vulnerabilities' targeting environments where multiple AI systems are connected and multi-agent exploits are expected to become more sophisticated. Munich Re's 2026 Cyber Risk Report also identified prompt injection as a major threat and urged preparation for the advancement of attack techniques. Attackers are now developing ways to bypass security boundaries by exploiting interactions between agents.
Google identified the connection of AI agents to sensitive internal systems without real-time monitoring systems as the biggest security gap for companies. As these targeted attacks are expected to continue throughout 2026, it is urgent to build technical defenses that monitor agent activity in real-time and immediately block abnormal behavior.


This content is for information and commentary only and is not investment advice.
Join the reader conversation
Read reactions to this article and leave your own note.