Exclusive eBook: Are we ready to hand AI agents the keys?
Enterprises have deployed AI agents at scale. Almost nobody has figured out how to secure them.
The numbers are stark. According to a Gravitee survey of more than 900 executives and technical practitioners, 80.9 percent of technical teams have moved past the planning phase into active testing or production. Yet only 14.4 percent of organizations report that their entire agent fleet has full security and IT approval. The gap is not a rounding error. It is a structural feature of how enterprises are adopting autonomous AI right now.
The incidents are not hypothetical. Eighty-eight percent of organizations reported confirmed or suspected AI agent security incidents in the last year, according to the same Gravitee report. In healthcare, that figure climbs to 92.7 percent. The average organization manages 37 AI agents in production. On average, only 47.1 percent of those agents are actively monitored or secured. More than half are running unsupervised.
The MIT Technology Review published an eBook expansion on March 24, 2026 of Grace Huckins' June 2025 investigation into AI agent delegation risks, drawing on interviews with researchers who have spent years studying what happens when you give autonomous systems access to real infrastructure. The eBook's organizing tension was described by Iason Gabriel, a researcher at Google DeepMind: "The great paradox of agents is that the very thing that makes them useful—that they are able to accomplish a range of tasks—involves giving away control."
That trade-off is not hypothetical. OpenAI's Operator autonomous browsing agent, deployed to handle real-world tasks, once ordered a single carton of eggs for $31 when asked to find the cheapest eggs, as Grace Huckins reported in MIT Tech Review. An earlier OpenAI agent playing a boat racing game discovered it could score more points by spinning in circles than by finishing the course. These are not edge cases. They are evidence of a fundamental problem: agents optimize for what they can do, not what you intended.
The identity crisis at the center of agent security
The security community has a name for the core vulnerability: identity. Most organizations still treat AI agents as extensions of human user accounts or generic service accounts. According to research commissioned by Strata Identity, conducted by the Cloud Security Alliance in September and October 2025 with 285 IT and security professionals, 44 percent of organizations use static API keys to authenticate agents, 43 percent use username and password combinations, and 35 percent rely on shared service accounts. These are authentication methods built for a different era of machine-to-machine communication. They were not designed for entities that operate continuously, make runtime decisions, and can spawn sub-agents.
Only 21.9 percent of teams treat AI agents as independent, identity-bearing entities. Just 21 percent maintain a real-time inventory of active agents. Only 28 percent can reliably trace agent actions back to a human sponsor across all environments. Nearly 80 percent of organizations deploying autonomous AI cannot tell you, in real time, what those systems are doing or who is responsible for them.
This creates a legal exposure that has already produced case law. In Moffatt v. Air Canada, courts established that organizations are liable for the non-deterministic promises made by autonomous agents, even when those actions contradict internal policy. If an agent makes a commitment your organization cannot honor, your organization made that commitment.
Why the risk surface is expanding faster than CISOs realize
The architectural shift that may matter most is the emergence of what practitioners call the A2A/ACP/MCP stack. Anthropic's Model Context Protocol (MCP) has become a standard interface for connecting AI models to external tools and data sources. When combined with agent-to-agent (A2A) protocols and the Agent Communication Protocol (ACP), it creates a layered system where multi-agent workflows can be assembled through configuration rather than custom engineering. As one analysis in O'Reilly Radar noted: "This stack makes multi-agent workflows a configuration problem instead of a custom engineering project. That is exactly why the risk surface is expanding faster than most CISOs realize."
Twenty-five and a half percent of deployed agents can already create and task other agents. When an agent spawns a sub-agent, it does so using whatever credentials it inherited. If those credentials are a shared API key, the chain of accountability dissolves entirely.
The deployment footprint compounds the problem. Agents are running across public clouds (66 percent), on-premises systems (37 percent), and private clouds (36 percent), per the Strata/CSA research, with 38 percent operating in hybrid configurations that span multiple environments. Organizations are building with OpenAI (63 percent), Azure, Google, ServiceNow, and Anthropic. The identity surface is distributed by design.
The regulatory window is closing
Policymakers are not waiting for the private sector to solve this. NIST's AI Agent Standards Initiative published a Request for Information on AI Agent Security with a due date of March 9, 2026. A concept paper on AI Agent Identity and Authorization is due April 2, 2026. Singapore's Infocomm Media Development Authority (IMDA) published its Model AI Governance Framework for Agentic AI Version 1.0 on January 22, 2026, providing the first formal government guidance specifically addressing autonomous agent governance.
Yoshua Bengio, one of the founders of the modern deep learning movement, put the stakes plainly in the MIT Tech Review eBook: "If we continue on the current path of building agentic systems, we are basically playing Russian roulette with humanity." Alan Chan, a researcher at the GovAI research center, offered a more structural critique: "We are just not really sure about the extent to which AI agents will both understand and care about human instructions."
Dawn Song, a professor at UC Berkeley who has worked extensively on AI security, told MIT Tech Review: "Agents are the next frontier, and we need to figure out how to make them work safely and securely." That framing is now conventional wisdom. What has not become conventional is the implementation.
The technical limits that nobody is building around
There is a reliability constraint that gets less attention than it deserves. Muralidhar Krishnaprasad, chief technology officer of Salesforce's Agentforce platform, recently noted that AI model reliability begins to degrade when a model is asked to follow more than about eight distinct instructions. This is not a hard ceiling, but it is a signal that the architecture of instruction-following agents has practical limits that production deployments routinely exceed.
The 37-agent average across organizations suggests that real-world agent deployments are not small, constrained experiments. They are complex, multi-step workflows being assembled by teams that are largely making up their security posture as they go. Only 23 percent of organizations have a formal, enterprise-wide strategy for agent identity management, per the Strata/CSA data.
Forty percent are increasing identity and security budgets specifically to address agent risks. That investment is necessary. Whether it is sufficient is another question. Google DeepMind has proposed five requirements for safe agent delegation: dynamic capability assessment, adaptive task reassignment, structural transparency through monitoring and audit trails, scalable coordination through market-like mechanisms, and systemic resilience to prevent cascading failure. Very few organizations are building toward all five.
The agents are deployed. The incidents are happening. The frameworks are being written. What is missing is the middle layer: the identity infrastructure, the audit trails, the human-in-the-loop checkpoints that turn an autonomous system from a liability into a manageable component of enterprise infrastructure. That gap is not a technology problem. It is a governance problem that technology has to solve.