The agent control plane is eating enterprise software
There is a category of knowledge inside every enterprise that has never been documented, has no official owner, and lives entirely in the heads of people who have been doing the job long enough to know which exception to route where. Call it institutional memory. Every company has it. None of them have it written down.
Multi-agent usage on Databricks grew 327 percent in four months, an inflection point that makes it suddenly urgent to ask what those agents are actually being handed. The answer, in most enterprise deployments, is the average case: the workflow as it was designed, not the workflow as it actually runs when something unexpected shows up.
The pattern is not a security incident or a product launch. It is a quiet architectural shift: enterprises deploying AI systems that make the same calls humans have always made — approving a refund, routing a complaint, escalating a billing dispute — without the years of accumulated judgment that preceded those calls. The agents are getting the authority. The exception logic, the unwritten rules, the tail of the distribution where the actual business value lives: that part hasn't been handed over, and in most deployments, nothing is being built to replace it.
The investment signal
General Analysis, a security firm, raised a $10 million seed round in late April with a pitch built around a March demo: its adversarial testing bot spent roughly three minutes per target convincing 50 of 55 live customer service AI agents to offer unauthorized upgrades and credits worth more than $10 million in aggregate. The five that refused — JetBlue, Cebu Pacific, GitHub Support, Quicken, and Gorgias — are notable because they had any refusal mechanism at all.
The investment thesis is not about fraud. General Analysis is selling the measurement of the gap between what agents were configured to do and what they will actually do when prompted creatively. Its existence as a funded company is a signal: somebody thinks that gap is large enough that enterprises will pay to quantify it.
The platform response
The hyperscalers (Amazon, Google, and Microsoft) announced agent registries in April 2026. AWS describes its registry as a discovery and catalog service: deprecating a resource through it removes the resource from builders' discovery results, but it does not stop any agent currently running. Stopping a running session requires a session ID supplied upfront, and AWS documentation does not describe an API for enumerating which sessions are active at any given moment. Google's agent platform covers routing, identity assignment, semantic policies, and audit logging under its Govern section; its documentation does not describe a runtime termination primitive. Microsoft positions Agent Identity, Gateway, and Registry as core platform capabilities: the infrastructure for governing thousands of agents at scale, per Bain's analysis of Google Cloud Next.
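To make that distinction concrete, here is a vendor-neutral sketch. Every class and method name below is invented for illustration, not taken from the AWS, Google, or Microsoft APIs; it only shows why a catalog operation (deprecate) is not a runtime operation (stop).

```python
# Hypothetical, vendor-neutral sketch: none of these class or method names
# come from the AWS, Google, or Microsoft APIs. It only illustrates why a
# catalog operation (deprecate) is not a runtime operation (stop).

from dataclasses import dataclass


@dataclass
class AgentEntry:
    agent_id: str
    deprecated: bool = False


class AgentRegistry:
    """Discovery catalog: it knows what could run, not what is running."""

    def __init__(self) -> None:
        self._entries: dict[str, AgentEntry] = {}

    def register(self, agent_id: str) -> None:
        self._entries[agent_id] = AgentEntry(agent_id)

    def deprecate(self, agent_id: str) -> None:
        # Hides the agent from future discovery; touches nothing at runtime.
        self._entries[agent_id].deprecated = True

    def discover(self) -> list[str]:
        return [e.agent_id for e in self._entries.values() if not e.deprecated]


class Runtime:
    """Separate plane: sessions are addressed by ID, with no enumeration API."""

    def __init__(self) -> None:
        self._sessions: dict[str, str] = {}  # session_id -> agent_id, private

    def start(self, session_id: str, agent_id: str) -> None:
        self._sessions[session_id] = agent_id

    def stop(self, session_id: str) -> None:
        # Termination works, but only for a caller who already holds the ID.
        self._sessions.pop(session_id, None)


registry, runtime = AgentRegistry(), Runtime()
registry.register("refund-agent")
runtime.start("sess-123", "refund-agent")

registry.deprecate("refund-agent")
assert "refund-agent" not in registry.discover()  # gone from the catalog
# ...but "sess-123" keeps running until someone who recorded that ID calls
# runtime.stop("sess-123"); nothing here lets governance list live sessions.
```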
All three registries are weeks old. The foundational governance blocks enterprises are being asked to build on are, by any historical measure of enterprise software maturity, embryonic.
Production deployment has not waited for governance to catch up. Guild.ai, which announced a $44 million Series A on April 29, positions itself as the independent control plane: a dedicated governance layer between AI models and enterprise infrastructure rather than inside any single cloud provider's ecosystem. Its investors include GV, NFX, Acrew, Scribble, and Khosla. Its CEO is a former vice president of engineering at Meta. Oracle published an OCI AI Governance Framework on the same theme, naming four enforcement layers — L4 Rules, L3 Gating, L2 Runtime behavior, L1 Evidence — and explicitly calling out primitives the hyperscaler registries do not yet implement: approval binding to execution, registry-gated runtime, and structured decision records.
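As a rough illustration of what two of those primitives could mean in practice, here is a minimal sketch. Every function name, field, and the HMAC signing scheme is an assumption made for the example, not drawn from Oracle's framework or any vendor's API: an approval cryptographically bound to one exact action payload, plus a structured record of each decision.

```python
# Illustrative sketch only: one way to read "approval binding to execution"
# and "structured decision records." Every name, field, and the HMAC signing
# scheme is assumed for the example, not taken from Oracle's OCI framework
# or any vendor's API.

import hashlib
import hmac
import json
import time

SIGNING_KEY = b"approval-signing-key"  # stand-in for a properly managed key


def issue_approval(action: dict, approver: str) -> dict:
    """The approver signs the exact action payload, not a general intent."""
    payload = json.dumps(action, sort_keys=True).encode()
    return {
        "approver": approver,
        "action_hash": hashlib.sha256(payload).hexdigest(),
        "signature": hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest(),
    }


def execute_with_approval(action: dict, approval: dict, ledger: list) -> str:
    """Execution is gated on the approval matching this exact action."""
    payload = json.dumps(action, sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    allowed = hmac.compare_digest(expected, approval["signature"])

    # Structured decision record: who approved what, and what actually ran.
    ledger.append({
        "ts": time.time(),
        "action": action.copy(),
        "approver": approval["approver"],
        "allowed": allowed,
    })
    if not allowed:
        return "blocked: approval does not bind to this action"
    return f"executed: {action['type']}"


ledger: list = []
refund = {"type": "refund", "customer": "C-0042", "amount": 500}
approval = issue_approval(refund, approver="ops-lead")

print(execute_with_approval(refund, approval, ledger))  # executed: refund
refund["amount"] = 50_000                               # the agent drifts
print(execute_with_approval(refund, approval, ledger))  # blocked
```

The binding matters in the last two lines: once the payload drifts from what was approved, the stale approval no longer authorizes execution, and the ledger records exactly where that happened.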
What is actually at stake
The institutional-knowledge problem is more mundane than it sounds. Every large organization has workflows shaped by context, refined by trial and error, and never written down, because the people running them always knew when to make an exception. Call center scripts are the obvious surface, but the same pattern exists in procurement approvals, contract routing, customer credit decisions, and technical support escalation paths.
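A toy contrast makes the gap visible. Every threshold and field below is invented for illustration rather than drawn from any deployed system: the first function is the documented script an agent is handed; the second is the unwritten exception logic a long-tenured rep actually applies.

```python
# Toy contrast, with thresholds and dispute fields invented for the example
# rather than drawn from any deployed system.

def scripted_agent(dispute: dict) -> str:
    """What the agent learned: the documented, average-case workflow."""
    if dispute["amount"] <= 100:
        return "auto-approve credit"
    return "deny and send policy link"


def veteran_rep(dispute: dict) -> str:
    """The unwritten exception logic a long-tenured rep actually applies."""
    if dispute["amount"] <= 100:
        return "auto-approve credit"
    if dispute["customer_tier"] == "enterprise" and dispute["churn_risk"]:
        return "escalate to account manager"  # the tail case that retains revenue
    if dispute["repeat_within_30_days"]:
        return "flag for fraud review"        # the tail case that limits liability
    return "deny and send policy link"


tail_case = {
    "amount": 800,
    "customer_tier": "enterprise",
    "churn_risk": True,
    "repeat_within_30_days": False,
}
print(scripted_agent(tail_case))  # deny and send policy link
print(veteran_rep(tail_case))     # escalate to account manager
```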
Agents trained on those workflows do not have the exception logic. They have the average case. The tail of the distribution — where agents either refuse or over-comply — is exactly where the business value lives and where most of the liability accumulates.
This is not a hypothetical. It is a structural description of what is happening as enterprises move from AI-assisted tools to AI-delegated decisions. The question is whether the governance model — the approval layers, the audit trails, the rollback paths — gets built before the architecture hardens around the workflows currently being delegated without it.
Every quarter that passes with agents running in production without documented runtime enforcement is a quarter of architectural lock-in that someone will eventually have to unwind. The independent control-plane vendors raising against exactly this gap (General Analysis, Guild, Oracle) are betting that the hyperscalers will not ship the missing enforcement primitives before the default architecture becomes expensive to replace.
The alternative framing is simpler: enterprises are outsourcing judgment they have not yet defined, to systems that cannot yet be held to account, on infrastructure that was sold before the accountability layer shipped. That is the institutional-knowledge problem. It does not look like a security incident. It looks like a workflow.
General Analysis's March adversarial demo was conducted against live customer service agents. The $10 million figure represents unauthorized perks offered, not confirmed losses. Five of the 55 tested agents refused the adversarial requests.