When an LLM writes code and an agent runs it in the same process, the execution surface becomes the attack surface, and tightening a seccomp profile no longer suffices. The Python standard library alone can exfiltrate API keys or fetch a second-stage payload. The trust boundary has to be drawn at execution, not at the model layer, which is why Microsoft is making it a hardware boundary in its Azure Container Apps Sandboxes public preview.
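To make the standard-library point concrete, here is a minimal and deliberately harmless sketch of what code running inside the agent's own process can see without installing anything. The helper names and the list of name hints are illustrative assumptions. Nothing here sends data anywhere; it only reports what is reachable.

```python
import importlib.util
import os

# Hypothetical substrings that often appear in credential variable names.
SECRET_HINTS = ("KEY", "TOKEN", "SECRET", "PASSWORD")


def visible_secret_names(environ):
    """Return the names (not values) of variables that look like credentials."""
    return sorted(
        name
        for name in environ
        if any(hint in name.upper() for hint in SECRET_HINTS)
    )


def network_modules_available():
    """List stdlib networking modules that in-process code could import."""
    candidates = ("socket", "urllib.request", "http.client")
    return [m for m in candidates if importlib.util.find_spec(m) is not None]


if __name__ == "__main__":
    # Generated code sharing the agent's process inherits the same environment.
    print("credential-like variables:", visible_secret_names(os.environ))
    print("importable network modules:", network_modules_available())
```

If both lists come back non-empty, generated code sharing that process has, in principle, both something to steal and a way to send it out. That is the gap an execution-level boundary is meant to close.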
The primitive is concrete: a new ARM resource type, Microsoft.App/SandboxGroups, hands out one hardware-isolated microVM per agent invocation. According to InfoQ's coverage of the preview, each sandbox boots from an OCI disk image in under one second, scales to thousands of concurrent instances, and bills nothing while idle. That fits the short, bursty shape of agent-generated workloads in a way a long-lived Kata cluster does not.
The pattern matters more than the product. The old default for running untrusted code was a container with a tighter seccomp profile, or, for teams that cared, a dedicated Kata-on-Kubernetes cluster. That answer still works for some teams, but it carries ongoing operational cost that compounds with the burstiness and volume of code an agent fleet produces. A managed microVM primitive treats AI-generated code as a distinct trust class, one that needs hardware isolation by default, not container hardening by convention.
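One way to read "a distinct trust class" is as an explicit policy table instead of a per-team convention. The sketch below is an illustration under stated assumptions: the class names, the isolation tiers, and the mapping for anything other than AI-generated code are hypothetical and do not come from the preview itself.

```python
from enum import Enum


class TrustClass(Enum):
    FIRST_PARTY = "first_party"      # code reviewed and shipped by the team
    THIRD_PARTY = "third_party"      # vendored or plugin code (assumed tier)
    AI_GENERATED = "ai_generated"    # code emitted by a model at runtime


class Isolation(Enum):
    CONTAINER = "container"
    HARDENED_CONTAINER = "hardened_container"  # e.g. tighter seccomp profile
    MICROVM = "microvm"                        # hardware-isolated sandbox


# Hypothetical defaults; only the AI_GENERATED row reflects the article's claim.
DEFAULT_ISOLATION = {
    TrustClass.FIRST_PARTY: Isolation.CONTAINER,
    TrustClass.THIRD_PARTY: Isolation.HARDENED_CONTAINER,
    TrustClass.AI_GENERATED: Isolation.MICROVM,
}


def isolation_for(trust_class):
    """Return the default isolation tier, failing closed to a microVM."""
    return DEFAULT_ISOLATION.get(trust_class, Isolation.MICROVM)
```

The design choice worth noting is the fallback: anything unclassified gets the strongest tier, so hardware isolation is the default and weaker isolation is the exception that has to be declared.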
The decision an architect now has to make is narrower than "should I sandbox this call" and broader than "should I move to Azure." The question is when the burst economics of zero-idle-cost microVMs beat the integration cost of giving every agent platform its own isolation story, and which workloads hit that threshold first. Multi-tenant agent platforms sit at the front of the line, because every customer is a potential attacker. LLM-backed code interpreters are close behind, for the same reason, with the added twist that the user is effectively the LLM. CI runners driven by agents, where the build script may itself be LLM-generated, are a third obvious case.
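The threshold can be framed as a simple break-even calculation. The sketch below is a rough model, not a pricing tool: every parameter is a hypothetical input that the reader has to supply, and it assumes a flat monthly cost for an always-on cluster, a per-second rate for sandbox compute, and an optional fixed monthly charge that accrues regardless of invocation count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    invocations_per_month: int
    seconds_per_invocation: float


def sandbox_monthly_cost(workload, rate_per_second, fixed_monthly=0.0):
    """Pay-per-use cost: compute seconds times rate, plus any fixed charges."""
    compute_seconds = (
        workload.invocations_per_month * workload.seconds_per_invocation
    )
    return compute_seconds * rate_per_second + fixed_monthly


def break_even_invocations(
    cluster_monthly, seconds_per_invocation, rate_per_second, fixed_monthly=0.0
):
    """Invocations per month at which pay-per-use matches a flat cluster cost."""
    per_call = seconds_per_invocation * rate_per_second
    if per_call <= 0:
        raise ValueError("per-invocation cost must be positive")
    # Below this count, pay-per-use is cheaper; above it, the cluster is.
    return max(0.0, (cluster_monthly - fixed_monthly) / per_call)
```

Under these assumptions, a workload below the break-even count favors the zero-idle-cost model, while a workload with steady, high volume may still favor a dedicated cluster. The model deliberately omits the engineering and integration cost of running that cluster, which in practice may dominate the comparison.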
Two caveats deserve to survive into any deployment plan. First, "no cost when idle" is a claim about compute, not about the whole bill; OCI image storage, control-plane fees, and data egress still accrue, and per-invocation overhead at thousands of concurrent starts has not yet been independently validated. Second, this is one cloud's answer. Multi-cloud, on-prem, air-gapped, and regulated-data deployments remain open problems, and the model-side risk of prompt injection is unaffected by better sandboxing. The microVM raises the cost of a successful attack. It does not eliminate the underlying model-layer problem.
What to watch next is whether the SandboxGroups resource type becomes something other teams implement, and whether the open-source Firecracker and gVisor communities treat managed microVMs as a target to match or as a baseline to surpass. The interesting question is not whether Microsoft shipped a preview. It is whether "one microVM per agent call" becomes the default the way "one container per service" did.