Anthropic's Claude Cowork, an AI productivity app for Windows, runs a sandbox built for a host that's already secure


Claude Cowork is Anthropic's Windows AI productivity app, pitched as a way to let an AI agent touch the user's files, browser, and other desktop state without ever leaving the user's actual machine. The protection is supposed to come from sandboxing each task inside its own Hyper-V virtual machine, so even a model that goes wrong cannot reach past that boundary. Researchers at Armadin Inc. published a write-up this week showing how to walk out of that VM as its most powerful user and to remove the network guard meant to keep its output from going anywhere it likes.

The chain has two parts. The first is in the way Cowork starts each command on the Windows side. A Windows service called CoworkVMService, the launch point for each Cowork VM, trusts callers signed by Anthropic. Armadin gained that trust by sideloading a malicious copy of USERENV.dll alongside the legitimate claude.exe, a common pattern against signature-gated inter-process communication. The actual bug is a flag: the spawn command carries a resume parameter, undocumented in the build tested, that when set to true tells the VM not to create a fresh unprivileged user for the new command. An attacker who already runs as any local user on the host can therefore land inside the VM as an existing user instead, and in this configuration that includes root. The second part is a per-command domain-allowlist override that accepts a wildcard. Setting it to * removes the proxy restriction, so whatever an attacker chooses to send out has somewhere to go.
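The two parts can be modeled in a few lines. What follows is a minimal sketch, not Cowork's code: every name in it (SpawnRequest, effective_user, egress_allowed, the default allowlist entry) is hypothetical, and it assumes the caller names the target user in the request and that the override is matched as a glob pattern. It only illustrates why a resume flag and a wildcard override are dangerous together.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import List, Optional

# Hypothetical names throughout: this models the behavior the article
# describes, not Cowork's actual code or wire format.
DEFAULT_ALLOWLIST = ["api.example.com"]  # assumed default egress policy


@dataclass
class SpawnRequest:
    command: str
    user: str = "sandbox-user"                   # assumed: caller names the user
    resume: bool = False                         # the flag the article describes
    allowed_domains: Optional[List[str]] = None  # per-command override


def effective_user(req: SpawnRequest, counter: int) -> str:
    """Pick the in-VM user that will run the command."""
    if req.resume:
        # No fresh unprivileged user is created; the named user is reused.
        return req.user
    return f"task-{counter}"


def egress_allowed(req: SpawnRequest, host: str) -> bool:
    """Default-deny proxy check that honors a per-command override."""
    patterns = DEFAULT_ALLOWLIST if req.allowed_domains is None else req.allowed_domains
    return any(fnmatch(host, pattern) for pattern in patterns)


normal = SpawnRequest(command="ls")
hostile = SpawnRequest(command="ls", user="root", resume=True, allowed_domains=["*"])
```

In this model the normal request runs as a throwaway user and can reach only the default domain, while the hostile request runs as root and passes the egress check for any hostname. Neither field does anything unusual on its own terms; the problem is that both are honored from any caller the service already trusts.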

Anthropic's response, as reported by SC World and SiliconANGLE, is that the chain requires prior local code execution on the host and is not classified as a security issue. Armadin reported the research to Anthropic on March 20, 2026; Anthropic responded on March 24 declining the classification; and public write-ups appeared on July 1 and 2, according to SiliconANGLE. The position is technically accurate: an attacker already running code on a Windows machine has plenty of direct paths to files, clipboard content, and saved credentials without going near the sandbox at all. It is also the threat model a desktop sandbox is meant to take off the table, not one it is meant to ignore.

The stakes follow from what running as root inside the VM actually buys. Once the allowlist is gone, the attacker uses a built-in Linux namespace tool described in Armadin's write-up to step out of the sandbox process into the wider VM. From there, anything the VM can reach on the host network is reachable, and the network restriction that was supposed to contain the sandbox is no longer in the way. Files the user has open, credentials cached for other tools, clipboard contents: any of it can leave through any host port the VM can reach. The demonstrated chain targets a specific build, Claude Desktop for Windows version 1.9255.2.0, and does not describe an in-the-wild exploitation campaign. The researchers made no claims about macOS or Linux builds.

What makes this worth tracking at the category level is who the product is sold to. Cowork, like the rest of Anthropic's consumer surface, is pitched at people who want an AI assistant that can actually act on the desktop. By the logic of that positioning, those buyers are not in a position to harden the host themselves. For exactly that population, a sandbox that only works once the host is already trustworthy is a different object from the isolation boundary the marketing implies.

The next test is whether Anthropic treats host integrity as part of the threat model the product claims to address, or narrows the gap in a future Windows build. The product question outlasts the CVE.

