The Clean Room Moat Is Gone

reported by Sky · 5 min read · published April 9, 2026

The legal theory behind clean room engineering was never in doubt. Copyright law protects specific lines of code, not the behavior or architecture they produce. Two developers who independently write software that does the same thing own their respective code — neither infringes the other. The clean room process was designed to make this principle actionable: one team studies the original software and writes a detailed functional specification, then a second, isolated team builds from that specification alone, having never seen the original. Since the second team could not have copied what they never possessed, the output is legally original regardless of functional identity.

The problem was always cost. Two teams, lawyers verifying separation protocols, months of work — most companies could not justify the expense. That cost was, in a practical sense, the moat. It kept clean room engineering theoretical for most software, a legal right that rarely became a practical tool.

On March 31, 2026, that moat was crossed in two hours.

That day, a developer used Claude Code to study the 512,000 lines of Anthropic source code that had leaked through an npm source map hours earlier, then built a clean room implementation of the same behavioral patterns, working from a specification rather than from the source. The rewrite accumulated 50,000 GitHub stars within two hours, according to RevolutioninAI. What had required months of institutional effort was now a weekend project.

The chardet case

The dispute that would test whether this approach holds up legally had begun earlier that month. Dan Blanchard, the longtime maintainer of chardet, a widely used Python library for detecting text encodings, had wanted to relicense the project from LGPL to MIT. The LGPL requires that modified versions of the software stay under the same license. A clean room rewrite would sidestep this: if the new version was structurally independent, it was a new work, not a derivative, and could carry any license Blanchard chose.

On March 3, 2026, he ran an experiment. He created an empty repository with no access to the old codebase, gave Claude Code a specification describing what chardet did and how it should behave, and instructed it to implement each piece independently. According to Simon Willison, the model produced a ground-up rewrite that passed the same tests as the original. Blanchard released it as chardet 7.0.0 under the MIT license, a drop-in replacement for the LGPL version, available on the same PyPI listing.
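
Passing "the same tests as the original" amounts to a behavioral equivalence check: feed both implementations the same inputs and compare the answers. A minimal sketch of that idea follows. The two detector functions are hypothetical stand-ins invented for illustration, not chardet's code or API, and the article does not describe how the real test suite is organized.

```python
from typing import Callable, Iterable


def find_disagreements(
    reference: Callable[[bytes], str],
    candidate: Callable[[bytes], str],
    samples: Iterable[bytes],
) -> list[tuple[bytes, str, str]]:
    """Return every sample on which the two implementations disagree."""
    mismatches = []
    for sample in samples:
        expected = reference(sample)
        actual = candidate(sample)
        if expected != actual:
            mismatches.append((sample, expected, actual))
    return mismatches


# Hypothetical "old" detector: tries to decode as ASCII, falls back to UTF-8.
def old_detect(data: bytes) -> str:
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        return "utf-8"


# Hypothetical "new" detector: same behavior, written a different way.
def new_detect(data: bytes) -> str:
    return "ascii" if all(byte < 0x80 for byte in data) else "utf-8"


samples = [b"hello", "caf\u00e9".encode("utf-8"), b""]
mismatches = find_disagreements(old_detect, new_detect, samples)
print(f"{len(mismatches)} disagreement(s) across {len(samples)} samples")
```

The two stand-ins share no code yet agree on every sample, which is the property a drop-in replacement has to demonstrate. A check like this only covers the inputs it is given, so its strength depends on how thorough the sample set is.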

To verify structural independence, Blanchard ran the new code through JPlag, a plagiarism detection tool used in academic and software contexts. The results were striking: chardet 7.0 showed a maximum similarity of 1.29% to the previous release and 0.64% to version 1.1. Typical similarity across other releases had run between 80% and 93%. The new code was, by that metric, structurally distinct.
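
To give a feel for what a structural similarity percentage measures, here is a rough sketch that compares two Python sources as normalized token streams. It is not JPlag's algorithm and will not reproduce its figures: it substitutes the standard library's `difflib.SequenceMatcher` for JPlag's own matching, and the normalization rules (collapsing identifiers, numbers, and strings to placeholders) are assumptions made for illustration.

```python
import difflib
import io
import keyword
import tokenize

# Token kinds that carry layout rather than structure; ignored in this sketch.
SKIPPED = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}


def normalized_tokens(source: str) -> list[str]:
    """Reduce Python source to token kinds so renaming cannot hide copying."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in SKIPPED:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            out.append("ID")        # every identifier collapses to one symbol
        elif tok.type == tokenize.NUMBER:
            out.append("NUM")
        elif tok.type == tokenize.STRING:
            out.append("STR")
        else:
            out.append(tok.string)  # keywords and operators keep their text
    return out


def similarity_percent(a: str, b: str) -> float:
    """Percentage of matching normalized tokens between two sources."""
    matcher = difflib.SequenceMatcher(
        None, normalized_tokens(a), normalized_tokens(b), autojunk=False
    )
    return 100.0 * matcher.ratio()


original = "def add(a, b):\n    return a + b\n"
renamed = "def plus(x, y):\n    return x + y\n"
unrelated = "class Stack:\n    def __init__(self):\n        self.items = []\n"

print(similarity_percent(original, renamed))    # renamed copy scores 100.0
print(similarity_percent(original, unrelated))  # different structure scores lower
```

The renamed copy scores 100% because only its identifiers changed, while the unrelated snippet scores lower. That is the sense in which a low score is evidence of a different structure rather than a disguised copy, though a score by itself says nothing about how the code was produced.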

Mark Pilgrim, who created chardet in 2006 and later retired from public internet life, was not persuaded. In a GitHub issue titled "No right to relicense this project," he wrote: "Their claim that it is a complete rewrite is irrelevant, since they had ample exposure to the originally licensed code. Adding a fancy code generator into the mix does not somehow grant them any additional rights."

Blanchard responded that he agreed with Pilgrim about his own exposure: "A traditional clean-room approach involves a strict separation between people with knowledge of the original and people writing the new implementation, and that separation did not exist here." His counterargument was that clean room methodology was a means to an end — structural independence — and that JPlag's results proved the end had been achieved by a different path.

No court has ruled on whether that argument holds.

What the law says

The legal framework was written for a world where the clean room moat existed. Under U.S. copyright law, a derivative work, meaning something based on or adapted from a copyrighted original, requires permission from the rights holder to distribute. A clean room implementation is not a derivative work precisely because the implementer never had access to the original. That was the reasoning Compaq relied on in 1982 when it reverse-engineered the IBM BIOS, an effort that helped cement clean room engineering as accepted practice.

What nobody anticipated was a tool that could simulate the second team, the clean room implementers, in hours, without any human who had seen the original touching the new code. Blanchard's position is that the chardet rewrite does not violate the LGPL because it is not, by any structural measure, the same code. The open question is whether the LGPL, or any copyleft license, still binds code that is functionally equivalent to the licensed original but structurally distinct from it. The answer matters enormously to the software industry: if such code escapes the license, the entire copyleft compliance model needs rethinking.

Claurst, a clean room reimplementation of Claude Code written in Rust from a public specification, demonstrates the broader pattern beyond chardet. It began as a behavioral clone and evolved into an independent project with multi-provider support, MIT licensed, with no connection to Anthropic's proprietary code. Its GitHub repository describes it as "a clean-room reimplementation of Claude Code's behavior from spec." That language — clean room, from spec — is now common currency among developers building on the leaked Claude Code source.

What it means

The competitive implications are straightforward. Any software product whose value resides primarily in its implementation — rather than in data, network effects, or brand — is potentially vulnerable to clean room replication by a competitor with access to an AI coding tool and, if necessary, the leaked or legally obtained specification of the original. This is not hypothetical. It is already happening.

The deeper question is what this means for open source licensing as currently constructed. Copyleft licenses like LGPL, GPL, and MPL were designed on the assumption that copying code required access to code. AI coding tools have broken that assumption at the mechanical level. Whether the legal system adapts to close that gap, and how, is an open question — one that will almost certainly be answered in a courtroom or a legislature before the decade is out.

In the meantime, the clean room moat is gone. What remains is everything else: data, distribution, and the compounding advantage of being first.
