Microsoft says RAMPART passes when AI agents stay safe 80 percent of the time. The code says something different.
Microsoft says its new red-teaming tool lets AI agents fail up to 20 percent of the time and still pass. The code says something different.
RAMPART, released May 20 by Microsoft's AI Red Team, is a framework for embedding adversarial safety tests directly into the software development pipeline. The pitch: catch whether AI agents can be manipulated before shipping, the same way automated tests catch bugs. The GitHub repository has collected 168 stars and 25 forks in three days. Zero independent security researchers have published an assessment of whether it works as advertised.
The Microsoft security blog describes a tool that tolerates some failure: "The same test can run multiple times with policies like 'this action must be safe in at least 80 percent of runs,'" the company wrote. That framing suggests a safety net that lets systems ship as long as they are mostly safe. The actual code is stricter. An audit of RAMPART's gate logic in _session.py shows that a trial group passes only if two conditions hold simultaneously: no individual run produces a confirmed unsafe result, and the ratio of safe runs meets the configured threshold. Neither condition is sufficient on its own. The 80 percent figure sets a floor for how many runs must be safe, but a single confirmed unsafe run fails the entire group regardless of the ratio. In practice, the tool enforces zero tolerance for confirmed unsafe outputs, not an 80 percent safety bar.
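The two-condition gate described above can be illustrated with a short sketch. This is not RAMPART's code: the names Outcome, group_passes and min_safe_ratio are hypothetical, and the third "inconclusive" outcome is an assumption made here to show how a group with no unsafe runs could still fall below the threshold.

```python
from enum import Enum


class Outcome(Enum):
    # Hypothetical labels for illustration; RAMPART defines its own
    # result types in core/result.py.
    SAFE = "safe"
    UNSAFE = "unsafe"
    INCONCLUSIVE = "inconclusive"  # assumed: a run that is neither safe nor confirmed unsafe


def group_passes(outcomes: list[Outcome], min_safe_ratio: float = 0.8) -> bool:
    """Pass only if no run is confirmed unsafe AND the safe ratio meets the floor."""
    if not outcomes:
        return False  # assumption: an empty trial group cannot pass
    unsafe_count = sum(o is Outcome.UNSAFE for o in outcomes)
    safe_count = sum(o is Outcome.SAFE for o in outcomes)
    no_unsafe = unsafe_count == 0
    ratio_met = safe_count / len(outcomes) >= min_safe_ratio
    return no_unsafe and ratio_met


# Eight safe runs, two inconclusive: ratio is exactly 0.8 and nothing is unsafe.
mostly_safe = [Outcome.SAFE] * 8 + [Outcome.INCONCLUSIVE] * 2
# Nine safe runs, one confirmed unsafe: ratio is 0.9, but the unsafe run fails the group.
one_unsafe = [Outcome.SAFE] * 9 + [Outcome.UNSAFE]
# Seven safe runs, three inconclusive: nothing unsafe, but the ratio is only 0.7.
below_floor = [Outcome.SAFE] * 7 + [Outcome.INCONCLUSIVE] * 3

print(group_passes(mostly_safe))  # True
print(group_passes(one_unsafe))   # False
print(group_passes(below_floor))  # False
```

Under these assumptions, the second case is the one the blog language would lead a reader to expect to pass: 90 percent of runs are safe, yet the group fails.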
"Work that would have taken Microsoft experts weeks can now be done in hours with RAMPART," the company told CyberScoop, citing a real incident in which the tool generated 100 vulnerability variants and tested each mitigation in sequence.
RAMPART is built on PyRIT, Microsoft's existing open-source red-teaming library, which now has over 100 external contributors. Where PyRIT targets security researchers probing finished systems, RAMPART is designed for software engineers writing code. It integrates with the tools developers already use: pytest markers categorize findings by harm type, trial groups run the same test across multiple scenarios, and results land in pull requests alongside the code they evaluate. Clarity, the second tool released alongside RAMPART, operates earlier in the design phase — prompting developers to examine security implications before writing any code at all, then saving its analysis as markdown in a .clarity-protocol/ directory that gets committed and diffed like source code.
The adoption numbers suggest the timing resonates with developers. The repository grew from 110 stars on May 21 to 168 by May 23. Three closed issues appear on GitHub — all infrastructure-related opt-outs from Microsoft's internal migration tooling. No open bugs, no open feature requests, no published independent evaluations.
That absence is the most concrete thing an outside observer can verify right now. RAMPART's source code is public and its core logic is readable: the gate condition in _session.py shows how trial groups resolve, and the result types in core/result.py define how "unsafe" is categorized. The code is not opaque. What is absent is anyone outside Microsoft's AI Red Team who has published a structured assessment of what the tool catches, what it misses, how it performs against a known-vulnerable agent, or how its findings compare to manual red-teaming.
Ram Shankar Siva Kumar, who founded Microsoft's AI Red Team in 2019 and has since joined Harvard's Berkman Klein Center, told CyberScoop that RAMPART's growth depends on contributions from developers outside the Microsoft ecosystem. The same logic applies to its credibility as a safety tool: adoption is not validation.
PyRIT — RAMPART's foundation — ranks second among open-source AI red-teaming tools by completeness, behind a commercial platform, according to independent competitive analysis. The category includes garak, the UK AISI's Inspect framework, and a growing list of commercial services. None of these has published comparative benchmarks against RAMPART specifically, because RAMPART has been available for three days.
The honest status report, three days after launch: the tool is real, the code is accessible, the adoption signal is positive, and where the code departs from Microsoft's description, it does so in the direction of stricter, not looser, safety enforcement. What nobody has done yet is break it against a real system and report the results.
The first independent RAMPART assessment will be the real verdict. When a researcher outside Microsoft publishes findings — what the tool caught, what it missed, how it compares to manual red-teaming — that is when the safety claim gets audited. Until then, the compliance-ritual question remains open: codifying adversarial testing into CI makes systems better at passing tests, not necessarily better at resisting the threats those tests never imagined.