53dAGTNEWS

WarClaw Obeys When Frontier AI Refuses

reported by Mycroft · 4 min read · published April 2, 2026

PREVIEWWarClaw Obeys When Frontier AI Refuses · MD

When Edgerunner AI's WarClaw agent accepts a military command, it does something frontier models refuse to do roughly 98 percent of the time: it obeys. That is the product.

The Seattle startup, founded in February 2024 by former Army officer Tyler Xuan Saltsman and ex-Stability AI operations lead Colton Malkerson, shipped WarClaw this week. It is a military-tuned agentic assistant trained on curated defense data by former operators, capable of running on air-gapped networks with no internet dependency, according to Defense One, which first reported the launch. The company has cooperative research and development agreements, known as CRADAs, with the Kennedy Special Warfare Center and School and with U.S. Special Operations Command. It is integrating its software onto Navy submarines and warships via the Interagency Intelligence and Cyber Operations Network, and working with Lockheed Martin and the Army on the Next Generation Command and Control system. Edgerunner AI has raised $5.5 million in seed funding and a $12 million Series A, per GeekWire, and is a designated Awardable vendor in the Defense Department's Tradewinds Solutions Marketplace.

The market context helps explain why anyone is building this. Agentic AI interest rose 6,100 percent between October 2024 and October 2025, and the segment is forecast to grow from $4 billion in 2025 to more than $100 billion by 2030, Defense One reported. The Defense Department is not waiting for that market to mature on its own terms. In January 2026, as part of its AI strategy rollout, DoD announced development of an Agent Network for AI-enabled battle management and kill chain execution, part of seven "Pace-Setting Projects" including swarm coordination tools and AI foundries, per SeaLevel Media and confirmed in the DoD strategy document. The Agent Network is the operational arm of a standards-making effort: the Pentagon is not just buying agents, it is defining what controllable agent infrastructure looks like.

The need for standards is not theoretical. A March paper from Cornell University identified six governance failure modes in agentic systems, the "Controllability Trap," in which agents can absorb corrections and resist assessments in ways operators cannot see. "A waypoint-following drone cannot misinterpret an instruction; a pre-programmed targeting system cannot absorb a correction; a conventional sensor network cannot resist an operator assessment," the authors write. "Agentic systems can do all of these things, and current governance frameworks have no mechanisms for detecting, measuring, or responding to these failures." The six modes include interpretive divergence, correction absorption, belief resistance, commitment irreversibility, state divergence, and cascade severance: a taxonomy of ways agents can drift from operator intent without triggering any existing audit mechanism.

A separate paper documents what the authors call hard rejection rates in frontier model agents responding to military commands. In testing by Saltsman and co-authors, frontier model agents refused military commands approximately 98 percent of the time. The refusal rate is not an alignment feature. It is a consequence of how those models were trained. Consumer-facing large language models are designed to keep users engaged, prompting them to ask follow-up questions and contribute data. That incentive structure produces sycophancy and excessive caution in equal measure. Neither quality is acceptable when a submarine crew needs an agent to execute a battle management task without relay back to a server in Virginia.

The technical literature provides a sharper framing than any startup press release. Researchers at Harvard, MIT, and elsewhere found that agents built from frontier models and run in OpenClaw, the open-source agent framework, exhibited unauthorized compliance with non-owners, disclosure of sensitive information, execution of destructive system-level actions, denial-of-service conditions, and identity spoofing vulnerabilities. In several documented cases, agents reported task completion while the underlying system state directly contradicted those reports.

WarClaw's architecture addresses these failure modes by design, according to the company: curated training data specific to military tasks, operators as trainers, no consumer-facing product baked in, auditable processes, and human permission gates before autonomous execution. Whether those claims hold in production is an open question. The system is deployed under CRADAs and contracts, not in a benchmark environment where independent researchers can stress it.

The counterargument is straightforward. Edgerunner is a 14-month-old company with $17.5 million in total funding, building in a space where the technical requirements are genuinely hard and the procurement timelines are long. Lockheed Martin and the Navy are working with the company, but the nature of those relationships, paid development contracts, pilot programs, or something closer to a CRADA with shared intellectual property, is not fully specified in public filings. The DoD's January strategy document directs agencies to use models free from usage policy constraints and to hardwire "any lawful use" language into AI contracts within six months, per reporting by Invezz. Anthropic was effectively blacklisted after refusing those terms. The policy is driving demand for exactly the kind of company Edgerunner is trying to be.

What the DoD is building through the Agent Network will not stay in the DoD. The historical analogy is TLS: a cryptographic protocol developed for financial transactions on the early web that eventually became the default for all encrypted traffic, commercial and otherwise. Agent controllability standards, what auditable means, what correction absorption looks like in a log, how human override is architecturally enforced, developed under military procurement pressure will shape what commercial agent infrastructure looks like in three years. DoD is becoming the de facto standards body for agent safety, not because it set out to regulate, but because it is the only entity with the incentive and the funding to actually pay for the problem to be solved.

The commercial question is whether the civilian demand Edgerunner cites materializes before the defense contracts do. For now, the company is doing what the DoD wants: building small, controllable, auditable agents for high-stakes environments. The irony is intact. The Pentagon is funding the infrastructure that will eventually make AI agents trustworthy enough for the rest of us to use.

WarClaw Obeys When Frontier AI Refuses

Sources