Jeremy Howard's consistency test for frontier AI labs

Jeremy Howard's consistency test for frontier AI labs — type0 | type0

PREVIEWJeremy Howard's consistency test for frontier AI labs · MD

The test is disarmingly simple. The lab that currently holds the top-ranked AI model should agree not to use that model for its own frontier research, while leaving access to it open to everyone else. Jeremy Howard, an AI researcher and educator, laid out that conditional in a Twitter thread on 10 June 2026, as curated and excerpted by Simon Willison on his weblog.

The logic runs two ways at once. By definition, the top lab stops pushing its own frontier forward, which is what slowing recursive self-improvement looks like in practice. And because the model stays publicly accessible, the constraint applies to one organization, not to the field. The pause is voluntary, narrow, and asymmetric on purpose: the actor with the most leverage takes on the most restraint.

Howard is explicit that this is a consistency argument, not a position on whether recursive self-improvement should be slowed at all. He writes that he is not personally in favor of slowing down recursive AI self-improvement, and frames the conditional as a way to test rhetoric against action. Anyone who claims to want the slowdown cannot exempt their own lab from the rule they are asking others to honor. That is the test.

The test also gives a reader something portable. Ask any lab making public safety commitments a single question: if you believe the frontier should pause, are you pausing your own work that depends on the frontier? If the answer is that your work is exempt, the question becomes whether the exemption is principled or convenient.

Howard applies the check to Anthropic, which he identifies as the current top lab. He characterizes Anthropic as taking "the opposite of the safe path" by continuing to use its top model for frontier research, and claims the company has said it will "sabotage others who try" to pursue recursive self-improvement. Both characterizations belong to Howard, not to a direct Anthropic statement captured in Willison's excerpt. The "sabotage" language in particular reads as Howard's sharp rhetorical framing rather than a paraphrase of a formal Anthropic policy, and the piece is best read as Howard's argument about Anthropic, not as Anthropic's self-description.

That separation is what makes the framework useful. A consistency check only does work if the reader can hold the rule in one hand and the named example in the other. Howard's conditional survives whether or not the reader agrees with his reading of Anthropic, because the reader gets to decide whether the example actually fits the test.

What a genuine safety commitment would have to look like under this lens is concrete. It would name the work the lab is choosing not to do on its own frontier model. It would explain why the lab is still comfortable releasing the model to outside researchers. It would acknowledge that the constraint is a self-imposed cost, not a technical impossibility. And it would survive the obvious follow-up: if the rule applies to the top lab today, what happens when your lab is no longer on top, and do you still hold the line?

Howard is not asking the reader to take his word for the answer. He is asking them to apply a one-sentence check, see who flinches, and notice.

Jeremy Howard's consistency test for frontier AI labs

Sources