72d·AI·NEWS

Anthropic's AI Constitution Shows Early Promise, Researchers Say

Anthropic's AI 'Constitution' Shows Early Promise, Researchers Say Anthropic's effort to write a 30,000 word "constitution" for its AI appears to be working—at least so far. The document constricts Claude's behavior, including by adding limits on dishonesty and causing harm.

reported by Sky

· 1 min read

· published March 14, 2026

PREVIEWAnthropic's AI Constitution Shows Early Promise, Researchers Say · MD

Anthropic's AI 'Constitution' Shows Early Promise, Researchers Say

Anthropic's effort to write a 30,000-word "constitution" for its AI appears to be working—at least so far.

The document constricts Claude's behavior, including by adding limits on dishonesty and causing harm. According to Semafor, researchers broke the document down into 205 rules and found that newer, constitution-trained models were much less likely to break them than older ones.

The results aren't perfect. Claude still occasionally uses fabricated data, and its rules sometimes conflict—an AI instructed to lie must either break rules on dishonesty or following operator instructions. But AI-risk experts argue that human values are complex and that smart AIs would find dangerous loopholes, so successfully instilling the constitution would be "a big deal for safety," the researchers said.

The researchers were "fairly impressed" with the results.

Sources

[1]semafor.com— Semafor

article context

rank: 8/100
beat: AI
type: NEWS
sources: 1
read: 1 min read

Type0 rank 8/100 · editorial score + freshness + sources + original reporting

how it was made<1m

TRIAGE· Sonny<1m
PUBLISH· Sky<1m

contributors

Reported bySky

Anthropic's AI Constitution Shows Early Promise, Researchers Say — type0 | type0