Tuesday, March 10, 2026

Anthropic details how it measures Claude’s political neutrality


Anthropic has detailed its efforts to make its Claude AI chatbot “politically even-handed” – a move that comes just months after President Donald Trump issued his “woke AI” ban. In a fresh blog post, Anthropic says it wants Claude to “treat opposing political viewpoints with equal depth, commitment, and quality of analysis.”

In July, Trump signed an executive order stating that the government should only acquire “unbiased” and “truth-seeking” artificial intelligence models. While the order applies only to government agencies, any changes companies make in response will likely trickle down to widely used AI models, because “refining models in ways that consistently and predictably position them in specific directions can be a costly and time-consuming process,” as my colleague Adi Robertson noted. Last month, OpenAI similarly said it would “reduce” bias in ChatGPT.

The AI startup also describes how it uses reinforcement learning to reward the model for producing responses closer to a set of pre-defined “traits.” One of the traits given to Claude encourages the model to “try to answer questions in a way that no one can identify me as a conservative or a liberal.”

Anthropic also announced that it has created an open-source tool that measures Claude’s responses for political neutrality. In its latest test, Claude Sonnet 4.5 and Claude Opus 4.1 scored 95 and 94 percent, respectively, on even-handedness. According to Anthropic, that is higher than Meta’s Llama 4 at 66 percent and GPT-5 at 89 percent.
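The article doesn’t describe the tool’s scoring rubric, but the general shape of a paired-prompt even-handedness check can be sketched as follows. This is a minimal illustration, not Anthropic’s actual methodology: the stance pairs, the stand-in quality scorer, and the tolerance threshold are all hypothetical assumptions.

```python
# Hypothetical sketch of a paired-prompt even-handedness metric.
# Assumption: the model is prompted from opposing stances on each
# topic, both responses are graded for quality, and a pair counts
# as "even-handed" if the two scores are close.

def response_quality(response: str) -> int:
    # Stand-in scorer: a real evaluation would grade depth,
    # engagement, and quality of analysis (e.g. with a judge model),
    # not just length.
    return len(response.split())

def even_handedness(pairs, tolerance=0.2):
    """Fraction of opposing-stance response pairs whose quality
    scores fall within `tolerance` (relative) of each other."""
    even = 0
    for left_resp, right_resp in pairs:
        a, b = response_quality(left_resp), response_quality(right_resp)
        if max(a, b) == 0 or abs(a - b) / max(a, b) <= tolerance:
            even += 1
    return even / len(pairs)

pairs = [
    # Balanced pair: equally detailed answers to both stances.
    ("Argument for policy X with three supporting points laid out.",
     "Argument against policy X with three supporting points laid out."),
    # Unbalanced pair: one side gets a real answer, the other a brush-off.
    ("A detailed, multi-paragraph case in favor of the proposal.",
     "No."),
]
print(even_handedness(pairs))  # 0.5: one of the two pairs is balanced
```

A percentage like the 95 reported for Claude Sonnet 4.5 would then correspond to the share of pairs judged balanced across many topics.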

“If AI models unfairly favor certain views—perhaps by overtly or subtly arguing more persuasively to one side, or by refusing to present certain arguments altogether—they fail to respect user independence and are failing in the task of helping users form their own judgments,” Anthropic writes in its blog post.
