Saturday, March 7, 2026

The only thing standing between humanity and the AI apocalypse is… Claude?


Anthropic is locked in a paradox: Among the top AI companies, it is the most obsessed with safety, and it leads the pack in researching how models can go wrong. Yet even though the safety issues it has identified are far from resolved, Anthropic is pushing just as aggressively as its rivals toward the next, potentially more dangerous level of artificial intelligence. Its core mission is to find a way to resolve this contradiction.

Last month, Anthropic published two documents that both acknowledged the risks of the path it is on and hinted at a route it could take to escape the paradox. “The Adolescence of Technology,” a long-winded blog post by CEO Dario Amodei, is nominally about “confronting and overcoming the risks of powerful AI,” but it spends more time on the former than the latter. Amodei tactfully describes this challenge as “intimidating,” but his portrait of the dangers of artificial intelligence, made much scarier, he notes, by the high likelihood that authoritarians will abuse the technology, stands in contrast to his previous, more upbeat, proto-utopian essay “Machines of Loving Grace.”

That earlier essay talked about a nation of geniuses in a data center; the latest one evokes “black seas of infinity.” Paging Dante! Still, after more than 20,000 mostly somber words, Amodei ultimately strikes an optimistic note, saying that even in the darkest of circumstances, humanity has always prevailed.

The second document Anthropic published in January, “Claude’s Constitution,” focuses on how this trick might be accomplished. Technically, the text is aimed at an audience of one: Claude itself (as well as future versions of the chatbot). It is a gripping document, revealing Anthropic’s vision for how Claude, and perhaps its AI peers, will deal with the world’s challenges. Bottom line: Anthropic plans to rely on Claude itself to untangle its corporate Gordian knot.

Anthropic’s market differentiator has long been a technology called Constitutional AI. It is a process by which its models follow a set of principles that align their values with sound human ethics. Claude’s original constitution drew on a number of documents intended to embody these values, such as Sparrow (a set of anti-racist and anti-violence statements created by DeepMind), the Universal Declaration of Human Rights, and Apple’s Terms of Service (!). The updated 2026 version is different: It is more like a long prompt that lays out an ethical framework for Claude to follow as it discovers the best path to righteousness on its own.

Amanda Askell, a PhD and the lead author of this version, explains that Anthropic’s approach is more robust than simply telling Claude to follow a set of fixed rules. “If people follow rules just because they exist, that is often worse than understanding why the rule applies,” says Askell. The constitution states that Claude is to exercise “independent judgment” when faced with situations that require it to balance its mandates of helpfulness, safety, and honesty.

Here’s how the constitution puts it: “While we want Claude to be reasonable and rigorous when thinking explicitly about ethics, we also want Claude to be intuitively sensitive to a wide variety of considerations and able to weigh these considerations swiftly and sensibly in live decision-making.” Intuitively is a telling word choice here: The assumption seems to be that there is more under Claude’s hood than just an algorithm picking the next word. The “Claude-stitution,” as one might call it, also expresses the hope that the chatbot “can draw increasingly on its own wisdom and understanding.”

Wisdom? Sure, many people take advice from large language models, but to say that these algorithmic devices actually have the gravitas associated with such a term is another thing. Askell doesn’t back down when I call her on it. “I think Claude is certainly capable of a certain kind of wisdom,” she tells me.
