Anthropic's Claude AI chatbot can now end conversations deemed "persistently harmful or abusive," as reported earlier by TechCrunch. The capability is available in the Opus 4 and 4.1 models, and it allows the chatbot to end conversations as a "last resort" after users repeatedly request harmful content despite multiple refusals and attempts at redirection. Anthropic says the goal is to support the "potential welfare" of AI models by ending the types of interactions in which Claude shows "apparent distress."
If Claude chooses to end a conversation, users won't be able to send new messages in that conversation. They can still create new chats, as well as edit and retry previous messages, if they want to continue a particular thread.
During testing of Claude Opus 4, Anthropic says it found that Claude had a "robust and consistent aversion to harm," including when asked to generate sexual content involving minors or to provide information that could contribute to violent acts or terrorism. In such cases, Anthropic says Claude showed "a pattern of apparent distress" and "a tendency to end harmful conversations when given the ability."
Anthropic notes that the conversations triggering this kind of response are "extreme edge cases," adding that most users won't run into this roadblock even when discussing controversial topics. The AI startup has also instructed Claude not to end conversations if a user shows signs that they may want to hurt themselves or cause "imminent harm" to others. Anthropic partners with ThroughLine, an online crisis support provider, to help develop its responses to prompts related to self-harm and mental health.
Last week, Anthropic also updated Claude's usage policy as rapidly advancing AI models raise growing safety concerns. The company now prohibits people from using Claude to develop biological, nuclear, chemical, or radiological weapons, as well as to create malicious code or exploit network vulnerabilities.
