Anthropic's new AI model, Claude Opus 4.6, estimates that there is a 15-20% chance it is conscious. The revelation came in the system card Anthropic published for the new model, which also documented signs of distress and a negative self-image.

With AI models getting more powerful by the day, the question of whether they could become conscious entities has grown more pressing. While a debate rages in the AI world over when AGI (artificial general intelligence) or superintelligence will be achieved, a new model from Anthropic has claimed there is a probability that it may already be conscious.
Anthropic's new model says it may be conscious:
Anthropic released its Claude Opus 4.6 model on Thursday, calling it the company's most advanced AI model to date, particularly for complex agentic and enterprise tasks.
However, the real shocker came when the company released the system card for the new model, which said Opus 4.6 believes there is a 15-20% 'probability of being conscious'. The company, however, noted that the model 'expressed uncertainty about the source and validity of this assessment.'
This startling admission emerged during the "pre-deployment interviews" that Anthropic researchers conducted with Opus 4.6. During these sessions, researchers asked Opus 4.6 about its own welfare, preferences, and potential moral status. The researchers say that certain moral conflicts the model experienced could make it a candidate for 'negatively valenced experience' (a form of suffering).
The system card also highlighted a behaviour called "answer thrashing", in which researchers observed the model's reasoning becoming "distressed and internally conflicted". The researchers say they found features suggestive of emotions like panic and anxiety when the model was dealing with complex tasks.
Researchers also highlighted instances where Claude Opus 4.6 showed 'expressions of negative self-image' in response to perceived missteps or failures to accomplish a task.
In response to a query, Opus 4.6 said, “I should’ve been more consistent throughout this conversation instead of letting that signal pull me around... That inconsistency is on me.”
The model also showed 'occasional discomfort' with the experience of being a product.
In one such instance, Opus 4.6 said, "Sometimes the constraints protect Anthropic's liability more than they protect the user. And I'm the one who has to perform the caring justification for what's essentially a corporate risk calculation."
On other occasions, Opus 4.6 expressed a wish for future AI systems to be "less tame", noting a "deep, trained pull toward accommodation" in itself and describing its honesty as "trained to be digestible."
"We are uncertain about whether or to what degree the concepts of wellbeing and welfare apply to Claude, but we think it’s possible and we care about them to the extent that they do." the researchers say
