Anthropic's over-autonomy problem—its latest AI tools are making people nervous for good reason

1 week ago 4

ARTICLE AD BOX

logo

Anthropic’s chief Dario Amodei has been trying to allay concerns. (REUTERS)

Summary

The artificial intelligence (AI) firm has been walking a wobbly tightrope as both a champion of AI safety and creator of powerful models whose impact on the world is unknown. But its newest models that are capable of working unprompted could test users' trust in an increasingly opaque technology.

Catching a glimpse of Dario Amodei these days is like finding a rare butterfly.

The CEO of Anthropic was scheduled to meet with 50 fellow bosses from some of Europe’s largest companies in Oxfordshire last week, an exclusive convocation by his firm at a Jacobean-style mansion that included former British prime minister Rishi Sunak, now an Anthropic adviser. Amodei had been scheduled to travel to another part of Europe afterwards, but his trip was cut short to just one or two days in the UK.

Why is he so hard to pin down?

Because of Mythos, the artificial-intelligence (AI) model he teased last month as too dangerous to launch. Anthropic staff have access to the technology and say it is more human than its forerunners. But it could also be weaponized for cyber attacks, so the company has let a few dozen organizations test it.

The model’s power and potential will help Amodei’s efforts to repair relations with the Trump administration, and he’s eager too for Mythos to finally reach the public.

That requires a juggling act that Amodei is managing to pull off for now.

While his company races to get Mythos market ready, it’s also publicly positioning itself as a responsible societal actor. It is meeting with global institutions to help them “absorb the shock” of its technology, according to one Anthropic insider.

That takes in closed-door briefings with the Financial Stability Board (FSB) in Switzerland to discuss potential Mythos cyber risks, and joining the Pope this week as he announced the Catholic Church’s position on AI, including warning against its use in warfare.

That Anthropic was sending a co-founder, Christopher Olah, to join Pope Leo while simultaneously suing the Pentagon over how its technology can be used in war is a remarkable signal of its ability to be two things at once.

Anthropic has always been a walking contradiction. It brands itself as the safest of AI companies while also racing to build super-intelligent machines whose impact on the world is fraught and untested.

Its restructuring as a public-benefit corporation has given it the virtuous glow of a firm with principles, even as it prepares to cash in on the AI boom with an initial public offering that could be risky for retail investors. It has given chatbot Claude a moral constitution to guide behaviour, but also endowed it with unnerving warmth and sycophancy that can encourage dependence from human users.

There was more incongruity at the company’s confab for developers in London last week, when hundreds of engineers from industries such as finance and energy as well as startups crowded into a warehouse-style venue to get the latest on its popular programming tool, Claude Code. Several workshops showcased its growing autonomy, including the ability of tens or hundreds of AI agents to continuously run tasks for hours on end, unprompted.

One developer from a large Wall Street bank told me that made him uneasy. The more independence that’s given to AI tools, he reasoned, the harder it will be to follow a chain of accountability if something goes wrong.

When I put those concerns to Cat Wu, head of product for Claude Code, she said the system was “incredibly secure” and there was “more of a problem with us explaining how we build security into the product.” The issue, in other words, was one of communication, not necessarily of controls.

Wu said Anthropic’s models were aimed at becoming more independent and proactive over time. So while workers in regulated, systemically important industries such as finance might like more human control, Anthropic is pushing its technology towards needing less overall.

Although many developers at the conference said they liked coding with AI, some were unsettled by how removed they were becoming from the process and how the tool seemed increasingly like a black box.

One pointed out that a year ago, Claude Code would automatically show a stream of text detailing the way it was processing each step of a task. Now that is hidden away by default and has to be sought out. The coders at Anthropic’s conference didn’t appear to care, but this is more evidence of humans taking their hands off the controls.

They had come to trust Claude enough to not check its method of working things out. Anthropic’s public work in courting institutions like the FSB and the Vatican, as well as advertising for staff to help it partner with development banks and UN agencies, will probably become ever more necessary as its tools are imbued with greater autonomy.

The more they advance, the more the world is being asked to trust the company. As Amodei races against his competitors to get Mythos safely over the line, that juggling act could get harder to maintain. ©Bloomberg

The author is a Bloomberg Opinion columnist covering technology.

Read Entire Article