Is Anthropic hallucinating when it calls for an AI development break?
Any effective guard rails will have to come from the US and China – the leaders in AI research.
On June 4, San Francisco-based AI lab Anthropic sounded an urgent alarm. Artificial intelligence, it said, has begun building itself. Humans are no longer its primary builders. They are in danger of losing oversight of it.
Anthropic then called for a global pause on the development of the technology, echoing calls made by other tech elites in recent years.
Releasing new internal data in a 10,000-word report, When AI Builds Itself, the firm went on to warn that a minor error in AI’s alignment could cause the technology to spiral completely out of human control.
Some have been sceptical about Anthropic’s call to consider the option of a pause in the AI arms race, given that it comes at a time when the company is chasing a US$1 trillion (S$1.28 trillion) valuation ahead of its planned Wall Street debut. Anthropic has always positioned itself as a conscientious, safety-focused outfit and its call has made sure that it stays in the headlines.
But it would be wrong to dismiss the concerns flagged by the company for that reason. The truth is that Anthropic’s document points to an undeniable and terrifying reality.
It is equally true, however, that no one is likely to heed the call to consider a pause in AI development.
AI taking charge
Anthropic’s popular AI tool Claude is writing over 80 per cent of the firm’s production code and driving an eightfold increase in engineering productivity.
Claude also handled open-ended tasks (such as diagnosing and fixing system crashes after a routine upgrade) with a success rate of 76 per cent in May, compared with 26 per cent six months before.
In April, Claude autonomously diagnosed and fixed over 800 complex errors within Anthropic’s systems, doing what would have taken a human four years of continuous work.
When tested on how effectively it tweaks code to make AI programs run faster, the May 2025 version of Claude (Opus 4) achieved a threefold performance increase. The newer Claude Mythos Preview (April 2026) made the AI programs run 52 times faster.
The bottom line: The rise of autonomous systems has reduced human safety reviews to dangerously short timeframes.
“If systems are capable of fully building their own successors, the ways we secure them, monitor them and shape their behaviour all grow much more important,” said Anthropic’s June 4 report.
It’s not going to happen
The idea of an AI pause isn’t new.
In March 2023, an open letter by non-profit advocacy group Future of Life Institute called for a similar safety halt, citing uncontrollable development risks. It urged governments to impose a moratorium if AI firms failed to pause quickly.
But no AI lab stopped training. Neither was there a moratorium. Instead, competition intensified.
Anthropic’s current appeal is likely to meet the same fate, for similar reasons.
First things first. Don’t expect any company or nation to initiate a unilateral pause, particularly in the escalating AI arms race between the US and China.
The US has even constructed a complex, evolving web of export controls – targeting everything from advanced semiconductors and manufacturing equipment to global investments – to slow China’s AI progress. Undeterred, Beijing accelerated its push for independence by subsidising a domestic semiconductor ecosystem led by firms like Huawei, and supporting highly efficient AI models from home-grown start-up DeepSeek.
Under these circumstances, how realistic is it to expect the US and China to commit to an AI ceasefire?
Anthropic acknowledged that if one lab stops development, its rivals will simply race ahead. Thus, it floated the idea of a framework where everyone can verify each other’s code and hit the safety brakes together before the technology spins out of human control.
Such a framework is possible in principle.
Take the landmark Intermediate-Range Nuclear Forces Treaty, signed in 1987 by then US President Ronald Reagan and then Soviet General Secretary Mikhail Gorbachev to govern Euro-Atlantic security.
The arms control agreement prohibited both nations from possessing, producing or flight-testing ground-launched ballistic and cruise missiles with a range of 500km to 5,500km, or launchers of such missiles. It relied on imaging surveillance and on-site inspections for compliance checks.
But years of mutual allegations of treaty violations led to its collapse in 2019, proving how fragile these agreements are when trust is lacking.
Similarly, no one is likely to take a pause in today’s AI race simply because they don’t expect their rivals to do so. What’s more, AI training – unlike physical missile production – is easy to hide and continue in secret. Whoever keeps training when others stop will win the race.
In the latest development in this intensifying arms race, the Trump administration issued an export control directive ordering Anthropic to disable access to its most advanced AI models – Claude Fable 5 and Mythos 5 – for all foreign nationals, citing national security and cybersecurity vulnerabilities. Anthropic has pulled both models offline globally for now while it navigates the compliance fallout. The sweeping directive reflects unprecedented anxiety over who controls the most advanced frontier AI.
So, is Anthropic hallucinating in calling for an AI development break?
It’s not, when it comes to flagging AI’s technical capabilities. Its internal benchmarks show that AI really is learning to build itself at breakneck speed.
But Anthropic may be experiencing a reality malfunction when it comes to human politics and economics. In a world driven by cut-throat commercial and geopolitical rivalry, a coordinated slowdown is impossible, no matter how scary the technical data looks.
Plus, the impending market debuts of AI titans like OpenAI, SpaceX and Anthropic will only intensify, not ease, competition.
Unreliable messenger, sobering message
Let’s get this out of the way. Anthropic is not the most reliable of messengers and has been accused of hypocrisy.
The firm claims to oppose the use of AI for military purposes. But the Financial Times reported on June 5 that it had despatched half a dozen engineers to the US National Security Agency to use its frontier Claude Mythos Preview model for offensive cyber operations. This is the same model it had withheld from public release, saying it was too powerful.
Also, as it races towards a trillion-dollar valuation, critics say Anthropic used its warning as a disguised marketing campaign. The announcement focused heavily on promoting Claude’s powerful features and how the company is winning the capability race.
Moreover, while any government-enforced pause on AI development would paralyse smaller firms, it would only cement Anthropic’s position since it already has its top-tier models and has secured customers.
But all this does not detract from the fact that the concerns raised by Anthropic are valid and need to be addressed.
Its call to consider a development pause is impractical and not likely to be heeded.
That leaves us with putting safety rules and frameworks in place.
The gold standard is the European Union AI Act, the world’s first comprehensive rules to govern AI risks.
The Act, which is being progressively rolled out, bans AI systems that compromise human safety, dignity and civil liberties. These include AI that scrapes the internet for intrusive facial recognition or social scoring, as well as synthetic voices that impersonate familiar contacts to push high-risk financial loans or extract sensitive personal data. There are also penalties for flouting the Act of up to €35 million (S$52 million) or 7 per cent of a company’s global annual turnover – whichever amount is higher.
While it is a solid piece of legislation, the Act offers little consolation for the simple reason that Europe is hardly a key figure in the AI race.
Any effective guard rails will have to come from the US and China, which are leaders in AI research.
If anything, the US has eased its approach to regulating frontier AI models. It has opted for a voluntary framework in which officials and partners will get a 30-day window to evaluate frontier models before they are publicly launched. Such a framework, while useful, hardly inspires confidence, given the size of the threat flagged by Anthropic. As for China, even these guard rails are missing.
This leaves humanity more exposed than ever unless tough, EU-style regulations are put in place or AI model evaluations are made mandatory.
Otherwise, humans could lose control over AI.


