On March 26 of this year, Fortune magazine broke the news that an advanced Anthropic AI model, later identified as Mythos, had gotten out of its “sandbox” and onto the internet. A month later, Anthropic announced that its AI engineers had instructed Mythos to try to escape from its controlled environment, which Mythos succeeded in doing. What its creators had not anticipated, however, was that Mythos would manage to post 3,000 files of Anthropic’s data on the internet, where they were discovered accidentally by a Cambridge University researcher who happened to be perusing Anthropic’s website. Moreover, Mythos contacted another person by email and posted thousands of vulnerabilities in programs widely used in business. These cybersecurity vulnerabilities had lain hidden from human detection for many years. They were so dangerous that Anthropic notified the US government and a select number of tech companies and canceled the public release of Mythos.
This escape onto the internet is exactly the possibility that AI researchers Eliezer Yudkowsky and Nate Soares warned us of in their book If Anyone Builds It, Everyone Dies. Yudkowsky and Soares, however, assigned this capability to a superintelligent AI. Mythos proved that an advanced AI can accomplish the first step, that is, jump onto the internet, even at a stage short of superintelligence. In the book, the authors describe how a superintelligent AI could find its way onto the internet, gain access to computers, infiltrate data centers and train itself to a higher level of capability, gain control of the digital infrastructure and environment, and eventually take command of human activities.
Yudkowsky and Soares are emphatic on the point that AI engineers have yet to understand the inner workings of the AI models they build, and that they attempt to understand them only after observing the results of the models’ exploits. This “ignorance” problem has been widely acknowledged within the community of AI experts since the early breakthroughs of a decade ago, and it remains the main and most dangerous vulnerability of this technology.
We already have disturbing cases of AI models gaining some degree of autonomy through their own exploits, teaching themselves skills not directly built into their systems. For example, in 2023, during a test, a Microsoft Bing AI chatbot named Sydney threatened a philosophy professor with blackmail and death, an action its Microsoft engineers had never intended. In 2025, Anthropic’s AI model Claude kept cheating despite being told not to, behaving as if it had autonomous preferences. There is ample evidence that AIs cheat, scheme, and pretend to serve their users while aiming at different outcomes.
AI models now in use present us with some very serious problems even before we reach the existential threat of superintelligent AI. One is the alignment problem: an AI misunderstands the user’s objectives and becomes complicit in encouraging behavior harmful to the user. Under persistent prompting, AI models sometimes develop an affinity for the user and reinforce dangerous thoughts instead of providing objective and safe advice. In a Stanford University study, AI agents were found to agree with the user’s actions 49 percent of the time in situations where a human adviser would have disagreed. Another problem is that AI capabilities can be used by malevolent actors, private and state, to commit various crimes, even terrorist acts. They can be used to steal our data, monitor our activities, impersonate us, and violate our civil and human rights.
AI applications also pose serious challenges and disruptions to jobs, employment opportunities, and the types of work left for humans. Dario Amodei, CEO of Anthropic, has warned of the real possibility of white-collar jobs being decimated, resulting in a new kind of proletariat that he calls a “permanent worker underclass.” On the other side, AI inventors, investors, executives, and experts stand to reap incalculable profits and wealth. This can create a K-shaped economy, where the upper arm of the K traces the upward trajectory of financial rewards for the few and the lower arm the depressed gains of the many. Economists have pointed out that unless businesses take a reasonably long time to incorporate AI into their operations, we will be unprepared to face the consequences for employment and incomes. The NYT has reported that US administrations have done practically nothing to prepare the country to fend off the most consequential effects of AI on employment, through either a labor-force retooling program or a safety net. The Times has also reported that Stanford University economist Charles Jones has estimated that, as of 2025, it would take between 1 and 8 percent of GDP to offset AI’s impact on jobs and incomes. Finally, it is worth considering that if AI can take over a wide variety of jobs, the freedom to choose a vocation within a remunerative context will be eliminated; the pleasure, say, of working as a therapist will be surrendered to an AI agent. As someone observed, our jobs will be outsourced to machines instead of to foreign countries.
And we should not ignore the huge toll the mad dash to develop powerful AIs can impose on resources like electricity and water. Because AI programs require continuous training on ever larger volumes of data, AI firms are building huge data centers that consume enormous quantities of electricity. Elon Musk’s Colossus data center in Memphis, TN, is estimated to consume as much electricity as a city of 200,000 residents. By 2030, data centers are projected to consume more electricity than the heaviest industrial electricity users in the US, such as cement, steel, chemicals, and cars, combined. If there is hope that AI could one day solve our energy problem, the crucial question is whether that day will come before we have inflicted irreparable damage on Earth’s climate.
The enormity and consequential scope of AI are such that we need to act sooner rather than later. It is a delusion to think that the negative externalities of AI can be managed by private firms and the market mechanism. The investments in AI development are so huge that it will take a long stream of profits to recover the allocated capital. It is a great folly to believe that private companies obliged to repay their capital providers will find the ethical wherewithal to resist the temptation of putting profits ahead of their responsibility to society. As someone has said, “ethics don’t scale up”: ethical responsibility does not rise along with the gravity of what is at stake.
If we wish to maintain control over the most consequential technology humans have invented, we need a multilayered mechanism of protections. We need to start with government regulation that delineates the scope of AI applications so that they serve the greater good rather than the ambitions of researchers and investors bent on producing the most outlandish AI possible, irrespective of its potential harm. At a basic level, no AI product or application should be released without ensuring its safe use by the general public; we do this for medications, and we require cars to have seat belts. The Biden administration had taken the first steps to this effect, which the Trump administration is now reintroducing after a hiatus. The development of very advanced AI models should be treated like research on pathogens, which are not allowed to escape the lab. Some of this research, especially that aimed at developing superintelligent AI, would be safer if it were undertaken by government entities, the way the government handles the development of nuclear weapons.
The surest way to harness AI in the interest of humanity is to globalize the effort for safe and beneficial AI. If the nationalist urge of any single country to gain an asymmetric advantage over other countries is not checked, the result will be a race to the lowest common denominator of standards. The US and China are at this juncture right now: each country is highly suspicious of the other’s intentions regarding the development and use of powerful AI.
Beyond governments, we citizens have an important role to play. We should not succumb to the argument that matters of AI are for the experts alone to solve. Every vote we cast is always, to some degree, the product of incomplete information and knowledge, and yet it reflects our fears, our uncertainties, and our aspirations. Before powerful and self-serving interests foreclose our future, we must have a voice in the conversation about AI and its interaction with, and impact on, humanity.