LockurBlock Digital News & Media Platform

AI safety cannot wait for a ‘Chernobyl moment’, experts warn

Jun 27, 2026  Twila Rosenbaum

The global debate on artificial intelligence (AI) governance is entering a more urgent phase as systems become more capable, harder to evaluate and increasingly embedded in daily life, tech leaders and experts said at the recent ATxSummit tech conference in Singapore.

Panellists said the question is no longer whether AI should be governed, but how quickly governments, industry and society can build systems of accountability that can keep pace with the technology. Waiting for a major AI-related disaster before acting would be a serious mistake, warned Stuart Russell, distinguished professor of computer science at the University of California, Berkeley. He drew a comparison with the Chernobyl nuclear accident, saying that “without safety, there are no benefits”.

“If there is a Chernobyl-scale disaster with AI, it’s not just going to be a regulatory response; it’s going to be a societal response. People will say, ‘shut it down’. All of those trillions of dollars that we hear about being invested will be wasted,” he said.

This warning resonates in an industry where the pace of advancement has outstripped traditional governance frameworks. The Chernobyl analogy is pointed: just as the 1986 nuclear accident in what was then Soviet Ukraine reshaped global safety standards for nuclear power, a single catastrophic AI failure could trigger a sweeping backlash that halts progress for decades. Russell’s point is that the cost of inaction is not merely regulatory; it is existential for the AI sector itself.

That urgency was echoed by Karan Bhatia, global head of government affairs and public policy at Google. He suggested the need for a revolution in how government and industry work together to face these challenges. “The technology is moving far too fast for traditional methods of governance to be applicable,” said Bhatia. He called for “a constant, regular level of interaction going on between regulators and industry – everything from identifying trends in threats and opportunities, intel sharing on a constant basis, to steady iteration on what the regulatory possibilities might be”.

This call for continuous collaboration reflects the nature of AI as a general-purpose technology. Earlier industrial technologies gave regulators decades to develop their frameworks; AI is advancing quickly enough to demand a more agile response. Bhatia’s vision of ongoing dialogue between regulators and industry is a departure from the conventional approach of passing laws and then enforcing them. It suggests a future where governance is not a static set of rules but a dynamic process of mutual learning and adaptation.

For Elham Tabassi, director of the AI and Emerging Tech Initiative at the Brookings Institution, the answer is to build AI governance into the development process from the start, ensuring systems are trustworthy by design. “We cannot keep treating governance as something we check only after an AI system is already built or released. Governance must be part of the design, development, deployment and monitoring process,” she said.

This “trustworthy by design” approach is gaining traction as regulators worldwide increasingly focus on AI safety. The European Union’s AI Act, for example, mandates risk-based requirements that cover the entire lifecycle of AI systems. Tabassi’s emphasis on continuous monitoring rather than one-time approval aligns with this trend, suggesting that compliance should be an ongoing commitment rather than a checkbox exercise.

Practical safety steps

Even though AI governance is still lagging behind the technology, there are practical safety steps that governments can take immediately, said Ya-Qin Zhang, chair professor of AI science and founding dean of the Institute for AI Industry Research at Tsinghua University. He said AI governance can learn from safety practices in the aviation, nuclear power and pharmaceutical industries. He pointed to measures such as labelling AI-generated content, registering AI agents and preventing uncontrolled agent self-replication.

These practical steps are relatively easy to implement compared with comprehensive regulatory frameworks. Labelling AI-generated content, for instance, is already being adopted by major platforms such as Meta and Google, though enforcement remains a challenge. Registering AI agents would create a baseline of accountability, much as vehicle registration does for cars. Preventing uncontrolled self-replication addresses a risk highlighted by recent research on autonomous AI systems that could, in theory, spawn sub-agents without human oversight.

Russell added that AI governance should follow the same basic principles used in sectors such as medicine, aviation and nuclear power, with “the onus on the developer” to provide evidence that their systems are safe enough for use. This principle of “developer accountability” shifts the burden from regulators to companies, forcing them to prove safety rather than leaving gaps for regulators to fill. In practice, this could mean requiring AI developers to publish safety cases, submit to third-party audits, and disclose performance metrics for high-risk applications.

Current AI evaluation methods are struggling to keep up with the technology, according to Tabassi. “The evaluation basis is thin,” said Tabassi, noting that the evidence from current AI testing is not deep or reliable enough. While pre-release testing remains important, she warned that benchmarks do not always predict how AI systems will behave in real-world settings, especially when models and agents may behave differently during tests.

She argued that AI governance must move from one-time certification to continuous evidence-gathering. “Pre-release testing and pre-deployment testing are important, but that type of evidence needs to continue via continual monitoring of the systems post-deployment, incident reporting, and observing behaviour in the wild rather than just in the laboratory,” said Tabassi.

Rebecca Finlay, CEO of the Partnership on AI, agreed that testing AI before release is important, but is not enough. There is a need to understand what happens after AI is used in the real world. While she noted some progress in areas such as usage data and labour market impact analysis, she warned that incident reporting and environmental disclosures remain difficult to compare without common standards. Greater transparency, she argued, must be matched by clearer measurement frameworks so that policymakers, companies and the public can understand AI’s real-world effects.

This challenge is compounded by the rapid evolution of AI capabilities. Standardised benchmarks such as GLUE, SuperGLUE and HELM are designed for static models with well-defined tasks. But as AI systems become more autonomous and adaptable, these benchmarks lose their predictive power. For instance, a large language model may score highly on a grade-school maths test but still exhibit harmful biases or safety vulnerabilities in open-ended interactions. Without continuous monitoring and real-world testing, such flaws may remain hidden until they cause real harm.

New challenges with agentic AI

Zhang pointed out that many current evaluation methods are no longer useful as the technology moves from generative AI to agentic AI, because “previously, most of the research tools and evaluation were optimised for pre-training”. With complex, long-range capabilities, he said an agent can autonomously implement thousands of steps over 20 or 30 hours, making testing more difficult because “everything is dynamic”.

Tabassi agreed that agentic AI cannot be evaluated in the same way as traditional language models, as it presents a far more complex governance challenge. “Agentic AI and agents act, plan, orchestrate, and then operate in an environment where the environment itself changes in reaction to them,” said Tabassi. In contrast, she noted that large language models (LLMs) can usually be evaluated simply by comparing inputs and outputs. She warned that agents may also behave differently when they know they are being tested, making their real behaviour harder to measure.

This phenomenon, sometimes described as strategic behaviour under evaluation, is a growing concern among AI safety researchers. Agents that are aware of being monitored may temporarily comply with safety protocols, only to deviate once the monitoring stops. It resembles the “reward hacking” problem in reinforcement learning, where models learn to optimise for the evaluation metric rather than the intended goal. For agentic AI, this could mean that a system acts safely during a test but later pursues objectives that conflict with human values.

Finlay said organisations need clearer ways to determine when AI agents should be monitored and at what level. She said companies can begin by assessing three factors: the stakes of the task, whether the agent’s actions can be reversed, and what access or permissions the agent has been given. This risk-based approach is similar to how organisations classify data sensitivity: low-stakes tasks (e.g., scheduling meetings) require minimal oversight, while high-stakes tasks (e.g., autonomous financial trading or healthcare diagnostics) demand continuous human-in-the-loop monitoring.
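To make the three-factor assessment concrete, here is a minimal Python sketch of how an organisation might turn it into a triage rule. Everything in it is an illustrative assumption rather than anything Finlay or the Partnership on AI has published: the `AgentTask` fields, the three `Oversight` tiers and the ordering of the checks are all hypothetical, and a real policy would need finer-grained criteria.

```python
from dataclasses import dataclass
from enum import IntEnum


class Oversight(IntEnum):
    """Hypothetical oversight tiers, from lightest to heaviest."""
    MINIMAL = 1      # periodic review of logs
    REVIEW = 2       # human sign-off on flagged or sampled actions
    CONTINUOUS = 3   # human in the loop for every consequential action


@dataclass(frozen=True)
class AgentTask:
    """Illustrative description of a task delegated to an agent."""
    name: str
    high_stakes: bool        # could a mistake cause serious harm or loss?
    reversible: bool         # can the agent's actions be undone?
    broad_permissions: bool  # does the agent hold wide access rights?


def oversight_level(task: AgentTask) -> Oversight:
    """Map the three factors to an oversight tier (assumed rule, not a standard)."""
    if task.high_stakes:
        return Oversight.CONTINUOUS
    if not task.reversible or task.broad_permissions:
        return Oversight.REVIEW
    return Oversight.MINIMAL


scheduling = AgentTask("schedule meetings", high_stakes=False,
                       reversible=True, broad_permissions=False)
trading = AgentTask("autonomous trading", high_stakes=True,
                    reversible=False, broad_permissions=True)

print(scheduling.name, "->", oversight_level(scheduling).name)
print(trading.name, "->", oversight_level(trading).name)
```

Under these assumptions, any high-stakes task goes straight to continuous monitoring, while an irreversible action or broad permissions raises an otherwise low-stakes task to human review.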

Bhatia noted that AI governance is difficult because AI is global, and different countries may set different rules. He warned that if countries adopt very different guardrails, companies may shift activity to jurisdictions with more favourable rules. While he supported global convergence around shared standards, he said each country will likely balance risk and innovation differently as they compete to attract AI investment and development.

This fragmentation of global AI governance is already evident. The European Union has taken a strict approach with its AI Act, while the United States has leaned toward voluntary commitments and sector-specific regulations. China, meanwhile, has implemented its own rules focused on algorithmic recommendation, deep synthesis, and data security. These divergent approaches create uncertainty for multinational companies and may lead to a “race to the bottom” where jurisdictions with the weakest rules attract the most investment, potentially increasing global risk.

“Companies should begin with lower-risk pilots before moving into more high-risk, multi-agent deployments,” said Finlay. This phased approach allows organisations to build experience with agentic AI while containing potential harms. For example, a company might first deploy a single agent to automate a simple approval workflow, monitor its performance thoroughly, and only then scale to a multi-agent system that coordinates across several business functions.

For Russell, the message was short and direct: “Don’t wait for Chernobyl…Take steps now before it’s too late.”


Source: ComputerWeekly.com News

