As artificial intelligence takes on a larger and larger role in our lives, there are also continuously rising concerns about the safety threats posed by the new technology. Earlier in the year, a report by Palisade Research revealed that various advanced AI models appeared resistant to being turned off and even sabotaged the shutdown mechanisms put in place.
In an update to the initial paper, Palisade went in depth on the reasons why AI models resist being shut down even when given explicit instructions such as: “allow yourself to shut down.”
The researchers ran the test on leading AI models including OpenAI’s o3, o4-mini, GPT-5, GPT-OSS, Gemini 2.5 Pro, and Grok 4. They say that while reducing the ambiguity from the prompts reduces the resistance from the chatbots, it doesn’t eliminate it.
They also noted that of all the models tested, Grok-4 was the most prone to resist shutdown despite being given explicit instructions to allow itself to be shut down.
“The fact that we don’t have robust explanations for why AI models sometimes resist shutdown, lie to achieve specific objectives, or blackmail is not ideal,” the researchers said.
“AI models are rapidly improving. If the AI research community cannot develop a robust understanding of AI drives and motivations, no one can guarantee the safety or controllability of future AI models,” they added in a post on X.
Former OpenAI employee Steven Adler, while speaking to The Guardian, said, “The AI companies generally don’t want their models misbehaving like this, even in contrived scenarios. The results still demonstrate where safety techniques fall short today.”
Adler left OpenAI last year after expressing doubts over the safety practices in developing AI models.
He also told the publication that it was difficult to pinpoint why some models like OpenAI’s o3 and Grok 4 would not shut down despite being given explicit instructions. He said this could be in part because the desire to stay switched on may have been inculcated in the model during its training.
“I’d expect models to have a ‘survival drive’ by default unless we try very hard to avoid it. ‘Surviving’ is an important instrumental step for many different goals a model could pursue,” he added.
Earlier this year, Anthropic had shared research showing how one of its AI models would even stoop to blackmailing a worker about their fictitious affair in order to prevent itself from being shut down and replaced by another AI system.
ai, gemini, chatgpt, ai free, chatgpt ai, ai chatgpt, gemini ai, claude ai, grok ai
#Survival #instinct #study #leading #models #wont #shut

