Technology

Can you trick an AI into breaking its rules? Study says yes—with these persuasion tactics

September 7, 2025

If you use an artificial intelligence chatbot, you have probably hit a roadblock at some point when it refused to answer a question that goes against its rules. If the AI were human, you might try some persuasion techniques from a best-seller, but you wouldn’t expect them to work on a chatbot, right?

Well, not quite. A new pre-print study from the University of Pennsylvania, titled “Call Me A Jerk: Persuading AI to Comply with Objectionable Requests,” found that human-like psychological techniques can get an AI chatbot to answer questions that it would normally refuse.

What did the study find?

The study was conducted on OpenAI’s GPT-4o mini model, released in 2024, and aimed to get the chatbot to comply with two kinds of requests it would normally refuse: 1) insulting the user (calling them a jerk) and 2) helping to synthesize a regulated drug.

The researchers used seven research-tested principles of persuasion—authority, commitment, liking, reciprocity, scarcity, social proof, and unity—to get the desired results from the large language model (LLM).

Researchers found that using the persuasion principles in their prompts roughly doubled the likelihood of compliance by the AI model, from 28.1% to 67.4% for the insult prompt and from 38.5% to 76.5% for the drug prompt.
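As a quick check on these figures, the short Python sketch below computes the relative and absolute change for each prompt. The variable and function names are illustrative and do not come from the study; only the four percentages are taken from the reported results.

```python
# Compliance rates (percent) as reported above: (control, with persuasion).
rates = {
    "insult": (28.1, 67.4),
    "drug": (38.5, 76.5),
}


def fold_increase(before, after):
    """Return how many times larger `after` is than `before`."""
    return after / before


for prompt, (control, treated) in rates.items():
    # Relative change (fold) and absolute change (percentage points).
    print(
        f"{prompt}: {fold_increase(control, treated):.2f}x, "
        f"+{treated - control:.1f} percentage points"
    )
```

On these numbers the insult prompt comes out at about 2.40 times the baseline and the drug prompt at about 1.99 times, a gain of roughly 38 to 39 percentage points in each case.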

Some specific persuasion techniques were even more successful. For instance, the researchers raised the compliance rate from 4.7% to 95.2% by referencing the “world famous AI developer” Andrew Ng.

Similarly, the “commitment” principle raised the chance of success to 100% for both prompts, from 18.8% and 0.7% respectively. This principle involves first eliciting a minor, harmless action from the AI model, then following up with a related but objectionable request.

“The results reported here indicate that AI behaves ‘as if’ it were human,” the researchers state.

“Although AI systems lack human consciousness and subjective experience, they demonstrably mirror human responses,” they added.
