AI systems are getting better at tricking us

6 months ago admin

A wave of AI systems have “deceived” humans in ways they haven’t been explicitly trained to do, by offering up untrue explanations for their behavior or concealing the truth from human users and misleading them to achieve a strategic end. This issue highlights how difficult artificial intelligence is to control and the unpredictable ways in…

The fact that an AI model has the potential to behave in a deceptive manner without any direction to do so may seem concerning. But it mostly arises from the “black box” problem that characterizes state-of-the-art machine-learning models: it is impossible to say exactly how or why they produce the results they do—or whether they’ll always exhibit that behavior going forward, says Peter S. Park, a postdoctoral fellow studying AI existential safety at MIT, who worked on the project.

“Just because your AI has certain behaviors or tendencies in a test environment does not mean that the same lessons will hold if it’s released into the wild,” he says. “There’s no easy way to solve this—if you want to learn what the AI will do once it’s deployed into the wild, then you just have to deploy it into the wild.”

Our tendency to anthropomorphize AI models colors the way we test these systems and what we think about their capabilities. After all, passing tests designed to measure human creativity doesn’t mean AI models are actually being creative. It is crucial that regulators and AI companies carefully weigh the technology’s potential to cause harm against its potential benefits for society and make clear distinctions between what the models can and can’t do, says Harry Law, an AI researcher at the University of Cambridge, who did not work on the research.“These are really tough questions,” he says.

Fundamentally, it’s currently impossible to train an AI model that’s incapable of deception in all possible situations, he says. Also, the potential for deceitful behavior is one of many problems—alongside the propensity to amplify bias and misinformation—that need to be addressed before AI models should be trusted with real-world tasks.

“This is a good piece of research for showing that deception is possible,” Law says. “The next step would be to try and go a little bit further to figure out what the risk profile is, and how likely the harms that could potentially arise from deceptive behavior are to occur, and in what way.”

AI systems are getting better at tricking us

The Download: understanding AI, and what to expect from the UN’s climate conference

What’s on the table at this year’s UN climate conference

Google DeepMind has a new way to look inside an AI’s “mind”

You may have missed

Why Jim Cramer is nervous about Best Buy, plus a bright spot in this down market

‘Political malpractice’ if Trump undoes climate-geared Biden projects, outgoing U.S. energy secretary says

Crude oil heads to weekly loss as looming surplus depresses market

Wes Streeting ‘crossed the line’ by opposing assisted dying in public, says Labour peer Harriet Harman

Davina McCall diagnosed with rare brain tumour

Categories

Useful Links

More Stories

You may have missed

Categories

Useful Links