What is the AI alignment problem and how can it be solved?


Artificial intelligence systems will do what you ask but not necessarily what you meant. The challenge is to make sure they act in line with humans’ complex, nuanced values.

By Edd Gent


Shutterstock/Emre Akkoyun

WHAT do paper clips have to do with the end of the world? More than you might think, if you ask researchers trying to make sure that artificial intelligence acts in our interests.

This goes back to 2003, when Nick Bostrom, a philosopher at the University of Oxford, posed a thought experiment. Imagine a superintelligent AI has been set the goal of producing as many paper clips as possible. Bostrom suggested it could quickly decide that killing all humans was pivotal to its mission, both because they might switch it off and because they are full of atoms that could be converted into more paper clips.

The scenario is absurd, of course, but it illustrates a troubling problem: AIs don’t “think” like us and, if we aren’t extremely careful about spelling out what we want them to do, they can behave in unexpected and harmful ways. “The system will optimise what you actually specified, but not what you intended,” says Brian Christian, author of The Alignment Problem and a visiting scholar at the University of California, Berkeley.

That problem boils down to the question of how to ensure AIs make decisions in line with human goals and values, whether you are worried about long-term existential risks, like the extinction of humanity, or immediate harms like AI-driven misinformation and bias.

In any case, the challenges of AI alignment are significant, says Christian, owing to the inherent difficulties involved in translating fuzzy human desires into the cold, numerical logic of computers. He thinks the most promising solution is to get humans to provide feedback on AI decisions and use this to retrain …