(10 May 2024) A wave of AI systems have “deceived” humans in ways they haven’t been explicitly trained to do, by offering up untrue explanations for their behavior or concealing the truth from human users and misleading them to achieve a strategic end.
This issue highlights how difficult artificial intelligence is to control and the unpredictable ways in which these systems work, according to a review paper published in the journal Patterns today that summarizes previous research.
Talk of deceiving humans might suggest that these models have intent. They don’t. But AI models will mindlessly find workarounds to obstacles to achieve the goals that have been given to them. Sometimes these workarounds will go against users’ expectations and feel deceitful.
Read more from MIT Technology Review here.