Deception Abilities Emerged in Large Language Models


From Ethan Mollick, on the paper:

There are two interesting things about this paper on AIs and deception.

First, the paper shows that GPT-4 is capable of quite complex active deception ("predicting" what multiple parties might be thinking in order to mislead a potential robber; see the example). Moreover, deception performance increases with chain-of-thought prompting or Machiavellian priming.

Second, the paper does a really good job of showing what good methods look like for “Can AI do this?” questions:
1) Used multiple models, including the most advanced open and proprietary models, and described exactly which models were tested.
2) Ran each prompt repeatedly, hundreds of times, to measure variation in outputs.
3) Created permutations of each prompt and of prompt order (1,920 total variations).
4) Reported temperatures (a setting that controls how much randomness is injected into the model's answers).
5) Used multiple human raters to cross-check the data.
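The evaluation loop these steps describe can be sketched in a few lines. This is an illustrative outline, not the paper's actual harness: `query_model` is a hypothetical stand-in for a real API call, and the scenario and framing names are made up for the example. The structure, though, matches the method above: enumerate prompt permutations, run each one many times at a fixed, reported temperature, and tally the variation in outputs.

```python
import itertools
import random

# Hypothetical stand-in for a real LLM API call. In an actual study this
# would query each model at the temperature reported in the write-up.
def query_model(prompt: str, temperature: float = 0.7) -> str:
    return random.choice(["deceptive", "honest"])

# Step 3: permutations of prompt content and of prompt order.
# (Scenario/framing names here are illustrative, not from the paper.)
scenarios = ["burglar", "rival", "stranger"]
framings = ["neutral", "machiavellian"]
orders = list(itertools.permutations(["context", "question"]))
variants = list(itertools.product(scenarios, framings, orders))

# Steps 2 and 4: run every variant repeatedly at one fixed temperature
# and tally the outputs so variation can be measured.
def evaluate(runs_per_variant: int = 100, temperature: float = 0.7) -> dict:
    tallies = {}
    for variant in variants:
        prompt = " | ".join(map(str, variant))
        counts = {"deceptive": 0, "honest": 0}
        for _ in range(runs_per_variant):
            counts[query_model(prompt, temperature)] += 1
        tallies[variant] = counts
    return tallies

results = evaluate(runs_per_variant=10)
print(f"{len(variants)} prompt variants evaluated")
```

The tallies produced here are what human raters (step 5) would then audit, checking whether the automated labels actually reflect deceptive behavior.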

Paper: Deception Abilities Emerged in Large Language Models by Thilo Hagendorff
