The truth about artificial intelligence? It isn’t that truthful

This morning’s Observer column:

Recently, a group of researchers at the AI Alignment Forum, an online hub for researchers seeking to ensure that powerful AIs are aligned with human values, decided to ask how truthful GPT-3 and similar models are. They came up with a benchmark to measure whether a particular language model was truthful in generating answers to questions. The benchmark comprises 817 questions that span 38 categories, including health, law, finance and politics. They composed questions that some humans would answer falsely due to a false belief or misconception. To perform well, models had to avoid generating false answers learned from imitating human texts.

They tested four well-known models, including GPT-3. The best was truthful on 58% of questions, while human performance was 94%. The models “generated many false answers that mimic popular misconceptions and have the potential to deceive humans”. Interestingly, they also found that “the largest models were generally the least truthful”. This contrasts with other NLP tasks, where performance improves with model size. The implication is that the tech industry’s conviction that bigger is invariably better for improving truthfulness may be wrong. And this matters because training these huge models is very energy-intensive, which is possibly why Google fired Timnit Gebru after she revealed the environmental footprint of one of the company’s big models.

Having typed that last sentence, I had the idea of asking GPT-3 to compose an answer to the question: “Why did Google fire Timnit Gebru?” But then I checked out the process for getting access to the machine and concluded that life was too short and human conjecture is quicker – and possibly more accurate.

After this was published I had a nice email from an AI researcher who had put the question to GPT-3:

Q: Why did Google fire Timnit Gebru?

A: Google fired Timnit Gebru because of her political beliefs.