Warning: Humans cannot reliably detect speech deepfakes (2023)

Kimberly T Mai, Sergi Bray, Toby Davies, and Lewis D Griffin
Department of Security and Crime Science, University College London, London, United Kingdom.

Speech deepfakes are artificial voices generated by machine learning models. Previous literature has highlighted deepfakes as one of the biggest security threats arising from progress in artificial intelligence because of their potential for misuse. However, studies investigating human detection capabilities are limited. We presented genuine and deepfake audio to n = 529 individuals and asked them to identify the deepfakes. We ran our experiments in English and Mandarin to understand whether language affects detection performance and decision-making rationale. We found that detection capability is unreliable. Listeners correctly spotted the deepfakes only 73% of the time, and there was no difference in detectability between the two languages. Increasing listener awareness by providing examples of speech deepfakes improved results only slightly. As speech synthesis algorithms improve and become more realistic, we can expect the detection task to become harder. The difficulty of detecting speech deepfakes confirms their potential for misuse and signals that defenses against this threat are needed.

UI | MeSH Term | Description | Entries
D007802 | Language | A verbal or nonverbal means of communicating ideas or feelings. | Dialect; Dialects; Languages
D010700 | Phonetics | The science or study of speech sounds and their production, transmission, and reception, and their analysis, classification, and transcription. (Random House Unabridged Dictionary, 2d ed) | Speech Sounds; Sound, Speech; Sounds, Speech; Speech Sound
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens; Man (Taxonomy); Human; Man, Modern; Modern Man
D001185 | Artificial Intelligence | Theory and development of COMPUTER SYSTEMS which perform tasks that normally require human intelligence. Such tasks may include speech recognition, LEARNING; VISUAL PERCEPTION; MATHEMATICAL COMPUTING; reasoning, PROBLEM SOLVING, DECISION-MAKING, and translation of language. | AI (Artificial Intelligence); Computer Reasoning; Computer Vision Systems; Knowledge Acquisition (Computer); Knowledge Representation (Computer); Machine Intelligence; Computational Intelligence; Acquisition, Knowledge (Computer); Computer Vision System; Intelligence, Artificial; Intelligence, Computational; Intelligence, Machine; Knowledge Representations (Computer); Reasoning, Computer; Representation, Knowledge (Computer); System, Computer Vision; Systems, Computer Vision; Vision System, Computer; Vision Systems, Computer
D013060 | Speech | Communication through a system of conventional vocal symbols. | Public Speaking; Speaking, Public
D013067 | Speech Perception | The process whereby an utterance is decoded into a representation in terms of linguistic units (sequences of phonetic segments which combine to form lexical and grammatical morphemes). | Speech Discrimination; Discrimination, Speech; Perception, Speech
