Identifying depression-related topics in smartphone-collected free-response speech recordings using an automatic speech recognition system and a deep learning topic model.

Keywords: Automatic speech recognition; Depression; Smartphone; Speech; Topic modeling

Journal

Journal of affective disorders
ISSN: 1573-2517
Abbreviated title: J Affect Disord
Country: Netherlands
NLM ID: 7906073

Publication information

Publication date:
27 Mar 2024
History:
received: 26 Sep 2023
revised: 18 Mar 2024
accepted: 22 Mar 2024
medline: 30 Mar 2024
pubmed: 30 Mar 2024
entrez: 29 Mar 2024
Status: ahead of print

Abstract

Prior research has associated spoken language use with depression, yet studies often involve small or non-clinical samples and face challenges in the manual transcription of speech. This paper aimed to automatically identify depression-related topics in speech recordings collected from clinical samples. The data included 3919 English free-response speech recordings collected via smartphones from 265 participants with a depression history. We transcribed speech recordings via automatic speech recognition (Whisper tool, OpenAI) and identified principal topics from transcriptions using a deep learning topic model (BERTopic). To identify depression risk topics and understand the context, we compared participants' depression severity and behavioral (extracted from wearable devices) and linguistic (extracted from transcribed texts) characteristics across identified topics. From the 29 topics identified, we identified 6 risk topics for depression: 'No Expectations', 'Sleep', 'Mental Therapy', 'Haircut', 'Studying', and 'Coursework'. Participants mentioning depression risk topics exhibited higher sleep variability, later sleep onset, and fewer daily steps and used fewer words, more negative language, and fewer leisure-related words in their speech recordings. Our findings were derived from a depressed cohort with a specific speech task, potentially limiting the generalizability to non-clinical populations or other speech tasks. Additionally, some topics had small sample sizes, necessitating further validation in larger datasets. This study demonstrates that specific speech topics can indicate depression severity. The employed data-driven workflow provides a practical approach for analyzing large-scale speech data collected from real-world settings.
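The final step of the workflow, comparing depression severity across identified topics, can be illustrated with a minimal sketch. The abstract does not state which statistical comparison was used, so the rule below (flag a topic when its median severity exceeds the overall median by a fixed margin) is an assumption for illustration only and not the authors' method. The function name `flag_risk_topics`, the `margin` parameter, the topic names 'Holiday' and 'Family', and all scores are hypothetical. The transcription (Whisper) and topic-modeling (BERTopic) steps are omitted because they require external packages.

```python
from collections import defaultdict
from statistics import median

# Hypothetical toy data: (topic label, depression severity score) per recording.
# 'Sleep' and 'Haircut' are topic names from the abstract; 'Holiday', 'Family',
# and every score are invented for illustration.
recordings = [
    ("Sleep", 16), ("Sleep", 14), ("Sleep", 18),
    ("Haircut", 15), ("Haircut", 13),
    ("Holiday", 6), ("Holiday", 8), ("Holiday", 7),
    ("Family", 9), ("Family", 10),
]


def flag_risk_topics(rows, margin=2.0):
    """Return topics whose median severity exceeds the overall median by `margin`.

    This threshold rule is an assumed stand-in for the study's comparison.
    """
    by_topic = defaultdict(list)
    for topic, score in rows:
        by_topic[topic].append(score)
    overall = median(score for _, score in rows)
    return sorted(
        topic
        for topic, scores in by_topic.items()
        if median(scores) - overall >= margin
    )


risk_topics = flag_risk_topics(recordings)
print(risk_topics)
```

With real data, a formal statistical test and a correction for multiple comparisons would likely be needed, particularly for topics with few recordings, which the abstract names as a limitation.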

Identifiers

pubmed: 38552911
pii: S0165-0327(24)00530-5
doi: 10.1016/j.jad.2024.03.106

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Copyright information

Copyright © 2024 The Author(s). Published by Elsevier B.V. All rights reserved.

Conflict of interest statement

Declaration of competing interest: S.V. and V.A.N. are employees of Janssen Research and Development LLC. M.H. is the principal investigator of the Remote Assessment of Disease and Relapse–Central Nervous System project, a private-public precompetitive consortium that receives funding from Janssen, UCB, Lundbeck, MSD, and Biogen.

Authors

Yuezhou Zhang (Y)

King's College London, London, UK. Electronic address: yuezhou.zhang@kcl.ac.uk.

Amos A Folarin (AA)

King's College London, London, UK; University College London, London, UK; South London and Maudsley NHS Foundation Trust, London, UK; Health Data Research UK London, University College London, London, UK.

Judith Dineley (J)

King's College London, London, UK; University of Augsburg, Augsburg, Germany.

Pauline Conde (P)

King's College London, London, UK.

Valeria de Angel (V)

King's College London, London, UK.

Shaoxiong Sun (S)

King's College London, London, UK.

Yatharth Ranjan (Y)

King's College London, London, UK.

Zulqarnain Rashid (Z)

King's College London, London, UK.

Callum Stewart (C)

King's College London, London, UK.

Petroula Laiou (P)

King's College London, London, UK.

Heet Sankesara (H)

King's College London, London, UK.

Linglong Qian (L)

King's College London, London, UK.

Faith Matcham (F)

King's College London, London, UK; School of Psychology, University of Sussex, Falmer, East Sussex, UK.

Katie White (K)

King's College London, London, UK.

Carolin Oetzmann (C)

King's College London, London, UK.

Femke Lamers (F)

Department of Psychiatry and Amsterdam Public Health Research Institute, Amsterdam UMC, Vrije Universiteit, Amsterdam, the Netherlands.

Sara Siddi (S)

Parc Sanitari Sant Joan de Déu, Fundació Sant Joan de Déu, CIBERSAM, Universitat de Barcelona, Barcelona, Spain.

Sara Simblett (S)

King's College London, London, UK.

Björn W Schuller (BW)

University of Augsburg, Augsburg, Germany.

Srinivasan Vairavan (S)

Janssen Research and Development LLC, Titusville, NJ, USA.

Til Wykes (T)

King's College London, London, UK; South London and Maudsley NHS Foundation Trust, London, UK.

Josep Maria Haro (JM)

Parc Sanitari Sant Joan de Déu, Fundació Sant Joan de Déu, CIBERSAM, Universitat de Barcelona, Barcelona, Spain.

Brenda W J H Penninx (BWJH)

Amsterdam University Medical Centre, Vrije Universiteit and GGZ inGeest, Amsterdam, Netherlands.

Vaibhav A Narayan (VA)

Janssen Research and Development LLC, Titusville, NJ, USA; Davos Alzheimer's Collaborative, Geneva, Switzerland.

Matthew Hotopf (M)

King's College London, London, UK; South London and Maudsley NHS Foundation Trust, London, UK.

Richard J B Dobson (RJB)

King's College London, London, UK; University College London, London, UK; South London and Maudsley NHS Foundation Trust, London, UK; Health Data Research UK London, University College London, London, UK.

Nicholas Cummins (N)

King's College London, London, UK. Electronic address: nick.cummins@kcl.ac.uk.
www.radar-cns.org.

MeSH classifications