Speech Science Forum 13th May - Prof. Mark Huckvale

13 May 2021, 4:00 pm–5:00 pm

Please join us on May 13th for Prof. Mark Huckvale's talk, "Automated detection of dysphonia from speech and laryngograph signals".

Dr. Antony Scott Trotter – Speech, Hearing and Phonetic Sciences

The Saarbrücken Voice Database (SVD) contains speech and simultaneous electroglottography recordings of 1002 speakers exhibiting a wide range of voice disorders, together with recordings of 851 controls. Previous studies have used this database to build systems for automated detection of voice disorders, but these have varied considerably in the subset of pathologies tested, the audio materials analyzed, the cross-validation method used and the performance metric reported. This variation has made it hard to determine the most promising approaches to the problem. In this talk I will present two studies that use the SVD to build machine-learning systems for automated detection of voice disorders. The first re-implements three recently published systems that have been trained to detect pathology using the speech recordings in the SVD and compares their performance on the same pathologies with the same audio materials using a common cross-validation protocol and performance metric. We show that under this approach, there is much less difference in performance across systems than in their original publications. The second explores how the laryngograph signals in the database might be used to improve the accuracy of pathology detection. We show that the fusion of predictions made from speech and from laryngographic analysis can lead to improved accuracy.

Prof. Mark Huckvale

Professor of Speech Science and Head of the Department of Speech, Hearing and Phonetic Sciences, UCL

Mark's research involves many aspects of speech science and its applications: speech production, hearing, speech perception, speech acquisition, speech technology and computational paralinguistics.

His recent research includes:

  • Centre for Law-Enforcement Audio Research (CLEAR). A joint research centre with Imperial College London that investigated methods for the enhancement of degraded speech signals.
  • KLAIR Virtual Infant. A machine learning toolkit for the study of the computational modelling of early speech acquisition by infants through real-time interaction with caregivers.
  • Avatar Therapy. A project that investigates the use of computer avatars in the provision of therapy for sufferers of auditory hallucinations (hearing voices).
  • iVOICE. A fatigue analysis system for tracking changes in the fatigue of operators of safety-critical systems.
  • ELO-SPHERES. A project for improving binaural hearing aids for hearing-impaired listeners in realistic environments.
