UCL Psychology and Language Sciences


Speech Perception and Production


The major themes of our current research are:
- Learning, development and plasticity for first and second languages
- Speaker-listener interaction in communicative discourse
- Prosody, focusing on computational modelling, emotional prosody and acquisition
- Cognitive and neural mechanisms underlying the robustness and flexibility of spoken language processing
- Sociophonetics

Currently Funded Projects

Completed Projects 


Key researchers:


Patti Adank

My research focuses on the cognitive and neural mechanisms underlying the robustness and flexibility of human spoken language processing. The acoustic speech signal is inherently variable, for instance due to background noise, differences in speakers’ anatomy and physiology, speaking style, regional or socio-economic background, or language background. Yet speech perception remains remarkably stable, and listeners are even able to adapt quickly to novel sources of variation in the acoustic signal, such as a speaker’s foreign or regional accent.

I combine behavioural and neuroimaging research. My behavioural research consists of studies on speech perception and production, and my neuroimaging research focuses on the neural bases of on-line adaptation in spoken language comprehension and production. Past neuroimaging studies have reported the involvement of neural bases for speech production in speech comprehension tasks. However, this involvement is only present under adverse listening conditions, such as speech in noise, or when the signal has been distorted, for instance by time-compressing it. In my research I test the possibility that this involvement of production regions is specific to on-line adaptation and learning and underlies the robustness of human speech comprehension. My present and future research uses traditional psycholinguistic paradigms and functional neuroimaging methods such as fMRI and Transcranial Magnetic Stimulation.

Recently, I have become interested in the role of vocal imitation in speech and am exploring the possibility that both overt and covert imitation may optimise speech perception.





Bronwen Evans

My research focuses on variability in the speech signal. By combining theory and methodologies from the disciplines of sociophonetics, experimental phonetics and speech perception, my work aims to address how listeners adjust perceptually to variation within their own language. I am also interested in the relationship between speech production and perception, and how speakers adapt their speech production during interaction and over their life span. My work in sociophonetics also draws on ideas from second language learning, where my work has focused on investigating the development of phonetic perception in second language learners.




Andrew Faulkner (Emeritus)

Hearing: especially pitch perception; psychoacoustics of impaired hearing; hearing with a cochlear implant.

Speech perception: especially perception of degraded and distorted speech; effects of hearing impairment and cochlear implants.

Valerie Hazan (Emeritus)

Recently, most of my research has been concerned with speaker/listener interaction in speech communication, i.e. how speakers adapt the characteristics of their speech production to achieve effective communication in good and poor listening conditions. In a series of three ESRC-funded projects, we have investigated the acoustic-phonetic characteristics of speaker-listener interactions in young adults, children with normal and impaired hearing, and older adults. The emphasis has been on ‘clear speech’ adaptations made in challenging communicative conditions.

More generally, I am interested in within- and between-speaker variability in speech perception and production, and in the development of speech perception and production in typical and atypical populations.

I have also carried out research on speech perception in second-language learners, on the effectiveness of auditory-visual phonetic training and on audiovisual speech perception.



Paul Iverson

My work primarily examines plasticity for speech perception (e.g., how the perceptual processes of individuals adapt as they learn their first language in childhood, learn additional languages as adults, encounter unusual accents, use auditory prostheses such as cochlear implants, or understand speech under noisy conditions). My aims have been to investigate where and how adaptations are made in the speech processing pathway (e.g., in auditory, phonetic, phonological, or lexical processing), as well as how plasticity is altered by age and prior experience.

At present, much of my work examines these issues using EEG for both adults and infants, with a particular focus on accents and language learning. I've also been active in developing new training techniques to improve second-language phoneme perception, particularly using mobile devices such as the iPhone.

I also have research interests in music perception (e.g., musical timbre and auditory stream segregation).

Stuart Rosen

My research is broadly based in hearing and speech, with an emphasis on the interface between the two. Over the years I have studied a variety of aspects of auditory perception (speech and non-speech) in adults and children, in both disordered and normal populations. I initially came to UCL to work on cochlear implants, and an important part of my current work still concerns this. One strand concerns optimising the transmission of voice melody information in multi-channel implants, and a current project concerns the optimal combination of electrical stimulation in one ear with residual hearing in the other (drawing heavily on the expertise developed concerning the auditory processing of people with profound hearing impairment). A separate offshoot of this work concerns the adaptation of cochlear implant users to their new kind of hearing. Incomplete electrode insertion means that patients experience an upward spectral shift in auditory information, and there have been claims that this may be crucial in limiting implant patient performance. Our team was the first to show that people can adapt readily to such spectral shifts, and many further studies have followed, concerning both cochlear implants in particular and the nature of perceptual adaptation more generally.

I spent many years investigating some of the most basic aspects of auditory processing. Consideration of the auditory filtering properties of hearing-impaired listeners led our group to develop a simple description of auditory filtering as a function of level that clarified the nature of the nonlinearity in both normal and impaired hearing. My current interests in this area are more concerned with other decompositions of auditory information, in particular into fine-structure and envelope information. I proposed a classification of the temporal properties of the speech wave more than 15 years ago, and I am still developing various aspects of it.

Over the last 10 years, much of my work has shifted focus to more central auditory processes. I am involved in studies of functional brain imaging (PET and fMRI), in attempts to determine the neural substrates for speech and nonspeech processing. I have also been investigating auditory processing in people with specific language impairment and dyslexia, arguing that although auditory processing deficits appear to be more common in such groups, they do not appear to play a causal role in the language deficits. Related work concerns the notion of auditory processing disorder and its implications for development.





Yi Xu

My research is primarily concerned with the basic mechanisms of speech production and perception in connected discourse in general, and speech prosody in particular. My work also concerns computational modelling and automatic synthesis of speech, computational modelling of the neural process of speech acquisition, and emotions in speech.

Christopher Carignan

Perhaps the most defining characteristic of our species is the complexity of speech to communicate meaning. Through muscular control of a relatively small portion of the body (the vocal tract), a speaker is able to modify the vibration of air molecules as a vessel for transmitting a mental concept to a listener. My research involves using a wide variety of state-of-the-art technologies (real-time MRI, ultrasound tongue imaging, electromagnetic articulometry, nasalance, laryngography) to investigate how speakers coordinate vocal tract articulators to produce speech sounds, how this shaping of the vocal tract affects the acoustic output, and how these acoustic changes are perceived by listeners. Knowledge of these aspects of speech production and perception can help explain sound patterns that we observe as languages evolve over time, predict future language evolution, and teach us about the physical and cognitive characteristics of our shared capacity for human language.

Carolyn McGettigan

I'm an experimental psychologist and cognitive neuroscientist researching the perception and production of the human voice.

With funding from the Economic and Social Research Council, The Royal Society, and The Leverhulme Trust, research in my lab has explored the psychology and neuroscience of voice processing. Behavioural and neuroimaging studies have addressed the perception of vocal identity, traits, and emotions, including how these are recognised and learned. Using combined MRI of the vocal tract and the brain, we also study the neural bases of flexible sensorimotor control for speech and voice production.

Emma Holmes

I’m interested in how we perceive sounds in challenging listening environments, such as understanding what a friend's saying when there are other conversations going on around us. In particular, I'm interested in how auditory cognition (e.g., attention and prior knowledge) affects our perception of speech and other sounds, and how these processes are affected by hearing loss. My research combines behavioural techniques (e.g., auditory psychophysics), cognitive neuroscience (e.g., EEG, MEG, and fMRI), and computational modelling.

Josef Schlittenlacher

I'm interested in auditory perception and in computational models of it, covering both the underlying neuroscientific mechanisms and applied, realistic scenarios. With speech being the most important sound for humans, the biggest part of my research revolves around speech perception. I use and develop deep learning and other machine learning methods to help answer my research questions, as part of computational models and for new applications in the field.