UCL Psychology and Language Sciences


The neurobiology of processing multiple emotional cues during language comprehension

Yi Jou (Winnie) Yeh, Mathilde Muraz, Natividad Beltran, and Jeremy I Skipper



How does the brain allow us to understand emotional language in the real world? We have some neurobiological understanding of how emotions conveyed through prosody, facial displays, and word meaning are processed in isolation, but we do not know how the brain uses these sources of information together. Within a predictive processing framework, having multiple informative emotional cues should result in a reduction in activity in ‘language regions’, as the intended acoustic input can be more accurately predicted. Thus, we hypothesised that sensorimotor (e.g., auditory regions for emotional prosody) and ‘theory of mind’ related regions would be relatively engaged when processing individual emotional cues, and that processing two or more cues (e.g., emotional faces, prosody, and semantics) would result in a reduction in activity in ‘language regions’.


Thirty-eight healthy adult participants watched ‘500 Days of Summer’ or ‘Citizenfour’ during fMRI. A separate set of participants rated sentences from these movies on arousal and valence in three versions: the video with the audio track removed (i.e., ‘face’), an audio-only version of the sentences with flat prosody (i.e., ‘semantics’), and a low-pass filtered, incomprehensible version of the sentences (i.e., ‘prosody’). Based on the ratings, sentences were grouped into eight categories for arousal and valence independently, where each cue was considered either emotionally informative or neutral (e.g., ‘high-prosody, neutral-face, neutral-semantics’ compared to ‘high-prosody, high-face, high-semantics’). We used a general linear model in which each sentence category was convolved with a hemodynamic response function modulated by duration. A linear mixed-effects model was used to examine group-level effects.
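The grouping and regressor construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the threshold rule in `categorise`, the single-gamma shape in `gamma_hrf`, and all function and parameter names are hypothetical. The category labels follow the example given in the text.

```python
from itertools import product
import math

import numpy as np

CUES = ("prosody", "face", "semantics")


def category_label(informative):
    """Map {cue: bool} to a label like 'high-prosody, neutral-face, neutral-semantics'."""
    return ", ".join(
        f"{'high' if informative[cue] else 'neutral'}-{cue}" for cue in CUES
    )


# Three cues, each informative or neutral, give 2**3 = 8 categories.
ALL_CATEGORIES = [
    category_label(dict(zip(CUES, combo)))
    for combo in product((False, True), repeat=len(CUES))
]


def categorise(ratings, threshold):
    """Assumed rule: a cue is informative if its mean rating reaches the threshold."""
    return category_label({cue: ratings[cue] >= threshold for cue in CUES})


def gamma_hrf(t, shape=6.0, scale=1.0):
    """Simplified single-gamma HRF (an assumption, not necessarily the study's HRF)."""
    t = np.asarray(t, dtype=float)
    return t ** (shape - 1) * np.exp(-t / scale) / (math.gamma(shape) * scale ** shape)


def duration_regressor(onsets, durations, n_scans, tr, dt=0.1):
    """Boxcar per sentence (width = its duration), convolved with the HRF,
    then sampled once per TR. Times are in seconds."""
    n_fine = int(round(n_scans * tr / dt))
    boxcar = np.zeros(n_fine)
    for onset, duration in zip(onsets, durations):
        start = int(round(onset / dt))
        stop = int(round((onset + duration) / dt))
        boxcar[start:stop] = 1.0
    hrf = gamma_hrf(np.arange(0, 32, dt))
    convolved = np.convolve(boxcar, hrf)[:n_fine] * dt
    step = int(round(tr / dt))
    return convolved[::step][:n_scans]
```

One such regressor per sentence category would form the columns of a first-level design matrix. The group-level mixed-effects step is not sketched here.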


Results show that valence and arousal load more heavily on different systems, e.g., with valence activating more sensorimotor and precuneus regions and arousal activating more early visual and auditory cortices. Individual valence cues also activated different brain regions, e.g., with more informative prosody loading more on auditory regions, faces on visual and motor regions, and semantics on somatosensory and superior temporal regions. Finally, for two or three informative emotional cues compared to one, there was an increase in activity in right lateral visual and premotor regions and bilateral precuneus, with a large bilateral reduction in visual and superior and middle temporal regions.


Our results are consistent with a model of emotional language processing in which a distributed set of brain regions is dynamically engaged to process different emotional cues. The more informative emotional cues are available, the more predictable the acoustic and linguistic input becomes, resulting in a reduction of activity in what are often called ‘language regions’. This seems to come at the cost of activating more sensorimotor regions, perhaps to simulate emotional displays in the voice and on the face, and ‘theory of mind’ related regions, perhaps needed to make more inferences about emotional states. These results provide the first neurobiological framework for understanding the processing of emotional language as it naturally occurs, i.e., in situations where multiple contextual cues can aid in interpretation.