UCL Psychology and Language Sciences


Cortical feedback over long delays into auditory cortices supports real-world language processing

Gregory Cooper, George Blackburne, and Jeremy I Skipper



In natural environments, speech perception and language comprehension unfold over multiple timescales. Linguistic ambiguity is present across these timescales, from ‘lower-level’ phonology (e.g., ‘lack of invariance’) to sentence-level syntax and semantics. Research suggests that the brain resolves these ambiguities by leveraging contextual information (e.g., observed mouth movements or prior words) to constrain processing through prediction of forthcoming acoustic information. This is consistent with the feedback-dominated anatomical connectivity of auditory cortices. Thus, we hypothesised that early auditory and ‘language’ regions receive delayed connectivity from the whole brain, over the extended timescales across which language comprehension unfolds ‘in the wild’. Furthermore, we expected this to occur in a manner that is sensitive to the temporal extent of available contextual information.


We apply a novel technique for estimating delayed connectivity, in milliseconds, between regions of the cortex to the ‘Naturalistic Neuroimaging Database’, in which 86 participants each watched one of ten feature-length movies during functional magnetic resonance imaging (movie-fMRI). We define delay as the lag at which the estimated cross-correlation function is maximal, computed between the timeseries of each of 770 regions of interest covering the cortex and every other voxel in the brain. We compute aggregate group-level statistics across the resulting ‘delay-maps’. We replicate and extend this analysis using the ‘narratives’ fMRI database, wherein participants listened to both intact and scrambled versions of the same story.
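As an illustration of the delay definition above, the following is a minimal sketch of reading a delay off the peak of a cross-correlation function between two timeseries. It is not the technique used in this study: it resolves delays only to whole samples rather than milliseconds, and the function name `peak_lag`, the sampling interval `tr`, and the search window `max_lag` are all assumptions introduced for the example.

```python
import numpy as np

def peak_lag(seed, target, tr=1.0, max_lag=10):
    """Return (delay_in_seconds, peak_correlation).

    The delay is the lag that maximises the cross-correlation between
    the two z-scored timeseries. A positive delay means the target
    follows the seed. `tr` and `max_lag` are illustrative assumptions.
    """
    seed = (seed - seed.mean()) / seed.std()
    target = (target - target.mean()) / target.std()
    n = len(seed)
    lags = np.arange(-max_lag, max_lag + 1)
    corr = np.empty(len(lags))
    for i, k in enumerate(lags):
        if k >= 0:
            # pair seed[t] with target[t + k]
            corr[i] = np.mean(seed[:n - k] * target[k:])
        else:
            # pair seed[t + |k|] with target[t]
            corr[i] = np.mean(seed[-k:] * target[:n + k])
    best = int(np.argmax(corr))
    return lags[best] * tr, corr[best]

# Synthetic check: the target is the seed shifted 4 samples later, plus noise.
rng = np.random.default_rng(0)
seed = rng.standard_normal(500)
target = np.roll(seed, 4) + 0.1 * rng.standard_normal(500)
delay, r = peak_lag(seed, target, tr=1.0, max_lag=10)
```

In this synthetic case the peak falls at the 4-sample shift that was built into `target`. Applying the same function from one region-of-interest timeseries to every voxel would yield one ‘delay-map’ per region, under the assumptions stated above.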


There was a significant global preference for non-instantaneous functional connectivity across the brain in the movie-fMRI data (whole-brain median across voxels = 1.09 seconds). The longest delays were those feeding bilaterally into the calcarine and central sulci and the transverse temporal and superior temporal gyri (STG). In auditory cortices, the median delay was 4.86 seconds (M = 5.73; SD = 4.95; Max = 51.38). Delay topographies were stable across the start, middle, and end of movies. Results were replicated in the narratives-fMRI data, with the longest delays again being in auditory cortices (Median = 9.73). Both the duration of delays and the number of delayed connections feeding into the left STG were reduced for scrambled versus intact stories (Median = -3.25; Count = -18.6%).


In two studies, we reveal a whole-brain gradient of delayed connectivity, wherein the longest delays were typically observed in sensory, motor, and ‘language’ regions. The temporal extent of delays in putative ‘language regions’ is sensitive to the temporal extent of available contextual information, as evidenced by their attenuation during scrambled narrative listening. Overall, these results support a model of the neurobiology of language in which contextual information is used to predict acoustic input in early auditory cortices. More generally, the results argue against models in which auditory cortices are the first step in a processing hierarchy that ends in some putative ‘higher-level’ regions.