A seminar series covering a broad range of applied and methodological topics in Statistical Science.
Talks take place in a hybrid format.
Usual time: Wednesdays 14:00-15:00 (to be followed by Departmental Tea in the common room).
Location: For the rest of this academic year, talks will usually take place in B09 (1-19 Torrington Place) and on Zoom. Please use the contact information below to join the mailing list, so that you receive location updates and links to the talks.
Contact info: stats-seminars-join at ucl dot ac dot uk
Recent talks
Please subscribe to our YouTube channel to view some recent talks from the series.
The programme for 2023/24 can be found here.
Programme for 2024/25
- 9 October 2024: Rob Cornish (University of Oxford) - Stochastic neural network symmetrisation in Markov categories
Abstract:
For data that exhibit symmetries, it is often of interest to parameterise a neural network that is invariant or equivariant with respect to some group actions. Recently there has been interest in doing so via symmetrisation techniques. In contrast to intrinsic methods, which enforce equivariance at each layer of the architecture, these approaches start with a model that is unconstrained and then modify it in some way to become equivariant.
In this talk, I will present my recent paper [1], which provides a general theory of neural network symmetrisation using the framework of Markov categories. Its central result characterises all possible symmetrisation procedures with considerable generality, requiring essentially no assumptions on the groups, actions, and neural network architectures involved. This recovers all existing symmetrisation methods as special cases, and also extends to provide a novel methodology for stochastic models, a problem that had not previously been considered. Moreover, by using Markov categories (with which I will assume no previous familiarity in this talk), the resulting theory becomes highly conceptual: low-level technical details are abstracted away, and significant parts may be expressed visually via string diagrams in a way that more closely resembles a computer implementation. I will also describe some recent follow-on work that applies stochastic symmetrisation at scale for equivariant diffusion modelling, obtaining significant empirical benefits over previous baselines on molecular generation tasks.
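For orientation, here is a minimal sketch of the basic group-averaging idea that symmetrisation procedures build on (illustrative NumPy code, not the paper's Markov-category construction; the stochastic variant would, roughly, replace the average below with a single uniformly sampled group element):

```python
import numpy as np
from itertools import permutations

def f(x):
    # An arbitrary, unconstrained function standing in for a neural network:
    # nothing about it is invariant.
    w = np.array([0.5, -1.0, 2.0])
    return np.tanh(x) @ w

def symmetrise(f, group):
    # Average f over all group actions; the result is invariant by construction.
    def f_sym(x):
        return np.mean([f(g(x)) for g in group])
    return f_sym

# The symmetric group S_3 acting on R^3 by permuting coordinates.
group = [lambda x, p=p: x[list(p)] for p in permutations(range(3))]

f_inv = symmetrise(f, group)
x = np.array([0.3, -1.2, 0.7])
print(np.isclose(f_inv(x), f_inv(x[[2, 0, 1]])))  # True: invariant under permutation
```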
- 16 October 2024: Mark van de Wilk (University of Oxford) - Gaussian Processes without Matrix Inverses
There is tantalising evidence that using Gaussian processes (GPs) as a layer in deep models could unlock a range of benefits, from being able to learn architecture through Bayesian evidence maximisation (instead of trial-and-error) to better uncertainty quantification. At the beginning of this talk we will review the promise of, and obstacles towards, this goal. We then zoom in on the well-known problem that GPs need to compute matrix decompositions, which makes them ill-suited to modern deep learning hardware, tailored as it is for low-precision parallel computation. We derive an inverse-free version of the standard sparse variational GP bound (recalled below) by introducing an additional variational parameter that targets the matrix inverse we aim to avoid. We then introduce a novel optimisation procedure that is crucial to making the method work. This leads to an inverse-free method that achieves performance comparable with existing approaches on a per-iteration basis, while offering potential advantages in parallelisation and low-precision computation. We hope that this work will pave the way for more efficient and hardware-friendly implementations of Gaussian processes, bringing us closer to the goal of scalable Bayesian deep learning.
We are keen to discuss with other researchers open problems in the area of connecting (deep) Gaussian processes to neural networks. In particular, we believe that connections based on RKHS inner products and stochastic PDE representations of Gaussian processes could be a way forward. Please get in touch if you would like to discuss further.
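For context, the standard sparse variational bound that the talk makes inverse-free is (in standard notation; the inverse-free parameterisation itself is not reproduced here)

$$ \mathcal{L} = \sum_{i=1}^{n} \mathbb{E}_{q(f_i)}\left[\log p(y_i \mid f_i)\right] - \mathrm{KL}\left(q(\mathbf{u}) \,\|\, p(\mathbf{u})\right), \qquad q(f_i) = \int p(f_i \mid \mathbf{u})\, q(\mathbf{u})\, \mathrm{d}\mathbf{u}, $$

where $\mathbf{u}$ are inducing variables. The conditional $p(f_i \mid \mathbf{u})$ has mean $k_i^{\top} K_{uu}^{-1} \mathbf{u}$, so evaluating the bound ordinarily requires $K_{uu}^{-1}$ (or a Cholesky factor); this is the inverse that the additional variational parameter is introduced to avoid.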
- 23 October 2024: Adrien Corenflos (Warwick University) - High-dimensional inference in state-space models via an auxiliary variable trick
TL;DR: I will show how specific auxiliary-variable tricks make it possible to unify gradient-based MCMC (e.g. MALA) and sequential Monte Carlo methods (specifically conditional SMC) to get the best of both worlds.
Abstract:
State-space models, also called hidden Markov models, represent a latent, unobserved Markov chain $X_t$ that can only be accessed through an observation model $p(y_t \mid X_t)$. Because of their latent structure, they exhibit a "decorrelation-over-time" property, whereby the dependency of the last state $X_T$ on the initial one $X_0$ decreases as $T$ increases. This property has been successfully leveraged to obtain efficient particle-filtering-inspired (particle) MCMC samplers for this class of models, with a mixing time scaling roughly as $\log(T)$ in the number of observations. On the other hand, little attention has been given to the scaling of these methods in the dimension $D_y$ of the observations $y_t$, and particle MCMC methods typically exhibit a mixing time scaling as $\exp(D_y)$, mostly because they emulate a form of importance sampling. This is to be compared with the generally much better, often polynomial, mixing time of gradient-based "classical" MCMC samplers.
In this talk, I present an auxiliary-variable technique, successively developed in Finke & Thiery (AOS, 2023), Corenflos & Särkkä (arXiv, 2024+), and Corenflos & Finke (arXiv, 2024+), which combines the strengths of both approaches: it leverages the "decorrelation-over-time" property, thereby obtaining the state-of-the-art $\log(T)$ mixing time, and uses local (gradient) information to also recover (as a theorem) the same scaling as RWMH and (as empirical evidence) the same scaling as MALA in the dimension $D_y$ of $y_t$. Some time will also be spent on methods for incorporating "prior" information into the resulting samplers, à la preconditioned Crank-Nicolson-Langevin. A minimal sketch of the MALA ingredient appears after the reference list below.
Reference papers:
Finke & Thiery (AOS, 2023) https://doi.org/10.1214/22-AOS2252
Corenflos & Särkkä (arXiv, 2024+) https://arxiv.org/abs/2303.00301
Corenflos & Finke (arXiv, 2024+) https://arxiv.org/abs/2401.14868
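As a concrete reminder of the gradient-based ingredient, here is a minimal MALA step (a sketch with an illustrative Gaussian target and hand-picked step size; the samplers in the talk embed such moves within conditional SMC, which is not shown):

```python
import numpy as np

def log_pi(x):
    # Illustrative target: a standard Gaussian.
    return -0.5 * np.sum(x**2)

def grad_log_pi(x):
    return -x

def mala_step(x, h, rng):
    # Langevin proposal: one Euler step along the gradient plus Gaussian noise.
    mean_fwd = x + 0.5 * h * grad_log_pi(x)
    prop = mean_fwd + np.sqrt(h) * rng.standard_normal(x.shape)
    # Metropolis-Hastings correction for the asymmetric Gaussian proposal.
    mean_bwd = prop + 0.5 * h * grad_log_pi(prop)
    log_q_fwd = -np.sum((prop - mean_fwd) ** 2) / (2 * h)
    log_q_bwd = -np.sum((x - mean_bwd) ** 2) / (2 * h)
    log_alpha = log_pi(prop) - log_pi(x) + log_q_bwd - log_q_fwd
    return prop if np.log(rng.uniform()) < log_alpha else x

rng = np.random.default_rng(0)
x = np.zeros(10)
chain = []
for _ in range(1000):
    x = mala_step(x, h=0.5, rng=rng)
    chain.append(x)
```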
- 13 November 2024: Oliver Dukes (Ghent University) - Nonparametric tests of treatment effect heterogeneity for policy-makers
Recent work has focused on nonparametric estimation of conditional treatment effects, but inference has remained relatively unexplored. We propose a class of nonparametric tests for both quantitative and qualitative treatment effect heterogeneity. The tests can incorporate a variety of structured assumptions on the conditional average treatment effect, allow for both continuous and discrete covariates, and do not require sample splitting. Furthermore, we show how the tests are tailored to detect alternatives where the population impact of adopting a personalised decision rule differs from that of a rule that discards covariates. The proposal is thus relevant for guiding treatment policies. Its utility is borne out in simulation studies and a re-analysis of an AIDS clinical trial. This is joint work with Mats Stensrud, Riccardo Brioschi and Aaron Hudson.
- 20 November 2024: Anne McMunn (UCL Institute of Epidemiology and Health Care) - Gender and Unpaid Care Work as a Social Determinant of Health (joint with Women & Diversity in Maths)
This seminar will introduce some of the work of the UCL Research Department of Epidemiology & Public Health on the social determinants of health, including the vast body of evidence on work and job characteristics as important risk factors for chronic illness. In the context of persistent gender inequality in labour market outcomes, it will argue that we need to consider unpaid care work when studying work and health. Gender inequality in unpaid care work is thought to be one of the drivers of gender inequality in labour market participation and career progression, and so potentially limits women's access to the high-quality employment and socioeconomic attainment that are known to be beneficial to health and wider wellbeing outcomes.
- 27 November 2024: Maria Fernanda Pintado (Queen Mary University of London) - Bayesian Partial Reduced-Rank Regression
Reduced-rank (RR) regression may be interpreted as a dimensionality reduction technique able to reveal complex relationships among the data parsimoniously. However, by assuming a low-rank structure on the whole coefficient matrix, RR regression models typically overlook any potential group structure among the responses. To address this limitation, a Bayesian Partial RR (BPRR) regression is proposed, where the response vector and the coefficient matrix are partitioned into low- and full-rank sub-groups. In contrast to the literature, which assumes the group structure and rank to be known, a novel strategy is introduced that treats them as unknown parameters to be estimated.
The main contribution is two-fold: an approach is proposed to infer the low- and full-rank group memberships from the data, and, conditionally on this allocation, the corresponding (reduced) rank is estimated. Both steps are carried out within a Bayesian framework, allowing for full uncertainty quantification, and are based on a partially collapsed Gibbs sampler, which relies on a Laplace approximation of the marginal likelihood and Metropolized Shotgun Stochastic Search to estimate the group allocation efficiently. Applications to synthetic and real-world data demonstrate the potential of the proposed method to reveal hidden structures in the data.
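For orientation, one way to write the model class (a sketch; the talk's exact parameterisation may differ) partitions the responses into a low-rank group $Y_L$ and a full-rank group $Y_F$:

$$ Y_L = X C_L + E_L, \qquad Y_F = X C_F + E_F, \qquad \operatorname{rank}(C_L) = r < \min(p, q_L), $$

with $C_F$ left unrestricted; BPRR treats both the partition of the responses and the rank $r$ as unknown parameters to be inferred.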
- 11 December 2024: Sarah Heaps (Durham University) - Bayesian inference of sparsity in stationary, multivariate autoregressive processes
Abstract: In many fields, advances in sensing technology have made it possible to collect large volumes of time-series data on many variables. In a diverse array of fields such as finance, genetics and neuroscience, a key question is whether such data can be used to learn directed relationships between variables. In other words, do changes in one variable consistently precede those in another? Graphical vector autoregressions are a popular tool for characterising directed relationships in multivariate systems because zeros in the autoregressive coefficient matrices have a natural graphical interpretation in terms of the implied Granger (non-)causality structure. In many applications, it is natural to assume that the underlying process is stable so that, for example, uncertainty in forecasts does not increase without bound as the forecast horizon increases. Though stationarity is commonly stated as an assumption, it is generally not enforced as a constraint, because doing so demands restricting the autoregressive coefficient matrices to lie in a constrained space with a complex geometry, called the stationary region. Moreover, because the number of parameters in the autoregressive coefficient matrices grows quadratically with dimension, it becomes increasingly difficult to learn, with certainty, that a process is stationary.
Working in the Bayesian paradigm, we use a parameter expansion approach to tackle the problem of inference for sparse and stable vector autoregressions by constructing a spike-and-slab prior with support constrained to the stationary region. Computational inference is carried out via a Metropolis-within-Gibbs scheme which uses Hamiltonian Monte Carlo to draw from the full conditional distribution of the continuous parameters. To illustrate our approach to modelling and computational inference, we consider long-term spatio-temporal data on interictal epileptic activity (IEA), abnormal brain activity patterns mostly seen in people with epilepsy. Learning about the drivers of variability in this application has the potential to transform epilepsy treatment, as the IEA rate is thought to underpin the cognitive deficits in this cohort.
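As background, a sketch of the standard ingredients (not the talk's exact prior, whose support is constrained to the stationary region): a VAR($p$) model with an unconstrained spike-and-slab prior on the coefficients reads

$$ X_t = \sum_{i=1}^{p} A_i X_{t-i} + \epsilon_t, \quad \epsilon_t \sim N(0, \Sigma), \qquad (A_i)_{jk} \sim \gamma_{ijk}\, N(0, \tau^2) + (1 - \gamma_{ijk})\, \delta_0, $$

where $\delta_0$ is a point mass at zero. Stability requires all eigenvalues of the companion matrix of $(A_1, \dots, A_p)$ to lie strictly inside the unit circle; the contribution described above is to constrain the prior's support to this region via parameter expansion.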
- 15 January 2025: Julia Hatamyar (University of York Centre for Health Economics) - Learning control variables and instruments for causal analysis in observational data
Abstract: This study introduces a data-driven, machine learning-based method to detect, if they exist, suitable control variables and instruments for assessing the causal effect of a treatment on an outcome in observational data. Our approach tests the joint existence of instruments, which are associated with the treatment but not directly with the outcome (at least conditional on observables), and of suitable control variables, conditional on which the treatment is exogenous, and learns the partition of instruments and control variables from the observed data. The detection relies on the condition that proper instruments are conditionally independent of the outcome given the treatment and suitable control variables; this condition is stated compactly after the author list below. We establish the consistency of our method under certain regularity conditions, investigate its finite sample performance through a simulation study, and provide an empirical application.
Authors: Nicolas Apfel, Julia Hatamyar, Martin Huber, Jannis Kueck
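In symbols, the key testable condition above reads (with illustrative notation: treatment $D$, outcome $Y$, candidate instruments $Z$, control variables $X$)

$$ Z \perp\!\!\!\perp Y \mid (D, X), $$

and the method learns the partition of candidates into instruments and controls by testing candidate splits against this condition.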
- 22 January 2025: Sarah Filippi (Imperial College London) - Bayesian variable and group selection approach for high-dimensional data
Abstract: Few Bayesian methods for analysing high-dimensional sparse data provide scalable variable selection, effect estimation and uncertainty quantification. Most methods either sacrifice uncertainty quantification by computing maximum a posteriori estimates, or quantify the uncertainty at high (unscalable) computational expense. In this talk, we will focus on two specific problem settings: (1) variable selection for high-dimensional sparse survival data and (2) the selection of groups of variables under a generalised linear model. For both settings, we develop an interpretable and scalable model for prediction and variable or group selection. Our method, based on a variational approximation, overcomes the high computational cost of MCMC whilst retaining its useful features: it provides excellent point estimates and offers a natural mechanism for variable/group selection via posterior inclusion probabilities. We derive posterior concentration rates for the group-sparse variational Bayes approach, compare our methods against other state-of-the-art Bayesian variable selection methods on simulated data, and demonstrate their application to variable and group selection on real biomedical data.
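For concreteness, a common form for such a variational family (a sketch; the talk's exact parameterisation may differ) is, for each coefficient $\beta_j$,

$$ q(\beta_j) = \gamma_j \, N(\beta_j; \mu_j, \sigma_j^2) + (1 - \gamma_j)\, \delta_0, $$

so that the fitted $\gamma_j$ plays the role of a posterior inclusion probability and selection can proceed by thresholding, e.g. retaining variables with $\gamma_j > 1/2$; for group selection, the point mass is placed on whole blocks of coefficients.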
- 29 January 2025: Clair Barnes (Imperial College London) - Extreme Event Attribution (and why it matters)
Extreme weather-related events such as heatwaves, floods and wildfires are attracting increasing attention in the media as the effects of climate change become more apparent. Extreme event attribution is an emerging field at the interface between climate science and applied statistics that aims to quantify the extent to which the frequency and intensity of these events can be said to have been affected by human-induced climate change.
The World Weather Attribution initiative (WWA) is a collaboration between climate scientists and climate impact specialists around the world, working to produce rapid, robust assessments of the role of climate change in the days and weeks directly after an extreme event has occurred, while the impacts are still being felt and -- critically -- before public and media attention has moved on. In this talk I will introduce the broader field of attribution science, with a particular focus on the protocols and statistical methods used by WWA scientists to identify the most impactful events and carry out robust attribution studies on these short timescales, and highlight recent developments and key areas of current research.
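A central statistical quantity in such studies is the probability ratio

$$ \mathrm{PR} = \frac{p_1}{p_0}, $$

where $p_1$ is the probability of an event at least as extreme as the one observed in today's (factual) climate and $p_0$ the corresponding probability in a counterfactual climate without human influence; in practice both are typically estimated by fitting extreme-value distributions whose parameters shift with a global warming covariate.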
- 5 February 2025: Hera Shi (University of Cambridge) - Conditional Independence Testing in Time Series: A Primer
Abstract: Granger causality has traditionally been studied under the assumption of a linear vector autoregressive (VAR) model, with tests focusing on the significance of the VAR coefficients. We address the problem of testing a model-free null hypothesis of conditional independence in time series: specifically, whether $Y_{t+1}$ and $X_t$ are conditionally independent given the history of $Y$ up to time $t$. We propose nonlinearly regressing both of them on the history of $Y$ up to time $t$, and calculating a test statistic based on the sample covariance of the residuals, called the Generalized Temporal Covariance Measure (GTCM). The type I error control of the test relies on the relatively weak assumption that the user-chosen regression procedures estimate the conditional means at a sufficiently fast rate, one that is nevertheless attainable in nonparametric settings. By further assuming stability of the regression procedures and weak dependence in the time series, we can utilize the entire dataset to estimate the conditional means without splitting the time series into subsets.
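A sketch of the statistic described above (illustrative notation; studentisation details omitted): writing $H_t$ for the history of $Y$ up to time $t$, and $\hat{m}_Y$, $\hat{m}_X$ for the fitted regressions of $Y_{t+1}$ and $X_t$ on $H_t$,

$$ \hat{T}_n = \frac{1}{n} \sum_{t=1}^{n} \left( Y_{t+1} - \hat{m}_Y(H_t) \right) \left( X_t - \hat{m}_X(H_t) \right), $$

which, suitably normalised, is compared with standard normal quantiles, in the spirit of the generalised covariance measure of Shah & Peters (2020).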
- 12 February 2025: Marton Balazs (University of Bristol) - Road layout in the KPZ class
Abstract: In this talk I will be after a model for road layouts. Imagine a Poisson process on the plane for the start points of cars. Each car picks an independent random direction and goes straight that way for some distance. I will start by showing that the origin (my house, that is) will see a lot of car traffic within an arbitrarily small distance.
This is not what we find in the real world out there. Why? The answer is, of course, coalescence of paths in the random environment provided by hills and other geographic or societal obstacles. This points us towards first passage percolation (FPP), expected to be a member of the KPZ universality class. Due to the lack of results for FPP, we instead built our model in exponential last passage percolation (LPP), known to be in the KPZ class (sketched below). I will introduce LPP, then explain how to construct our road layout model in LPP, and what phenomena we can prove about roads and cars in this model using results from the LPP literature.
(Joint work with Riddhipratim Basu, Sudeshna Bhattacharjee, Karambir Das and David Harper.)
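For readers unfamiliar with the model: exponential LPP on $\mathbb{Z}^2$ attaches i.i.d. weights $\omega_{ij} \sim \mathrm{Exp}(1)$ to lattice sites and defines the last passage time

$$ G(m, n) = \max_{\pi : (1,1) \nearrow (m,n)} \sum_{(i,j) \in \pi} \omega_{ij}, $$

where the maximum runs over up-right lattice paths $\pi$ from $(1,1)$ to $(m,n)$; the maximising paths (geodesics) and their coalescence are the natural candidates for roads in a construction of this kind.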
- 19 February 2025: Anastasia Papavasiliou (University of Warwick) - An inverse function theorem for Ito maps, with application to statistical inference for Random Rough Differential Equations
Abstract: Our goal is to develop a general framework for performing statistical inference for discretely observed random rough differential equations. The first step in our approach is to solve the 'inverse problem' under the assumption of continuous observations, i.e. construct a geometric p-rough path X whose response Y, when driving a rough differential equation, matches the observed trajectory y. We call this the continuous inverse problem and start by rigorously defining its solution. We then develop a framework where the solution can be constructed as a limit of solutions to appropriately designed discrete inverse problems, so that convergence holds in p-variation. Our approach is based on calibrating the rough path approximations, whose limit defines the rough path 'lift' of X, to the observed trajectory y. A core part of the construction is the development of a type of inverse function theorem for Ito maps.
- 26 February 2025: Minmin Wang (University of Sussex) - Random bipartite graphs with i.i.d. weights
Abstract: We propose a model of random bipartite graphs with i.i.d. weights assigned to both parts of the vertex sets. Edges are formed independently with probabilities that depend on these weights. Part of the appeal of this graph model comes from its closely associated intersection graph, which exhibits nontrivial clustering properties and inhomogeneous vertex degrees. Using scaling limit techniques, we gain some insight into what a large bipartite graph from this class looks like at the criticality threshold of connectivity. We show that the geometry of the large connected components in these graphs depends on the tail behaviours of the weight distributions of both parts.
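A toy instance of this model class (a sketch: the edge-probability kernel and weight law below are illustrative and may differ from the talk's exact specification):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 200, 300                     # sizes of the two vertex sets
w = rng.pareto(2.5, size=n) + 1.0   # i.i.d. weights, heavy-tailed for illustration
v = rng.pareto(2.5, size=m) + 1.0

# Edge (i, j) present independently with a weight-dependent probability.
p = np.minimum(np.outer(w, v) / np.sqrt(n * m), 1.0)
A = (rng.random((n, m)) < p).astype(int)   # biadjacency matrix

# Associated intersection graph on the first part: i ~ i' iff they share a
# neighbour in the second part; shared neighbours create nontrivial clustering.
shared = (A @ A.T) > 0
np.fill_diagonal(shared, False)
print(A.sum(), "bipartite edges;", shared.sum() // 2, "intersection-graph edges")
```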
- 5 March 2025: Li Su (University of Cambridge) - Sensitivity analysis with balancing weights estimators to address informative visit times in irregular longitudinal data
Abstract: Irregular longitudinal data with informative visit times arise when patients' visits are partly driven by concurrent disease outcomes. However, existing methods such as inverse intensity weighting (IIW) often overlook, or have not adequately assessed, the influence of informative visit times on estimation and inference. Based on novel balancing weights estimators, we propose a new sensitivity analysis approach to addressing informative visit times within the IIW framework. The balancing weights are obtained by balancing the distributions of observed history variables over time and by including a selection function, with specified sensitivity parameters, to characterize the additional influence of the concurrent outcome on the visit process. A calibration procedure is proposed to anchor the range of the sensitivity parameters to the amount of variation in the visit process that could be additionally explained by the concurrent outcome given the observed history and time. Simulations demonstrate that our balancing weights estimators outperform existing weighted estimators in robustness and efficiency. We provide an R Markdown tutorial of the proposed methods and apply them to analyse data from a clinic-based cohort of patients with psoriatic arthritis.
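For context, the classical IIW weight that this framework extends is, in illustrative notation,

$$ w_i(t) = \frac{1}{\lambda\{ t \mid \bar{H}_i(t) \}}, $$

the inverse of the visit intensity $\lambda$ given the observed history $\bar{H}_i(t)$. The balancing weights described above instead balance observed history distributions directly, with a selection function, indexed by sensitivity parameters, capturing any additional dependence of the visit process on the concurrent outcome.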
- 12 March 2025: John Aston (University of Cambridge)
- 19 March 2025: Uri Shalit
- 26 March 2025: Desi Ivanova
- 2 April 2025: Hui Guo (The University of Manchester) - Fast machine learning causal network analysis using genetic instruments
Abstract: Mendelian randomization (MR) is a popular method that utilises genetic variants as instruments to investigate causal relationships between risk factors and disease outcomes in observational studies. MR sidesteps the issue of confounding (both observed and unobserved) by mimicking randomized controlled trials. Most MR methods require parametric models, and as the sample size and/or the number of variables increases, this approach is likely to become computationally intractable. Network analysis, mainly employing machine learning algorithms (e.g. Bayesian networks), allows many risk factors and disease outcomes to be included simultaneously in a single model, with the aim of identifying direct and indirect effects of the risk factors on the outcomes. This approach does not require a pre-specified parametric model, making it an effective way of exploring data structure by discarding redundant variables (e.g. via the PC algorithm) after testing for marginal and conditional independence between each pair of variables. However, it often assumes that there is no unobserved confounding. I will discuss how one can combine the strengths of both MR and machine learning for fast causal network analysis.
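As a reminder of the simplest parametric MR building block (the single-instrument Wald ratio, not the talk's network method): with genetic instrument $Z$, exposure $D$ and outcome $Y$,

$$ \hat{\beta}_{\mathrm{Wald}} = \frac{\widehat{\mathrm{cov}}(Z, Y)}{\widehat{\mathrm{cov}}(Z, D)}, $$

which identifies the causal effect of $D$ on $Y$ under the usual instrumental-variable assumptions; the computational burden discussed above arises when many such exposures and outcomes are modelled jointly.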
- 9 April 2025: Cris Salvi
- 16 April 2025: Kamelia Daudel
- 30 April 2025: Panagiotis Toulis
- 14 May 2025: Jordan Richards