
Frontiers in Statistical Science - IMSS Annual Lecture

06 June 2024, 12:00 pm–5:00 pm

Join the Institute for Mathematical and Statistical Sciences (IMSS) as we welcome Prof Andrea Rotnitzky (University of Washington) for our first annual lecture, organised by the Department of Statistical Science.

This event is free.

Event Information

Open to: All
Availability: Yes
Cost: Free
Organiser: UCL IMSS
Location: Denys Holland Lecture Theatre, 4-8 Endsleigh Gardens, London WC1H 0EG

The Institute for Mathematical and Statistical Sciences (IMSS) is delighted to host Prof Andrea Rotnitzky (University of Washington), who will deliver the first IMSS annual lecture “Frontiers in Statistical Science” on the theme of 'Causal Inference'.

Schedule:

13:00-13:30: Registration
13:30-15:15: Three 30-minute talks, each followed by a 5-minute Q&A
15:15-15:45: Coffee Break
15:45-17:00: Keynote
17:00: Reception

Also joining us are Alicia Curth (University of Cambridge), Qingyuan Zhao (University of Cambridge) and Robin Evans (University of Oxford), who will each deliver a talk; details below.

Alicia Curth (University of Cambridge)

Title: “Would this work for Time-to-Event Data Too?” Cautionary Tales on Extending Modern Approaches for Treatment Effect Estimation to Survival Analyses

Abstract: The toolbox for estimating treatment effects is expanding rapidly, with methods like synthetic controls recently finding immense popularity in economics and questions surrounding the estimation of individualized treatment effects gaining traction in machine learning. Importantly, most of the literature on such modern approaches to treatment effect estimation focusses on standard continuous outcomes and targets differences in means only. Ubiquitous in medical applications, however, are time-to-event outcomes, which lead to the need to perform more intricate survival analyses instead. In this talk, we will explore some of the ways in which survival data can complicate the applicability of modern treatment effect estimation methods developed in the context of standard outcomes. First, we will investigate whether synthetic control methods could serve as an alternative to matching in survival analyses, and encounter obstacles created by differences in assumptions on data-generating mechanisms and target parameters of interest. Then, we will move to heterogeneous treatment effect estimation for time-to-event outcomes, and investigate conceptual and methodological challenges posed by the presence of competing events. 

This talk is based on the following papers:
[1] Curth, A., Poon, H., Nori, A. V., & González, J. (2024). Cautionary Tales on Synthetic Controls in Survival Analyses. Conference on Causal Learning and Reasoning (CLeaR).
[2] Curth, A., & van der Schaar, M. (2023). Understanding the impact of competing events on heterogeneous treatment effect estimation from time-to-event data. In International Conference on Artificial Intelligence and Statistics (AISTATS). 

Qingyuan Zhao (University of Cambridge)

Title: Confounder selection via iterative graph expansion

Abstract: Confounder selection, namely choosing a set of covariates to control for confounding between a treatment and an outcome, is arguably the most important step in the design of observational studies. Previous methods, such as Pearl's celebrated back-door criterion, typically require pre-specifying a causal graph, which can often be difficult in practice. We propose an interactive procedure for confounder selection that does not require pre-specifying the graph or the set of observed variables. This procedure iteratively expands the causal graph by finding what we call "primary adjustment sets" for a pair of possibly confounded variables. This can be viewed as inverting a sequence of latent projections of the underlying causal graph. Structural information in the form of primary adjustment sets is elicited from the user, bit by bit, until either a set of covariates is found to control for confounding or it is determined that no such set exists. Other information, such as the causal relations between confounders, is not required by the procedure. We show that if the user correctly specifies the primary adjustment sets in every step, our procedure is both sound and complete.

This is joint work with F Richard Guo.
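
For orientation, the classical check that motivates the talk, Pearl's back-door criterion, is itself easy to mechanize once a full DAG is in hand; the talk's contribution is precisely avoiding that pre-specification. A minimal self-contained sketch of the classical check (Python; the dict-of-parents encoding is our illustrative choice, and the d-separation test is the textbook moralization argument, not the paper's iterative graph-expansion procedure):

from itertools import combinations

def ancestors(parents, nodes):
    # nodes together with everything reachable by following parent edges upwards
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def d_separated(parents, x, y, z):
    # Moralization test: keep the ancestors of {x, y} and z, marry co-parents,
    # drop direction, and check x cannot reach y without passing through z.
    keep = ancestors(parents, {x, y} | z)
    adj = {v: set() for v in keep}
    for v in keep:
        ps = [p for p in parents.get(v, ()) if p in keep]
        for p in ps:
            adj[v].add(p)
            adj[p].add(v)
        for p, q in combinations(ps, 2):   # "marry" parents sharing a child
            adj[p].add(q)
            adj[q].add(p)
    seen, stack = {x}, [x]
    while stack:
        for w in adj[stack.pop()] - z:     # never step into the conditioning set
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return y not in seen

def satisfies_backdoor(parents, x, y, z):
    nodes = set(parents) | {p for ps in parents.values() for p in ps}
    children = {v: {c for c in nodes if v in parents.get(c, ())} for v in nodes}
    desc, stack = set(), [x]
    while stack:                           # (i) no member of z may descend from x
        for c in children[stack.pop()]:
            if c not in desc:
                desc.add(c)
                stack.append(c)
    if z & desc:
        return False
    trimmed = {v: [p for p in ps if p != x] for v, ps in parents.items()}
    return d_separated(trimmed, x, y, z)   # (ii) z blocks every back-door path

# Example: X <- U -> Y with X -> Y. {U} is a valid back-door set; the empty set is not.
parents = {"U": [], "X": ["U"], "Y": ["U", "X"]}
print(satisfies_backdoor(parents, "X", "Y", {"U"}))  # True
print(satisfies_backdoor(parents, "X", "Y", set()))  # False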

Robin Evans (University of Oxford)

Title: Parameterizing and Simulating from Causal Models

Abstract: Many statistical problems in causal inference involve a probability distribution other than the one from which data are actually observed; as an additional complication, the object of interest is often a marginal quantity of this other probability distribution. This creates many practical complications for statistical inference, even where the problem is non-parametrically identified. In particular, it is difficult to perform likelihood-based inference, or even to simulate from the model in a general way.

We introduce the frugal parameterization, which places the causal effect of interest at its centre, and then builds the rest of the model around it. We do this in a way that provides a recipe for constructing a regular, non-redundant parameterization using causal quantities of interest. In the case of discrete variables we can use odds ratios to complete the parameterization, while in the continuous case copulas are the natural choice.  Our methods allow us to construct and simulate from models with parametrically specified causal distributions, and fit them using likelihood-based methods, including fully Bayesian approaches. Our proposal includes parameterizations for the average causal effect and effect of treatment on the treated, as well as other common quantities of interest. 

More recently we have been working on expanding the scope of the parameterization using machine learning methods, enabling us to obtain much more realistic datasets, including missingness and other biases; time permitting, these will be introduced. I will also discuss some other applications of the frugal parameterization, including to survival analysis, generative modelling, parameterizing nested Markov models, and ‘Many Data’: combining randomized and observational datasets in a single parametric model.

This is joint work with Vanessa Didelez (University of Bremen and BIPS), Xi Lin and Daniel Manela (both Oxford).
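
To make the recipe concrete, here is a minimal simulation sketch under assumed ingredients: a single confounder z, a binary treatment, a N(beta*x, 1) causal margin placed at the centre of the model, and a Gaussian copula tying the outcome back to z. All distributional choices in the Python below are our illustrative assumptions, not the talk's:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, beta, rho = 200_000, 1.0, 0.5

# "The past": a confounder and a treatment that depends on it
z = rng.normal(size=n)
x = rng.binomial(1, 1.0 / (1.0 + np.exp(-z)))

# Causal margin p(y | do(x)) = N(beta * x, 1), coupled to z by a Gaussian
# copula with correlation rho; since z is already standard normal, the
# copula draw reduces to a conditional Gaussian on the Phi^{-1} scale.
eta = rho * z + np.sqrt(1.0 - rho**2) * rng.normal(size=n)
y = stats.norm.ppf(stats.norm.cdf(eta), loc=beta * x, scale=1.0)

# Sanity check: inverse-probability weighting recovers the ATE = beta
ps = 1.0 / (1.0 + np.exp(-z))
print(np.mean(x * y / ps) - np.mean((1 - x) * y / (1 - ps)))  # approximately 1.0

By construction, the interventional marginal of y is exactly the specified causal margin, which is the point of the parameterization: the causal quantity of interest is a parameter of the model rather than a functional to be derived from it afterwards.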

Keynote - Prof Andrea Rotnitzky

Title: Towards a unified theory of inference in semiparametric models for individual-level data fusion problems

Abstract: We tackle the challenge of conducting inference about a smooth finite-dimensional parameter by harnessing individual-level data from various sources. Recent advancements have given rise to a comprehensive framework adept at scenarios where different data sources align with different variation-independent components of the distribution of the target population. While this framework proves effective in significant contexts, such as transporting point exposure treatment effects to an external target population or in off-target policy evaluation under covariate shift, it falls short in certain common data fusion problems, such as two-sample instrumental variable analysis, which integrates data from two sources: one in which only the outcome and the instrument are available, and a second in which only the treatment and the instrument are. Moreover, it lacks the capability to seamlessly integrate data from epidemiological studies with diverse designs, such as prospective cohorts and retrospective case-control studies.

In this talk, I will introduce a novel unified theory that extends the recently developed framework by enabling the fusion of individual-level data from sources aligned with non-variation-independent components of the target population likelihood. It furnishes universal procedures for characterizing the class of all influence functions of regular asymptotically linear estimators and the efficient influence function in any data fusion scenario, irrespective of the number of data sources, the specific parameters of interest or the statistical model for the target population. This paves the way for machine-learning debiased, semiparametric efficient estimation. I will illustrate the application of the theory in several important cases, including the two previously described scenarios.

This work is joint with Ellen Graham and Marco Carone.
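
The fused-data theory itself is beyond a short snippet, but the estimation strategy it paves the way for has a familiar single-source prototype: debiased machine learning built on the efficient influence function. As a toy illustration, here is a cross-fitted augmented IPW (AIPW) sketch for the average treatment effect (Python with scikit-learn; the model choices and names are our assumptions, not the talk's):

import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import KFold

def aipw_ate(X, a, y, n_splits=5, eps=1e-3):
    """Cross-fitted AIPW: average the efficient influence function for the ATE,
    with nuisance models fit on folds held out from the evaluation points."""
    psi = np.zeros(len(y))
    for train, test in KFold(n_splits, shuffle=True, random_state=0).split(X):
        pi = LogisticRegression().fit(X[train], a[train]).predict_proba(X[test])[:, 1]
        pi = np.clip(pi, eps, 1 - eps)               # guard against extreme weights
        treated, control = train[a[train] == 1], train[a[train] == 0]
        mu1 = LinearRegression().fit(X[treated], y[treated]).predict(X[test])
        mu0 = LinearRegression().fit(X[control], y[control]).predict(X[test])
        at, yt = a[test], y[test]
        psi[test] = (mu1 - mu0
                     + at * (yt - mu1) / pi
                     - (1 - at) * (yt - mu0) / (1 - pi))
    return psi.mean(), psi.std() / np.sqrt(len(y))   # estimate and standard error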

About the Speaker

Prof Andrea Rotnitzky

Prof Andrea Rotnitzky is a recipient of the 2022 Rousseeuw Prize for Statistics. Her work lies in the general area of causal inference and semiparametric efficient methods, including modern flexible machine learning methods for causal inference and efficient causal effect estimation in causal graphical models.