Statistical Science

Statistical Science Seminars

A seminar series covering a broad range of applied and methodological topics in Statistical Science.

Talks now take place in a hybrid format: for all talks, you can attend in person or via Zoom. 

Some presentations will be given remotely. 

Usual time: Thursdays 14:00-15:00

Location: A lecture theatre at UCL (Gower St, London WC1E 6BT; please check each week for the exact venue), or Zoom (please email thomas dot bartlett dot 10 at ucl dot ac dot uk to join the mailing list and receive the links to the talks).

Recent talks

Please subscribe to our YouTube channel to view some recent talks from the series:

https://youtube.com/channel/UC6wQjF2n27k6a_TRO4GjPEg

Upcoming talks
 

30 September 2021: Martin Huber (Université de Fribourg) - Double machine learning for sample selection models

We consider the evaluation of discretely distributed treatments when outcomes are only observed for a subpopulation due to sample selection or outcome attrition. For identification, we combine a selection-on-observables assumption for treatment assignment with either selection-on-observables or instrumental variable assumptions concerning the outcome attrition/sample selection process. We also consider dynamic confounding, meaning that covariates that jointly affect sample selection and the outcome may (at least partly) be influenced by the treatment. To control in a data-driven way for a potentially high-dimensional set of pre- and/or post-treatment covariates, we adapt the double machine learning framework for treatment evaluation to sample selection problems. We make use of (a) Neyman-orthogonal, doubly robust, and efficient score functions, which imply the robustness of treatment effect estimation to moderate regularization biases in the machine learning-based estimation of the outcome, treatment, or sample selection models, and (b) sample splitting (or cross-fitting) to prevent overfitting bias. We demonstrate that the proposed estimators are asymptotically normal and root-n consistent under specific regularity conditions concerning the machine learners and investigate their finite sample properties in a simulation study. We also apply our proposed methodology to the Job Corps data for evaluating the effect of training on hourly wages, which are only observed conditional on employment. The estimator is available in the causalweight package for the statistical software R.

7 October 2021: Francesco Ravazzolo (Libera Università di Bolzano) - Dynamic Combination and Calibration for Climate Predictions

We propose a density calibration and combination model that dynamically calibrates and combines predictive distributions. The time-varying calibration and combination weights are fitted by an observation-driven model with dynamics inferred by the score of the assumed conditional likelihood of the data generating process. The model is very flexible and can handle different shapes, instability and model uncertainty. We show this analytically and in simulation exercises. An empirical application to short-term wind speed predictions documents the large instability of individual model performance and their calibration properties, favouring our model in terms of predictive accuracy.

14 October 2021: Almut Veraart (Imperial College London) - High-frequency estimation of the Lévy-driven Graph Ornstein-Uhlenbeck process with applications to wind capacity factor measurements

We consider the Graph Ornstein-Uhlenbeck (GrOU) process observed on a non-uniform discrete time grid and introduce discretised maximum likelihood estimators with parameters specific to the whole graph or specific to each component, or node. Under a high-frequency sampling scheme, we study the asymptotic behaviour of those estimators as the mesh size of the observation grid goes to zero. We prove two stable central limit theorems to the same distribution as in the continuously observed case under both finite and infinite jump activity for the Lévy driving noise. When a graph structure is not explicitly available, the stable convergence allows us to consider purpose-specific sparse inference procedures, i.e. pruning, on the edges themselves in parallel to the GrOU inference and to preserve its asymptotic properties. We apply the new estimators to wind capacity factor measurements, i.e. the ratio of the wind power produced locally to its rated peak power, across fifty locations in Northern Spain and Portugal. We show the superiority of those estimators compared to the standard least squares estimator through a simulation study extending known univariate results across graph configurations, noise types and amplitudes.
This is joint work with Valentin Courgeau (Imperial College London).

21 October 2021: Chris Oates (Newcastle University) - Robust Generalised Bayesian Inference for Intractable Likelihoods

Generalised Bayesian inference updates prior beliefs using a loss function, rather than a likelihood, and can therefore be used to confer robustness against possible misspecification of the likelihood. Here we consider generalised Bayesian inference with a Stein discrepancy as a loss function, motivated by applications in which the likelihood contains an intractable normalisation constant. In this context, the Stein discrepancy circumvents evaluation of the normalisation constant and produces generalised posteriors that are either closed form or accessible using standard Markov chain Monte Carlo. On a theoretical level, we show consistency, asymptotic normality, and bias-robustness of the generalised posterior, highlighting how these properties are impacted by the choice of Stein discrepancy. Then, we provide numerical experiments on a range of intractable distributions, including applications to kernel-based exponential family models and non-Gaussian graphical models. 

29 October 2021: William Da Silva (Sorbonne Université Paris)

Title and abstract TBC.  

4 November 2021: Marta Catalano (University of Warwick) - A Wasserstein index of dependence for Bayesian nonparametric modeling

Optimal transport (OT) methods and Wasserstein distances are flourishing in many scientific fields as an effective means for comparing and connecting different random structures. In this talk we describe the first use of an OT distance between Lévy measures with infinite mass to solve a statistical problem. Complex phenomena often yield data from different but related sources, which are ideally suited to Bayesian modeling because of its inherent borrowing of information. In a nonparametric setting, this is regulated by the dependence between random measures: we derive a general Wasserstein index for a principled quantification of the dependence, gaining insight into the models’ deep structure. It also allows for an informed prior elicitation and provides a fair ground for model comparison. Our analysis unravels many key properties of the OT distance between Lévy measures, whose interest goes beyond Bayesian statistics, extending to the theory of partial differential equations and of Lévy processes.
 

11 November 2021: Pierre Jacob (ESSEC Business School, Paris)

Title and abstract TBC.  

18 November 2021: Matthias Fengler (Universität St.Gallen)

Title and abstract TBC.  

25 November 2021: Tom Rainforth (University of Oxford)

Title and abstract TBC.  

2 December 2021: Stefano Favaro (Università di Torino)

Title and abstract TBC.  

9 December 2021: Fabrizia Mealli (Università degli Studi di Firenze)

Title and abstract TBC.  

16 December 2021: Elena Stanghellini (Università degli Studi di Perugia)

Title and abstract TBC.  

13 January 2022: Emma Simpson (University College London) 

Title and abstract TBC.

20 January 2022: Perla Sousi (University of Cambridge) 

Title and abstract TBC.  

27 January 2022: Katie Harron (University College London) 

Title and abstract TBC.  

3 February 2022: Chris Holmes (University of Oxford) 

Title and abstract TBC.  

10 February 2022: Hakim Dehbi (University College London)

Title and abstract TBC.  

17 February 2022: Kathryn Turnbull (Lancaster University) 

Title and abstract TBC.  

24 February 2022: TBC 

Title and abstract TBC.  

3 March 2022: Olatunji Johnson (University of Manchester) 

Title and abstract TBC

10 March 2022: Mike Daniels (University of Florida)

Title and abstract TBC

17 March 2022: Peter Orbanz (University College London)

Title and abstract TBC

21 March 2022: Georgia Salanti (Universität Bern)

Title and abstract TBC

31 March 2022: Judith Rousseau (University of Oxford)

Title and abstract TBC

7 April 2022: Rebecca Hubbard (University of Pennsylvania) - Considerations for valid analysis of medical product utilization and outcomes from real-world data

Opportunities to use “real-world data,” data generated as a by-product of digital transactions, have exploded over the past decade. In the context of health research, real-world data including electronic health records and medical claims facilitate understanding of treatment utilization and outcomes as they occur in routine clinical practice, and studies using these data sources can potentially proceed rapidly compared to trials and observational studies that rely on primary data collection. However, using data sources that were not collected for research purposes comes at a cost, and naïve use of such data without considering their complexity and imperfect quality can lead to bias and inferential error. Real-world data frequently violate the assumptions of standard statistical methods, but it is not practicable to develop new methods to address every possible complication arising in their analysis. The statistician is faced with a quandary: how to effectively utilize real-world data to advance research without compromising best practices for principled data analysis. In this talk I will use examples from my research on methods for the analysis of electronic health record (EHR)-derived data to illustrate approaches to understanding the data generating mechanism for real-world data. Drawing on this understanding, I will then discuss approaches to identify, use, and develop principled methods for incorporating EHR data into research. The overarching goal of this presentation is to raise awareness of challenges associated with the analysis of real-world data and demonstrate how a principled approach can be grounded in an understanding of the scientific context and data generating process.

28 April 2022: TBC

Title and abstract TBC

5 May 2022: Ioannis Kosmidis (University of Warwick)

Title and abstract TBC

Affiliated Seminars