Statistical Science


Statistical Science Seminars

Usual time: Thursdays 14:00-15:00

Location: Room 102, Department of Statistical Science, 1-19 Torrington Place (1st floor). Some seminars are held at different locations and at different times.  Please click on the abstract for further details.


10 January 2019: Dr. Zahra Abdulla (King's College London)

How to bring the fun back to Statistics teaching: Inclusive practices to combat statistical anxiety

Major challenges for those teaching Statistics to non-statisticians include the high levels of statistical anxiety amongst students, students' perceptions of their past experience of learning statistics or mathematics, and the potential negative impact of these attitudes and beliefs on how students learn statistics.

This talk aims to showcase how different types of inclusive practice activities and assessment methods, constructively aligned with the learning outcomes, can help develop students’ confidence in the classroom by providing a supportive learning environment that builds trust, sets expectations and makes statistics fun.

24 January 2019: Dr. Martin Lopez-Garcia (University of Leeds)

A unified stochastic modelling framework for the spread of nosocomial infections

In recent years, a number of stochastic models have been proposed for analysing the spread of nosocomial infections in hospital settings. These models often account for a number of factors governing the spread dynamics: spontaneous patient colonization, patient-staff contamination/colonization, environmental contamination, patient cohorting, or the hand-washing compliance levels of health-care workers (HCWs). For each model, tailor-designed methods are implemented in order to analyse the dynamics of the nosocomial outbreak, usually by studying quantities of interest such as the reproduction number of each agent in the hospital ward, which is typically computed by means of stochastic simulations or deterministic approximations. In this work, we propose a highly versatile stochastic modelling framework that can account for all these factors simultaneously, and analyse the reproduction number of each agent in the hospital ward during a nosocomial outbreak, in an exact and analytical way. By means of five representative case studies, we show how this unified modelling framework encompasses, as particular cases, many of the existing models in the literature. We implement various numerical studies through which we: i) highlight the importance of maintaining high hand-hygiene compliance levels by HCWs, ii) support infection control strategies such as improved environmental cleaning during an outbreak, and iii) show the potential of some HCWs to act as super-spreaders during nosocomial outbreaks.


31 January 2019: Dr. Alex Lewin (London School of Hygiene and Tropical Medicine)

Sparse variable and covariance selection for high-dimensional seemingly unrelated Bayesian regression

High-throughput technology for molecular biomarkers is increasingly producing multivariate phenotype data exhibiting strong correlation structures. Existing approaches for combining such data with genetic variants for multivariate Quantitative Trait Loci analysis generally either ignore genetic marker and phenotype correlations or make other restrictive assumptions about the associations between phenotypes and genetic loci. Here we present a Bayesian Variable Selection (BVS) model with sparse variable and covariance selection for high-dimensional Seemingly Unrelated Regressions (SUR). The model allows different phenotype responses to be associated with different genetic predictors (a seemingly unrelated regressions framework). A general and sparse covariance structure is allowed for the residuals, relating to the conditional dependencies between phenotype variables, with a graphical modelling prior. The graphical structure amongst the multivariate responses can be estimated as part of the model. To make exploration of the large model space computationally feasible, we exploit a factorisation of the covariance matrix parameter to enable faster computation using Markov Chain Monte Carlo (MCMC) methods. We are able to infer associations with thousands of predictors on hundreds of responses. We illustrate the model using a dataset of 158 metabolites measured by NMR spectroscopy and over 9000 Single Nucleotide Polymorphisms, measured in a cohort of more than 5000 people.


14 February 2019: Dr. Tengyao Wang (University College London)

Isotonic regression in general dimensions

We study the least squares regression function estimator over the class of real-valued functions on [0,1]^d that are increasing in each coordinate. For uniformly bounded signals and with a fixed, cubic lattice design, we establish that the estimator achieves the minimax rate of order n^{-min(2/(d+2),1/d)} in the empirical L_2 loss, up to poly-logarithmic factors. Further, we prove a sharp oracle inequality, which reveals in particular that when the true regression function is piecewise constant on k hyperrectangles, the least squares estimator enjoys a faster, adaptive rate of convergence of (k/n)^{min(1,2/d)}, again up to poly-logarithmic factors. Previous results are confined to the case d ≤ 2. Finally, we establish corresponding bounds (which are new even in the case d=2) in the more challenging random design setting. There are two surprising features of these results: first, they demonstrate that it is possible for a global empirical risk minimisation procedure to be rate optimal up to poly-logarithmic factors even when the corresponding entropy integral for the function class diverges rapidly; second, they indicate that the adaptation rate for shape-constrained estimators can be strictly worse than the parametric rate.
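
For readers unfamiliar with the estimator, the following is a minimal sketch of the isotonic least squares fit in the simplest case d = 1, computed with the classical pool-adjacent-violators algorithm. It is an illustration only and not the general-dimension procedure analysed in the talk; the function name and the example data are hypothetical.

```python
def isotonic_least_squares_1d(y):
    """Return the non-decreasing sequence closest to y in squared error.

    Illustrative d = 1 case only (pool adjacent violators); the function
    name and interface are assumptions, not taken from the talk.
    """
    blocks = []  # each block stores [sum of values, number of values]
    for value in y:
        blocks.append([float(value), 1])
        # Merge the last two blocks while their means violate monotonicity.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        # Every point in a block is fitted by the block mean.
        fitted.extend([total / count] * count)
    return fitted


# Hypothetical data: the violating pair (3, 2) is pooled to its mean 2.5.
print(isotonic_least_squares_1d([1, 3, 2, 4]))  # [1.0, 2.5, 2.5, 4.0]
```

The fitted values are piecewise constant, which is the kind of structure to which the adaptive rate in the abstract refers.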

07 March 2019: Dr. Silvia Liverani (Queen Mary University of London)

Model selection for Dirichlet process mixture models

Dirichlet process mixture models are popular for model-based clustering, and they are used in numerous fields (machine learning, epidemiology, genetics, and so on). In this talk I will focus on profile regression, a Dirichlet process mixture model which links a response model to a covariate profile. Then I will discuss model selection for Bayesian nonparametric models when the objective is clustering. Extracting a unique partition from the partitions sampled from the posterior distribution is a challenging task. Numerous methods have been proposed, but in practice they can lead to very different partitions, making interpretation difficult.
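
To see why partitions are random objects in this setting, the sketch below draws a partition from the Chinese restaurant process, the distribution over partitions induced by a Dirichlet process. It samples from the prior only, not from the posterior of a profile regression model, and the function name, the concentration value and the seeds are assumptions made for illustration.

```python
import random


def sample_crp_partition(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a Chinese restaurant process.

    alpha > 0 is the concentration parameter. Prior draw only; this is an
    illustrative sketch, not the sampler used in the talk.
    """
    rng = random.Random(seed)
    labels = []
    cluster_sizes = []
    for _ in range(n_items):
        # Join existing cluster k with weight size_k, or open a new
        # cluster with weight alpha.
        weights = cluster_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(cluster_sizes):
            cluster_sizes.append(1)
        else:
            cluster_sizes[k] += 1
        labels.append(k)
    return labels


# Two draws with different (hypothetical) seeds will generally disagree.
print(sample_crp_partition(10, alpha=1.0, seed=1))
print(sample_crp_partition(10, alpha=1.0, seed=2))
```

Different draws typically give different numbers and sizes of clusters, which is why summarising many sampled partitions by a single representative one is not straightforward.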

14 March 2019 15:30: Prof. Jinyuan Chang (Southwestern University of Finance and Economics, China)

A new scope of penalized empirical likelihood with high-dimensional estimating equations

Statistical methods with empirical likelihood (EL) are appealing and effective, especially in conjunction with estimating equations, for flexibly and adaptively incorporating data information. It is known that EL approaches encounter difficulties when dealing with high-dimensional problems. To overcome these challenges, we begin our study by investigating high-dimensional EL from a new scope targeting high-dimensional sparse model parameters. We show that the new scope provides an opportunity for relaxing the stringent requirement on the dimensionality of the model parameters. Motivated by the new scope, we then propose a new penalized EL by applying two penalty functions, regularizing respectively the model parameters and the associated Lagrange multiplier in the optimizations of EL. By penalizing the Lagrange multiplier to encourage its sparsity, a drastic dimension reduction in the number of estimating equations can be achieved. Most attractively, such a reduction in dimensionality of estimating equations can be viewed as a selection among those high-dimensional estimating equations, resulting in a highly parsimonious and effective device for estimating high-dimensional sparse model parameters. Allowing the dimensionalities of both the model parameters and the estimating equations to grow exponentially with the sample size, our theory demonstrates that our new penalized EL estimator is sparse and consistent, with asymptotically normally distributed nonzero components. Numerical simulations and a real data analysis show that the proposed penalized EL performs promisingly.

21 March 2019: Dr. Rebecca Turner (University College London)

Incorporating external evidence on between-trial heterogeneity in network meta-analysis

In a network meta-analysis, the results from studies evaluating multiple different treatment comparisons are modelled simultaneously, and summary findings for each comparison are based on a combination of direct and indirect evidence. Between-study heterogeneity variances for individual comparisons are often very imprecisely estimated because data are sparse. External evidence can provide relevant empirical distributions, which can be used as informative prior distributions for heterogeneity. We explore approaches for specifying informative prior distributions for multiple heterogeneity variances in a network meta-analysis.


Affiliated Seminars