## Statistical Science Seminars

**Usual time**: Thursdays 16:00 - 17:00

**Location**: Room 102, Department of Statistical Science, 1-19 Torrington Place (1st floor).

Some seminars are held in different locations at different times. Click on the abstract for more details.

## 09 February: Ioannis Kosmidis (University College London)

### Beyond Beta regression: modelling bounded-domain variables in the presence of boundary observations

Beta regression is a useful tool for modelling bounded-domain continuous response variables, such as proportions, rates, fractions and concentration indices. One important limitation of beta regression models is that they do not apply when at least one of the observed responses is on the boundary; in such scenarios the likelihood function is simply 0 regardless of the value of the parameters. The relevant approaches in the literature focus on either the transformation of the observations by small constants so that the transformed responses end up in the support of the beta distribution, or the use of a discrete-continuous mixture of a beta distribution and point masses at either or both of the boundaries. The former approach suffers from the arbitrariness of choosing the additive adjustment. The latter approach gives a "special" interpretation to the boundary observations relative to the non-boundary ones, and requires the specification of an appropriate regression structure for the hurdle part of the overall model, generally leading to complicated models. In this talk we rethink the problem and present an alternative model class that leverages the flexibility of the beta distribution, can naturally accommodate boundary observations and preserves the parsimony of beta regression, which is a limiting case. Likelihood-based learning and inferential procedures for the new model are presented, and its usefulness is illustrated by applications.

## 16 February: Simon Lunagomez (University College London)

### Geometric Representations of Random Hypergraphs

*Joint with Edoardo M. Airoldi, Sayan Mukherjee and Robert L. Wolpert*

We introduce a novel parametrization of distributions on hypergraphs based on the geometry of points in R^d. The idea is to induce distributions on hypergraphs by placing priors on point configurations via spatial processes. This prior specification is then used to infer conditional independence models or Markov structure for multivariate distributions. This approach supports inference of factorizations that cannot be retrieved by a graph alone, leads to new Metropolis-Hastings Markov chain Monte Carlo algorithms with both local and global moves in graph space, and generally offers greater control on the distribution of graph features than currently possible. We provide a comparative performance evaluation against the state of the art, and we illustrate the utility of this approach on simulated and real data.

## 23 February: Alexandra Lewin (Brunel University London)

### Bayesian inference on high-dimensional Seemingly Unrelated Regressions

We present a Bayesian Seemingly Unrelated Regressions (SUR) model for associating metabolomics outcomes with genetic variants, allowing for both sparse variable selection and correlation between the outcomes. Previous approaches have either assumed independence between the outcomes (Bottolo et al. 2011, Lewin et al. 2015) or selected predictors jointly for all the outcomes (Bhadra and Mallick 2013, Bottolo et al. 2013).

To overcome some of the computational difficulty with the general SUR model, Zellner and Ando (2010) proposed a reparametrisation of the model in which the likelihood factorises completely into a product of conditional distributions, and used a Direct Monte Carlo (DMC) approach to estimate the posterior. We extend their work by allowing for a more general prior distribution, and we show that it is possible to build a Gibbs-DMC sampler without the need for re-sampling. The proposed method is applied both to simulated data, to illustrate the computational gains, and to a real metabolomics analysis where the dimension of the data precludes the use of the traditional sampler.

## 02 March: Benjamin Guedj (Institut national de recherche en informatique et en automatique)

tba

## 09 March: Mihai Cucuringu (The Alan Turing Institute)

tba

## 16 March: Codina Cotar (University College London)

tba

## 23 March: Thomas Bartlett (University College London)

tba

## 30 March: Marta Blangiardo (Imperial College London)

tba

## 06 April: Andrew Titman (Lancaster University)

tba

## 20 April: Alessandra Cipriani (University of Bath)

tba