Prof. Nicolai Meinshausen

Maximin effects in inhomogeneous large-scale data

Abstract

Large-scale data are often characterised by some degree of inhomogeneity, as data are either recorded in different time regimes or taken from multiple sources. We look at regression models and the effect of randomly changing coefficients, where the change is either smooth in time or in some other dimension, or has no such structure at all. Fitting varying-coefficient models or mixture models can be appropriate solutions, but these are computationally very demanding and often try to return more information than necessary. If we just ask for a model estimator that shows good predictive properties for all regimes of the data, then we are aiming for a simple linear model that is reliable for all possible subsets of the data. We propose a maximin effects estimator and look at its prediction accuracy from a theoretical point of view in a mixture model with known or unknown group structure. Under certain circumstances the estimator can be computed orders of magnitude faster than standard penalised regression estimators, making computations on large-scale data feasible. We also show how a modified version of bagging can accurately estimate maximin effects.
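To make the idea of a bagging-style maximin estimate more concrete, here is a minimal sketch under stated assumptions; it is not the speaker's implementation. It assumes the group structure is known, fits ordinary least squares separately in each group, and then combines the group estimates with convex weights chosen to minimise the squared norm of the combined fitted values on the pooled data. The function names `simplex_min_quadratic` and `maximin_aggregate`, the simple Frank-Wolfe solver, and the simulated two-group data are all hypothetical choices made for illustration.

```python
import numpy as np


def simplex_min_quadratic(H, n_iter=2000):
    """Minimise w' H w over the probability simplex (Frank-Wolfe, exact line search)."""
    n_groups = H.shape[0]
    w = np.full(n_groups, 1.0 / n_groups)  # start from uniform weights
    for _ in range(n_iter):
        grad = 2.0 * H @ w
        i = int(np.argmin(grad))           # best vertex of the simplex
        d = -w.copy()
        d[i] += 1.0                        # direction towards that vertex
        curvature = d @ H @ d
        if curvature <= 1e-15:
            break                          # already at the chosen vertex
        step = float(np.clip(-(w @ H @ d) / curvature, 0.0, 1.0))
        if step == 0.0:
            break                          # no descent direction left
        w = w + step * d
    return w


def maximin_aggregate(X_groups, y_groups):
    """Aggregate per-group least-squares fits with convex weights (illustrative)."""
    # One coefficient vector per group, stored as columns of B.
    B = np.column_stack([
        np.linalg.lstsq(X, y, rcond=None)[0]
        for X, y in zip(X_groups, y_groups)
    ])
    X_all = np.vstack(X_groups)
    fitted = X_all @ B                     # each group's model applied to all data
    H = fitted.T @ fitted / X_all.shape[0]
    w = simplex_min_quadratic(H)
    return B @ w, w


# Simulated example (assumed setup): two regimes that agree on the first
# coefficient and disagree in sign on the second.
rng = np.random.default_rng(0)
true_coefs = [np.array([1.0, 1.0]), np.array([1.0, -1.0])]
X_groups = [rng.standard_normal((2000, 2)) for _ in true_coefs]
y_groups = [X @ b + 0.1 * rng.standard_normal(X.shape[0])
            for X, b in zip(X_groups, true_coefs)]

beta, w = maximin_aggregate(X_groups, y_groups)
print("weights:", w)
print("aggregated coefficients:", beta)
```

In this simulated setup the aggregated coefficients should keep the effect shared by both regimes and shrink the effect whose sign differs between them, which matches the aim of a model that predicts reasonably in every regime.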

Background

Prof. Meinshausen is a Professor in the Department of Statistics at ETH Zurich. In 2012 he was a Professor of Statistics at the University of Oxford and in 2007 a postdoctoral fellow at the University of California, Berkeley. He obtained his PhD from ETH Zurich and his MSc in Applied Computational Mathematics from the University of Oxford. He was awarded the Guy Medal from the Royal Statistical Society in 2011 and will give the IMS Medallion Lecture in 2015. He is currently an associate editor for the Journal of the Royal Statistical Society Series B and the Journal of Machine Learning Research.

Personal web page.

Research Interests

Computational Statistics, High-dimensional Data, Regularization, Lasso-type Estimators, Sparsity, Machine Learning, Multiple Testing, Visualizations, Statistics for Astronomy and Climate Science.
