Statistical Science Seminars
Usual time: Thursdays 16:00 - 17:00
Location: Room 102, Department of Statistical Science, 1-19 Torrington Place (1st floor).
Some seminars are held in different locations at different times. Click on the abstract for more details.
Optimising pseudo-marginal random walk Metropolis algorithms
Pseudo-marginal MCMC algorithms provide a general recipe for circumventing the need for target density evaluation when calculating the Metropolis-Hastings acceptance probability. Remarkably, replacing the density with an unbiased stochastic estimator thereof still leads to a Markov chain with the desired stationary distribution. We examine the pseudo-marginal random walk Metropolis algorithm and its overall efficiency in terms of both speed of mixing and computational time. Under a frequently encountered regime we identify the optimal acceptance rate and variance of the stochastic estimator. We also provide guidance for more general regimes and close with a surprising conjecture: that in certain regimes choosing the estimator with the largest noise can be optimal.
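The pseudo-marginal mechanism is simple to state concretely. Below is a minimal Python sketch (not the speaker's implementation; the toy target and noise model are illustrative assumptions): a random walk Metropolis chain targeting a standard normal, with the target density replaced by an unbiased noisy estimate obtained via a mean-one lognormal multiplicative perturbation. The essential point is that the estimate at the current state is recycled, never refreshed.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_density(x, sigma_noise=0.5):
    # Unbiased noisy estimate of the N(0,1) target density:
    # multiply the true density by a lognormal factor with mean one.
    true = np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)
    noise = rng.lognormal(mean=-0.5 * sigma_noise**2, sigma=sigma_noise)
    return true * noise

def pm_rwm(n_iter=20000, step=2.0):
    x, est = 0.0, noisy_density(0.0)
    chain = np.empty(n_iter)
    accepts = 0
    for i in range(n_iter):
        x_prop = x + step * rng.standard_normal()
        est_prop = noisy_density(x_prop)  # fresh estimate at the proposal
        # Crucially, the estimate at the current point is reused, not refreshed;
        # this is what preserves the exact stationary distribution.
        if rng.random() < est_prop / est:
            x, est = x_prop, est_prop
            accepts += 1
        chain[i] = x
    return chain, accepts / n_iter

chain, acc_rate = pm_rwm()
```

The realized acceptance rate and the noise level `sigma_noise` are exactly the two tuning quantities whose optimal trade-off the talk addresses.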
High-Dimensional Incremental Divisive Clustering under Population Drift
Clustering is a central problem in data mining and statistical pattern recognition with a long and rich history. The advent of Big Data has introduced important challenges to existing clustering methods in the form of high-dimensional, high-frequency, time-varying streams of data.
Research on Big Data clustering to date has focused almost exclusively on addressing individual aspects of the problem in isolation, largely ignoring whether and how the proposed methods can be extended to address the overall problem. We will discuss an incremental divisive clustering approach for high-dimensional data whose storage requirements are low and, more importantly, independent of the stream size, and which can identify changes in the population distribution that require a revision of the clustering result.
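As a generic illustration of fixed-memory stream clustering (sequential k-means, not the speaker's divisive method), the following Python sketch maintains only k centroids and k counts, so storage is O(kd) regardless of how many points the stream delivers:

```python
import numpy as np

class StreamingKMeans:
    """Sequential (online) k-means: each arriving point nudges its nearest
    centroid, so memory is O(k*d), independent of the stream length."""
    def __init__(self, k, d, seed=0):
        rng = np.random.default_rng(seed)
        self.centroids = rng.standard_normal((k, d))
        self.counts = np.zeros(k)

    def update(self, x):
        # Assign the point to its nearest centroid ...
        j = int(np.argmin(((self.centroids - x) ** 2).sum(axis=1)))
        # ... and move that centroid towards it by a decaying step.
        self.counts[j] += 1
        self.centroids[j] += (x - self.centroids[j]) / self.counts[j]
        return j

# Toy stream from two well-separated Gaussian clusters at +/-5.
rng = np.random.default_rng(5)
km = StreamingKMeans(k=2, d=10)
centers = np.array([np.full(10, 5.0), np.full(10, -5.0)])
for _ in range(1000):
    km.update(centers[rng.integers(2)] + rng.standard_normal(10))
```

Detecting population drift, as in the talk, would additionally require monitoring the assignment statistics over time; this sketch only shows the fixed-memory incremental update.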
Stochastic Claims Reserving: Chain Ladder, Double Chain Ladder and Actuarial Practice
This seminar will consider the problem of setting reserves against future claims in general (non-life) insurance. It will focus on the uncertainty of these estimates and the implications for capital setting and solvency requirements. The seminar will consider the approaches taken in practice and show how relatively simple statistical modelling can be used. The advantage of simple approaches is that they are likely to be understood and used, and they can be used relatively widely and consistently. The disadvantages will also be examined and a new approach set out, which aims to retain as much simplicity as possible while addressing some of the inadequacies of the commonly used techniques.
Fused Community Detection
Community detection is one of the most widely studied problems in network research. In an undirected graph, communities are regarded as tightly-knit groups of nodes with comparatively few connections between them. Popular existing techniques, such as spectral clustering and variants thereof, rely heavily on the edges being sufficiently dense and the community structure being relatively obvious. These are often not satisfactory assumptions for large-scale real-world datasets. We therefore propose a new community detection method, called fused community detection (fcd), which is designed particularly for sparse networks and situations where the community structure may be opaque. The spirit of fcd is to take advantage of the edge information, which we exploit by borrowing sparse recovery techniques from regression problems. Our method is supported by both theoretical results and numerical evidence. The algorithms are implemented in the R package fcd, which is available on CRAN. This is joint work with Dr. Yang Feng (Columbia University) and Prof. Richard Samworth (University of Cambridge).
Recent Results on the Eigenvalues of Random Matrices with Application to Wireless Communications and to MANOVA
The increasing demand for wireless communications has recently generated interest in multiple-input-multiple-output (MIMO) systems, realized by multiple antennas. Such systems can provide great advantages due to multipath propagation, which causes the elements of the channel gain matrix to fluctuate randomly. The channel gain matrix can be well modeled by a random matrix. In particular, the Shannon capacity of MIMO systems depends on the distribution of the eigenvalues of Hermitian matrices, whose dimensions are related to the number of transmitting and receiving antenna elements. In several practical situations, the elements of the channel matrix can be modeled as complex Gaussian random variables, and the wireless system performance is related to the distribution of the eigenvalues of Wishart matrices or complex Gaussian quadratic forms.
In this talk, we present recent results on the distribution of the eigenvalues of complex Wishart matrices and related quadratic forms, with applications to wireless MIMO systems and to spectrum sensing. The case of real Wishart and multivariate Beta matrices is also discussed, with new results on the distribution of Roy's statistic for MANOVA.
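The link between Wishart eigenvalues and MIMO capacity can be made concrete with a short simulation (a standard textbook construction, not the speaker's new results; the antenna counts and SNR are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
nt, nr, snr = 4, 4, 10.0  # transmit/receive antennas, linear SNR

# Rayleigh channel: i.i.d. unit-variance complex Gaussian entries.
H = (rng.standard_normal((nr, nt))
     + 1j * rng.standard_normal((nr, nt))) / np.sqrt(2)

# H H^H is a complex Wishart matrix; its eigenvalues drive the capacity.
lam = np.linalg.eigvalsh(H @ H.conj().T)

# Shannon capacity with equal power allocation (bits/s/Hz):
# C = sum_i log2(1 + (snr/nt) * lambda_i).
capacity = np.sum(np.log2(1 + snr / nt * lam))
```

Averaging `capacity` over many channel draws recovers the ergodic capacity, which is why the eigenvalue distribution of the Wishart matrix is the central object in these analyses.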
EEG/MEG source reconstruction using 'LDA beamforming' and signal-space projection
In EEG/MEG research, beamforming has been used in conjunction with head models to estimate source activity stemming from regions-of-interest by inverting the linear model. I present two novel approaches to inverse modelling of sources without knowledge of a head model. First, linear discriminant analysis (LDA), mostly used for the classification of mental states, can also be applied to reconstruct the time course of discriminatory brain activity. The optimization problems in LDA and LCMV beamforming are shown to be equivalent. Second, multi-component signal-space projection (MSSP) allows for the recovery of several signals of interest and the explicit modelling of noise sources. Empirical results on the analysis of single-trial ERP latencies are shown. Concluding, LDA beamforming and MSSP are purely data-driven approaches that can complement LCMV beamforming and other classical source reconstruction approaches, particularly when a head model is not available or sensor positions have not been registered.
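The algebraic form behind the LDA/LCMV equivalence is easy to exhibit on simulated data. The following sketch is purely illustrative (the toy generative model and all names are assumptions, not the speaker's pipeline): the LDA filter w ∝ C⁻¹(μ₁ − μ₀) has the same structure as an LCMV beamformer w = C⁻¹p / (pᵀC⁻¹p), with the class-difference pattern p playing the role of the leadfield.

```python
import numpy as np

rng = np.random.default_rng(2)
n_channels, n_trials = 8, 200

# Toy two-class sensor data: class 1 adds a fixed spatial pattern
# (the "source topography") on top of shared sensor noise.
pattern = rng.standard_normal(n_channels)
labels = rng.integers(0, 2, n_trials)
X = rng.standard_normal((n_trials, n_channels)) + np.outer(labels, pattern)

mu0, mu1 = X[labels == 0].mean(0), X[labels == 1].mean(0)
# Pooled within-class covariance.
Xc = np.vstack([X[labels == 0] - mu0, X[labels == 1] - mu1])
C = Xc.T @ Xc / (n_trials - 2)

# LDA filter, written in LCMV beamformer form with the difference
# pattern p = mu1 - mu0 in place of the head-model leadfield.
p = mu1 - mu0
w = np.linalg.solve(C, p)
w /= p @ w  # unit-gain normalisation: w'p = 1

source = X @ w  # reconstructed discriminative time course
```

With the unit-gain normalisation, the reconstructed source has class means separated by exactly one unit, mirroring the distortionless-response constraint of LCMV.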
Classification of a mixture of Gaussians from noisy compressive measurements: fundamental limits, designs and geometrical interpretation
Compressive sensing (CS) is an emerging paradigm that offers the means to simultaneously sense and compress a signal without any loss of information. The sensing process is based on the projection of the signal of interest onto a set of vectors, which are typically constituted randomly, and the recovery process is based on the resolution of an inverse problem. The result that has captured the imagination of the signal and information processing community is that it is possible to perfectly reconstruct an n-dimensional s-sparse signal (sparse in some orthonormal dictionary or frame) with overwhelming probability with only O(s log(n/s)) linear random measurements or projections, using tractable l1 minimization methods or iterative methods such as greedy matching pursuit.
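The recovery step can be illustrated with orthogonal matching pursuit, one of the greedy iterative methods mentioned above (a textbook sketch with made-up dimensions, not the speaker's setup). With m ≪ n random Gaussian measurements of an s-sparse signal, the greedy search recovers the signal exactly in the noiseless case:

```python
import numpy as np

rng = np.random.default_rng(1)
n, s, m = 256, 5, 60  # ambient dimension, sparsity, number of measurements

# s-sparse signal and a random Gaussian measurement matrix.
x = np.zeros(n)
support = rng.choice(n, size=s, replace=False)
x[support] = rng.standard_normal(s)
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x  # noiseless compressive measurements

def omp(A, y, s):
    """Orthogonal matching pursuit: greedily pick the column most
    correlated with the residual, then re-fit by least squares."""
    residual, idx = y.copy(), []
    for _ in range(s):
        idx.append(int(np.argmax(np.abs(A.T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, idx], y, rcond=None)
        residual = y - A[:, idx] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[idx] = coef
    return x_hat

x_hat = omp(A, y, s)
```

Here m = 60 comfortably exceeds the O(s log(n/s)) threshold for n = 256, s = 5, which is why exact recovery is typical for random draws of this size.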
The focus of compressive sensing has been primarily on exact or near-exact signal reconstruction from the set of linear signal measurements. However, it is also natural to leverage the paradigm to perform other relevant information processing tasks, such as detection, classification and estimation of certain parameters, from the set of compressive measurements. One could in fact argue that the paradigm is a better fit to information processing tasks such as signal detection, signal classification or pattern recognition rather than signal reconstruction, since it may be easier to discriminate between signal classes than reconstruct an entire signal using only partial information about the source signal.
The focus of this talk is on the classification of a mixture of Gaussians from noisy compressive measurements. By leveraging analogies between the compressive classification problem and the multiple-antenna wireless communications problem, we argue that it is possible to construct performance characterizations that encapsulate not only the standard phase transition notion (which captures the presence or absence of a misclassification error probability floor) but also other more refined notions such as the diversity gain and the measurement gain. We also argue that it is possible to use the new characterizations as a proxy to design optimal projections/measurements for compressive classification problems: such measurements lead to considerable performance gains in relation to random ones. It is also shown that the performance characterizations and the designs are imbued with considerable geometrical significance.
The talk also shows how the fundamental limits associated with the classification of a mixture of Gaussians translate into fundamental limits associated with the reconstruction of the mixture: in particular, we provide a characterization of the number of measurements both necessary and sufficient to reconstruct the mixture that is much sharper than standard characterizations in the literature. In addition, the talk illustrates how such results apply to image compression.
This represents joint work with Hugo Reboredo (University of Porto, Portugal), Francesco Renna (University of Porto, Portugal), Lawrence Carin (Duke University, USA) and Robert Calderbank (Duke University, USA).
Complementarity Models, Stochasticity and their Application to Energy Markets
This talk comprises an introduction to complementarity modeling and its application to energy markets. In the first part of the talk, complementarity problems are defined and their strong relation to optimization and equilibrium problems is pointed out. The second part of the talk contains two applications of complementarity models in energy markets: an electricity spot market equilibrium, and an example of plug-in electric vehicle (PEV) aggregators. Subsequently, uncertainty is introduced in these modeling concepts. In the final part of the talk, complementarity models and their relation to bilevel models – in particular stochastic MPECs – are discussed and a case study is presented.
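A complementarity problem can be made concrete with the simplest instance, the linear complementarity problem (LCP): find z ≥ 0 with w = Mz + q ≥ 0 and zᵀw = 0. The sketch below (a generic textbook example, not one of the talk's energy-market models) solves a tiny LCP by projected Gauss-Seidel:

```python
import numpy as np

# LCP data: find z >= 0 with w = M z + q >= 0 and z'w = 0.
M = np.array([[2.0, 1.0],
              [1.0, 2.0]])
q = np.array([-5.0, -6.0])

def pgs_lcp(M, q, iters=200):
    """Projected Gauss-Seidel: sweep the coordinates, solving each
    scalar complementarity condition and clipping at zero."""
    z = np.zeros_like(q)
    for _ in range(iters):
        for i in range(len(q)):
            # Residual of row i with z[i] removed.
            r = q[i] + M[i] @ z - M[i, i] * z[i]
            z[i] = max(0.0, -r / M[i, i])
    return z

z = pgs_lcp(M, q)
w = M @ z + q
```

In equilibrium models of the kind discussed in the talk, the components of z are typically quantities (generation, flows) and the components of w the corresponding marginal profits or slack constraints, with complementarity encoding "positive activity only at zero slack".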
Please note that this seminar will be held in lecture theatre B17 (1-19 Torrington Place), beginning at 16:00
Nested Markov Models: Capturing Constraints Implied by Unobserved Confounding
Many datasets are plagued by unobserved confounders: hidden but relevant variables. The presence of such hidden variables obscures many conditional independence constraints on the observed margin, and greatly complicates data analysis.
In this talk I introduce a new type of equality constraint which generalizes conditional independence, and which is a ``natural'' equality constraint for data generated from the marginal distribution of a DAG graphical model. I also introduce a new kind of graphical model, called the nested Markov model, which captures these constraints via a simple graphical criterion.
I discuss parameterizations for nested Markov models with discrete state spaces, together with parameter and structure learning algorithms. Finally, I provide some preliminary results on model equivalence. In particular I show cases where a single equality constraint is sufficient to completely recover a nested Markov model, and thus the underlying hidden variable DAG. I also discuss a log-linear parameterization which allows sparse modeling with nested Markov models, and illustrate the advantages of this parameterization with a simulation study.
This is joint work with Thomas S. Richardson, James M. Robins, and Robin J. Evans.
Spatial statistics with Markov properties
In spatial statistics, Gaussian Markov random fields on graphs and lattices have traditionally been viewed as a completely separate approach from continuous covariance functions, and also separate from many other methods developed to handle large data sets and non-stationary phenomena. In this talk, I will show some fundamental connections between several of these approaches, illustrate how Markov models based on stochastic partial differential equations (SPDEs) can be used, and discuss some current and future challenges.
Estimation in the Presence of Many Nuisance Parameters: Composite Likelihood and Plug-in Likelihood
We consider the incidental parameters problem, i.e. the estimation of a small number of parameters of interest in the presence of a large number of nuisance parameters. Assuming that the observations are taken from a multivariate strictly stationary process, two estimation methods are considered: maximum composite quasi-likelihood estimation (MCQLE) and maximum plug-in quasi-likelihood estimation (MPQLE). For the MCQLE, we profile out nuisance parameters based on lower-dimensional marginal likelihoods, while the MPQLE is based on some initial estimators for the nuisance parameters. Asymptotic normality for both the MCQLE and the MPQLE is established under the assumption that the number of nuisance parameters and the number of observations go to infinity together, and both estimators for the parameters of interest enjoy the standard root-$n$ convergence rate. A simulation with a spatial-temporal model illustrates the finite sample properties of the two estimation methods.
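The two ingredients, lower-dimensional marginal likelihoods and plug-in nuisance estimates, can be illustrated on a toy equicorrelated Gaussian model (purely illustrative; the model, dimensions and grid search are assumptions, not the speakers' estimators). The correlation ρ is the parameter of interest; the unknown means are the nuisance parameters, handled by plug-in, and ρ is estimated by maximizing a pairwise (composite) likelihood:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
d, n, rho_true = 6, 500, 0.4
# Equicorrelated Gaussian data; the d unknown means are nuisance parameters.
Sigma = np.full((d, d), rho_true) + (1 - rho_true) * np.eye(d)
mu = rng.standard_normal(d)
X = rng.multivariate_normal(mu, Sigma, size=n)

def pairwise_loglik(rho, X, mu_hat):
    """Composite (pairwise) log-likelihood: sum bivariate normal
    log-densities over all variable pairs, with plug-in means
    (additive constants dropped)."""
    Z = X - mu_hat
    ll = 0.0
    for i, j in combinations(range(X.shape[1]), 2):
        a, b = Z[:, i], Z[:, j]
        ll += np.sum(-0.5 * (a**2 - 2 * rho * a * b + b**2) / (1 - rho**2)
                     - 0.5 * np.log(1 - rho**2))
    return ll

mu_hat = X.mean(axis=0)  # plug-in estimates of the nuisance means
grid = np.linspace(-0.9, 0.9, 181)
rho_hat = grid[np.argmax([pairwise_loglik(r, X, mu_hat) for r in grid])]
```

The pairwise likelihood involves only two-dimensional marginals, so its cost grows with the number of pairs rather than with the full joint dimension, which is the computational appeal of composite likelihood when d is large.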
Page last modified on 08 Oct 12 11:11