
Statistical Science Seminars

Usual time: Thursdays 16:00 - 17:00 

Location: Room 102, Department of Statistical Science, 1-19 Torrington Place (1st floor).

Some seminars are held in different locations at different times.  Click on the abstract for more details.

18 April: Zoubin Ghahramani (University of Cambridge)

Bayesian nonparametric modelling of networks

Network data and more generally relational data encoding the pairwise relations between objects appear in many fields.  For instance in biology, a protein network connects interacting partners, while in a social network, links among people indicate social relationships. The problems of analysing, understanding and modelling such networks have attracted interest from many research communities.  I will briefly review some probabilistic approaches to modelling networks.  The key idea behind many such models is that each object has certain latent features, and that observed links in the network depend on these latent features. Probabilistic inference allows one to discover the potentially unbounded number of latent features (including discovering communities as a special case), predict missing links, and generally learn about the statistical properties of the networks. Many of these models can be cast within the theoretical framework of exchangeable arrays established by Aldous, Hoover and Kallenberg. I will describe our work on a general network model (the Random Function Model) that instantiates this theory using Gaussian processes, and relate it to existing models. I will also discuss our work on the Infinite Latent Attribute (ILA) model which allows for a highly structured nonparametric latent variable representation of nodes in a network. Finally, I will describe our Latent Feature Propagation model for dynamic networks. What ties these models together is the idea that rich latent representations underlie the structure of networks, and that these can be discovered via Bayesian inference.

Joint work with Creighton Heaukulani, David A. Knowles, James Lloyd, Peter Orbanz, Konstantina Palla, and Dan Roy.
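
As a rough illustration of the latent-feature idea in the abstract above, the following sketch samples a small undirected network in which each node has binary latent features and the probability of a link depends only on the features of its two endpoints. It is not the Random Function Model, the ILA model or the Latent Feature Propagation model. The function name, the fixed number of features, the logistic link, the bias and the Gaussian weights are all assumptions made for this example, whereas the nonparametric models in the talk allow the number of features to be unbounded.

```python
import math
import random


def sample_latent_feature_network(n_nodes, n_features, feature_prob=0.3, seed=0):
    """Sample a toy network whose links depend on binary latent features.

    Hypothetical helper for illustration only: a finite latent feature
    model with a logistic link, not any specific model from the talk.
    """
    rng = random.Random(seed)
    # Z[i][k] = 1 if node i possesses latent feature k.
    Z = [[1 if rng.random() < feature_prob else 0 for _ in range(n_features)]
         for _ in range(n_nodes)]
    # W[k][l] = effect on the link score when one node has feature k
    # and the other has feature l (symmetric, assumed Gaussian).
    W = [[0.0] * n_features for _ in range(n_features)]
    for k in range(n_features):
        for l in range(k, n_features):
            W[k][l] = W[l][k] = rng.gauss(0.0, 2.0)
    bias = -1.0  # assumed baseline log-odds of a link
    P = [[0.0] * n_nodes for _ in range(n_nodes)]  # link probabilities
    A = [[0] * n_nodes for _ in range(n_nodes)]    # sampled adjacency matrix
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            score = bias + sum(Z[i][k] * W[k][l] * Z[j][l]
                               for k in range(n_features)
                               for l in range(n_features))
            p = 1.0 / (1.0 + math.exp(-score))
            P[i][j] = P[j][i] = p
            A[i][j] = A[j][i] = 1 if rng.random() < p else 0
    return Z, W, P, A


Z, W, P, A = sample_latent_feature_network(n_nodes=6, n_features=3)
```

Inference runs in the opposite direction: given only the adjacency matrix A, one would try to recover the features Z and weights W, for instance to predict missing links.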

2 May: Siegfried Hörmann (Université Libre de Bruxelles)

Dynamic functional principal components

Data in many fields of science are sampled from processes that can most naturally be described as functional. Examples include growth curves, temperature curves, curves of financial transaction data and patterns of pollution data. Functional data analysis (FDA) is concerned with the statistical analysis of such data. Since these are intrinsically infinite dimensional objects, tools for dimension reduction are desirable. Functional principal component analysis (FPCA) plays a leading role here. It is a key tool in many important empirical and theoretical problems.

A problem with classical FPCA is that it operates in a static way and does not take into account any possible serial dependence of the functional observations.  Such dependence occurs quite frequently, e.g. if the data consist of a continuous time process which has been cut into segments (e.g. days). Though cross-sectionally uncorrelated for a fixed observation, the classical FPC-score vectors have non-diagonal cross-correlations. This means that we cannot analyse them componentwise (as in the i.i.d. case), but need to consider them as vector time series, which are less easy to handle and interpret. In particular, a functional principal component with a small eigenvalue, and hence negligible instantaneous impact on some observation, may have a major impact on the lagged values. In a time series context, classical static FPCs therefore do not lead to an adequate dimension reduction technique, as they do in the i.i.d. case. This motivates the development of dynamic functional principal components. The idea is to transform the (possibly infinite dimensional) functional time series into a vector time series (of low dimension, 3 or 4, say), where the individual component processes are mutually uncorrelated and explain a larger part of the dynamics and variability of the original process.

In this talk we will propose such a dynamic version of FPCA and study its properties. An empirical analysis and a real data example will be given.

This talk is based on joint work with Łukasz Kidziński and Marc Hallin.
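
The abstract's remark that a component with a small eigenvalue can still matter for the lagged values can be seen in a finite-dimensional toy example. The sketch below is not the dynamic FPCA procedure from the talk. It assumes a two-dimensional series whose score processes are e_t and c * e_{t-1} for white noise e_t, so that the population covariance matrix is diagonal and the static principal components are the coordinate axes. The constant c, the sample size and the helper name `correlation` are choices made for this illustration.

```python
import math
import random


def correlation(x, y):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)
n = 5000
c = 0.1  # assumed scale: the second component has variance c**2 = 0.01
noise = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]

score1 = noise[1:]                    # first score process:  e_t
score2 = [c * e for e in noise[:-1]]  # second score process: c * e_{t-1}

# At lag 0 the two score series are (nearly) uncorrelated ...
lag0 = correlation(score1, score2)
# ... but score2 at time t is exactly c times score1 at time t - 1.
lag1 = correlation(score1[:-1], score2[1:])

print("lag-0 correlation:", round(lag0, 3))
print("lag-1 correlation:", round(lag1, 3))
```

A static analysis would discard the second component because it carries about 1% of the variance, even though it is perfectly predictable from the previous value of the first one.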

9 May: David Choi (Carnegie Mellon University)

Consistency of co-clustering exchangeable array data

We analyze the problem of partitioning a 0-1 array or bipartite graph into subgroups (also known as co-clustering), under a relatively mild assumption that the data is generated by a general nonparametric process. Our main application is the analysis of a simple clustering model for networks, the stochastic co-blockmodel, when the data is not assumed to be generated (even approximately) by the model. Our result suggests that the stochastic co-blockmodel and other community detection algorithms may be robust to model misspecification.  This is joint work with Patrick Wolfe (arXiv:1212.4093).

David Choi is an assistant professor at Carnegie Mellon University, in the Heinz College of public policy and information systems. His research focuses on theoretical statistics for social network data. David holds a PhD in electrical engineering from Stanford University.

16 May: Brendan Pass (University of Alberta)

Optimal transportation with infinitely many marginals

We formulate and study the problem of aligning a continuum of marginals as efficiently as possible. In our formulation, we look for the stochastic process with prescribed single time marginals which minimizes the expectation of a certain functional. This problem is a natural extension of a multi-marginal optimal transportation problem studied by Gangbo and Swiech (1998). In this talk, we prove existence, uniqueness and characterization results.
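
For orientation, the finite multi-marginal problem referred to above can be written as follows. This is a sketch of the standard formulation with quadratic cost, up to constant factors. The symbols m, d, mu_i and gamma are notation chosen here, and the exact functional used in the continuum setting of the talk is not given in the abstract.

```latex
% Multi-marginal optimal transport with quadratic cost (finite case):
% \Pi(\mu_1,\dots,\mu_m) is the set of probability measures on
% (\mathbb{R}^d)^m whose i-th marginal is \mu_i.
\[
  \min_{\gamma \in \Pi(\mu_1,\dots,\mu_m)}
  \int_{(\mathbb{R}^d)^m} \sum_{1 \le i < j \le m}
  \frac{|x_i - x_j|^2}{2} \, d\gamma(x_1,\dots,x_m)
\]
% Since \sum_{i<j} |x_i - x_j|^2 = m \sum_i |x_i|^2 - \bigl|\sum_i x_i\bigr|^2
% and the integrals of |x_i|^2 are fixed by the marginals, this is
% equivalent to maximising \int \bigl|\sum_i x_i\bigr|^2 \, d\gamma.
```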

26 June: Dimitris Fouskakis (National Technical University of Athens)



Page last modified on 08 oct 12 11:11