Statistical Science


Professor Sofia Olhede awarded £366k EPSRC grant to study modelling and inference for massive populations of heterogeneous point processes

22 September 2015

Professor Sofia Olhede and her co-investigators, Dr David Murrell and Dr James Martin, have been awarded the EPSRC grant EP/N007336/1 under the standard research scheme.

Handling large volumes of very heterogeneous data is increasingly necessary in most application domains. As a consequence, the past decade has seen considerable computational and theoretical developments that enable us to understand these new types of data. This field of mathematics is known as "high dimensional data analysis", where typically the models we need to understand are, at least superficially, as complex as the observations they represent. Theory has been developed for many types of models in this setting. An outstanding challenge is understanding observations that come in the form of the spatial locations of a number of points, or events, which may belong to a number of distinct groups. Such data are referred to as "point processes", where the points are exactly the locations of the objects of interest. Point processes are ubiquitous in applications, for example in ecology, seismology, and astronomy, and so new methods to understand such forms of data have a clear pathway to impact.

The challenge in the high dimensional setting for point processes is developing simple and flexible models that can be understood, and characterised, within realistic sampling scenarios. To enable the characterisation of observed data, the project will build new models by considering new forms of structure that the data can possess. To incorporate realistic features, we will build models with forms of scale-based heterogeneity that also include more complex spatial structure. For many realistic processes this includes strong forms of spatial anisotropy, namely patterns associated with given spatial directions. This project will develop such models, and the methods necessary to characterise the structure from data. Computational feasibility will be a strong constraint, as the number of spatial patterns that we will analyse simultaneously will place a clear computational burden on the analysis.
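As a rough illustration of what "patterns associated with given spatial directions" can mean in practice, the sketch below measures what fraction of nearby point pairs are aligned with a chosen direction. It is not the project's methodology: the function name `directional_fraction`, its parameters, and the toy pattern are all hypothetical, chosen only to make the idea concrete.

```python
import math

def directional_fraction(points, r, direction, half_width):
    """Fraction of point pairs at distance <= r whose connecting line lies
    within half_width (radians) of the axis given by direction (radians).
    Orientations are treated as axial, so opposite directions are equivalent."""
    in_sector = 0
    total = 0
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            dx = points[j][0] - points[i][0]
            dy = points[j][1] - points[i][1]
            if math.hypot(dx, dy) > r:
                continue
            total += 1
            angle = math.atan2(dy, dx)
            # Axial angular difference, folded into [0, pi/2].
            diff = abs((angle - direction + math.pi / 2) % math.pi - math.pi / 2)
            if diff <= half_width:
                in_sector += 1
    if total == 0:
        raise ValueError("no pairs within distance r")
    return in_sector / total

# Hypothetical toy pattern: points packed tightly along horizontal rows
# that are far apart, so every close pair is horizontally aligned.
rows = [(0.01 * i, 0.2 * j) for j in range(5) for i in range(50)]

horizontal = directional_fraction(rows, r=0.05, direction=0.0, half_width=math.pi / 4)
vertical = directional_fraction(rows, r=0.05, direction=math.pi / 2, half_width=math.pi / 4)
```

For a pattern with no preferred direction the two fractions would be roughly equal; here all close pairs fall in the horizontal sector. The double loop costs on the order of n² operations, which hints at why computational feasibility becomes a constraint for hundreds of thousands of points.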

The project will construct new methods to understand data collected in forest ecology. Here the data are locations of different tree species across time, and we consider a particularly high-dimensional, rich source of data that consists of over 275,000 individual trees, belonging to 312 different species. These data exhibit patterns of spatial aggregation and segregation associated with different spatial scales, but these patterns also show anisotropy associated with explanatory variables, which may be broadly classed as having a biotic or abiotic influence. Biotic factors, such as competition for the same nutrients, typically act independently of direction, whereas abiotic, or environmental, factors can have a rotationally asymmetric influence on plant dispersal. Abiotic features of the landscape, such as rivers, elevation and soil type, are normally found at large spatial scales relative to that of direct interaction between individuals. This means that whilst competition may lead to segregation of individuals at small scales, those individuals may still occur in the same areas of the landscape, giving an appearance of aggregation at larger scales.
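One classical way to quantify aggregation or segregation at a given scale is Ripley's K-function, which compares the number of neighbours within distance r of a typical point against the value πr² expected under complete spatial randomness. The sketch below is a naive estimate without edge correction, included only to illustrate the idea of scale-dependent structure. It is not claimed to be the method the project will use, and the toy patterns and variable names are hypothetical.

```python
import math

def ripley_k(points, r, area):
    """Naive estimate of Ripley's K at distance r (no edge correction).
    Values above pi * r**2 suggest aggregation at scale r; values below
    suggest segregation (regularity)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    close_pairs = 0
    for i in range(n):
        xi, yi = points[i]
        for j in range(i + 1, n):
            xj, yj = points[j]
            if math.hypot(xi - xj, yi - yj) <= r:
                close_pairs += 2  # count both ordered pairs (i, j) and (j, i)
    return area * close_pairs / (n * (n - 1))

# Hypothetical toy patterns on the unit square, 100 points each.
regular = [(0.05 + 0.1 * i, 0.05 + 0.1 * j) for i in range(10) for j in range(10)]
clustered = [(0.5 + 0.001 * i, 0.5 + 0.001 * j) for i in range(10) for j in range(10)]

r = 0.05
csr_reference = math.pi * r ** 2          # expected K under complete randomness
k_regular = ripley_k(regular, r, area=1.0)      # grid spacing exceeds r
k_clustered = ripley_k(clustered, r, area=1.0)  # every pair lies within r
```

Evaluating the same estimate over a range of r values is what reveals structure that differs between scales, for example segregation at small r alongside apparent aggregation at large r.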

The project will thus determine how best to model strongly heterogeneous multiscale structure in forest ecology and develop the mathematics necessary to quantify its form, which is not possible with current methodology. More broadly, this project will provide a flexible set of tools and a mathematical framework to understand highly heterogeneous and anisotropic classes of point processes.

The total amount awarded was £365,667, which will fund a postdoctoral researcher for 36 months.