Statistical models for periodic data in clinical and epidemiological child health studies

Supervisors: Dr Mario Cortina-Borja, Dr Marco Geraci, and Dr Angie Wade

Hypothesis: Methods for analysing seasonal patterns have an important role in paediatric research and practice, and these methods need to be developed further.

Aims and methods: Periodic (circular, seasonal, angular) statistical models concern variables whose values can be expressed as angles, i.e. as points on the unit circle (1). Cyclical variables such as dates or hours of the day, the main variables used to investigate seasonality, are therefore treated as angles on the circle or, for multivariate observations, as points in other geometric spaces such as the sphere or the torus. Because cyclical patterns are governed by the mathematical properties of periodic functions, the analytical methods required to summarise, visualise and model such data differ substantially from those used in linear (as opposed to circular) statistics.
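
To illustrate why linear summaries can mislead for cyclical variables, the following minimal Python sketch maps days of the year to angles and compares the arithmetic mean with the circular mean direction. The dates are hypothetical, and the function names and the 365.25-day period are assumptions made for illustration only; they are not part of the project's planned software.

```python
import math

PERIOD = 365.25  # assumed length of the yearly cycle, in days


def day_to_angle(day_of_year, period=PERIOD):
    """Map a day of the year (0 = 1 January) to an angle in [0, 2*pi)."""
    return 2.0 * math.pi * (day_of_year % period) / period


def circular_mean(angles):
    """Mean direction: the angle of the sum of the unit vectors."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c) % (2.0 * math.pi)


def mean_resultant_length(angles):
    """Concentration summary in [0, 1]; values near 1 mean tight clustering."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.hypot(s, c) / len(angles)


# Hypothetical event dates clustered around the New Year.
days = [355, 360, 364, 2, 6, 10]
angles = [day_to_angle(d) for d in days]

# The arithmetic mean ignores the wrap-around and lands in mid-year.
linear_mean_day = sum(days) / len(days)

# The circular mean respects the wrap-around and lands near 1 January.
circular_mean_day = circular_mean(angles) * PERIOD / (2.0 * math.pi)

print("linear mean day:  ", round(linear_mean_day, 1))
print("circular mean day:", round(circular_mean_day, 1))
print("mean resultant length:", round(mean_resultant_length(angles), 3))
```

The arithmetic mean places the centre of these winter dates in the middle of the year, whereas the circular mean stays close to the turn of the year, which is the behaviour a seasonal analysis needs.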

There is a need to extend modern statistical methods to the domain of circular statistics. For instance, there are no clear methodological guidelines for dealing with missing values, longitudinal designs, transformations, latent class mixtures, multivariate responses and regression diagnostics in a circular framework. This project will explore how to achieve this in the context of clinical and epidemiological child health studies. This research will have wide impact on children's health by facilitating the use of more appropriate and efficient models to analyse clinical and population-based data in which an angular component is of interest. This will apply both to existing datasets and those collected in future.

Angular measurements often occur in the post-surgical assessment of paediatric patients, for instance in the analysis of EEG power spectra in children with intracranial bleeding during invasive monitoring for epilepsy surgery. In epidemiology, establishing seasonal patterns that emerge only when adequate statistical models are used should lead to improvements in characterising the aetiology of complex diseases, designing better treatments and improving our understanding of the life course. Seasonal patterns in the presentation of a condition are often related to environmental factors, e.g. climatic (temperature, daily hours of sunlight, atmospheric pressure), locational (latitude, altitude above sea level) and social (holidays, social class). Examples are the increase in incidence of respiratory diseases in winter, the higher (lower) birth rate in late September (late December), and the higher incidence of sudden infant death syndrome in winter. Sometimes an underlying seasonal pattern plays an important role in the foetal origins of adult disease (2). These relations are not well understood, but there is increasing evidence of such effects on future health. Examples are differences in expected lifespan (larger if born in autumn, conditional on exceeding 50 years of age), and in rates of schizophrenia (higher for those born in winter; in the tropics linked to the rainy season) and suicide (higher among those born in spring). Analysing the health effects of biological mechanisms that are essentially governed by the seasons affects our perception of diseases and might eventually offer new strategies for treatment and prevention. There is also an important dimension of seasonality concerning hour, day and date of birth and their effect on neonatal mortality and morbidity (3). Another aspect of the project concerns the study of empirical birthday distributions from many countries. Finally, we will look in detail at circular patterns of physical activity using data from the Physical Activity survey of the Millennium Cohort Study (4).

The successful candidate will be based within the Centre for Paediatric Epidemiology and Biostatistics and will help to disseminate the applications of circular statistical methods among child health researchers and practitioners. S/he will develop three libraries of statistical procedures within the R environment for statistical computing. One will extend the gamlss class of statistical models (5) to fit a range of univariate circular probability distributions, e.g. von Mises, Kato-Jones, cardioid, wrapped Normal, wrapped Laplace and wrapped stable-type distributions. This library will also allow the fitting of finite mixture models and of truncated and censored data, and will provide goodness-of-fit procedures and model diagnostics. Another library will be concerned with fitting circular regression models for bivariate circular data using copula functions.
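
As a rough indication of what fitting one of these distributions involves, the sketch below fits a von Mises distribution by maximum likelihood to simulated angles. It is written in Python for brevity rather than in R, does not use or represent the gamlss framework, and all function names, the simulated data and the bound `kappa_max` are assumptions made for illustration. The estimate of the mean direction is the circular mean, and the concentration kappa solves I1(kappa)/I0(kappa) = mean resultant length, found here by bisection.

```python
import math
import random


def bessel_i(order, x, tol=1e-15):
    """Modified Bessel function I_0 or I_1 (order 0 or 1 only), x >= 0,
    summed from its power series."""
    half = x / 2.0
    term = 1.0 if order == 0 else half  # k = 0 term of the series
    total = term
    k = 0
    while True:
        k += 1
        term *= half * half / (k * (k + order))
        total += term
        if term <= tol * total:
            return total


def von_mises_pdf(theta, mu, kappa):
    """Von Mises density with mean direction mu and concentration kappa."""
    return math.exp(kappa * math.cos(theta - mu)) / (
        2.0 * math.pi * bessel_i(0, kappa)
    )


def fit_von_mises(angles, kappa_max=100.0):
    """Maximum likelihood estimates (mu_hat, kappa_hat).

    kappa_hat is capped at kappa_max (an arbitrary illustrative bound)
    when the data are extremely concentrated.
    """
    n = len(angles)
    c = sum(math.cos(a) for a in angles) / n
    s = sum(math.sin(a) for a in angles) / n
    mu_hat = math.atan2(s, c) % (2.0 * math.pi)
    r_bar = math.hypot(c, s)  # mean resultant length
    # I_1/I_0 increases with kappa, so bisection finds the root.
    lo, hi = 0.0, kappa_max
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if bessel_i(1, mid) / bessel_i(0, mid) < r_bar:
            lo = mid
        else:
            hi = mid
    return mu_hat, 0.5 * (lo + hi)


# Simulated angles with known parameters (mu = 1.0, kappa = 2.0).
random.seed(1)
sample = [random.vonmisesvariate(1.0, 2.0) for _ in range(5000)]
mu_hat, kappa_hat = fit_von_mises(sample)
print("estimated mu:", round(mu_hat, 3), "estimated kappa:", round(kappa_hat, 3))
```

A full library would add covariates on the distribution parameters, further distributions, and the diagnostics described above; this sketch covers only the intercept-only von Mises case.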

The project will also bridge population sciences and clinical science, as conclusions drawn from fitted circular statistical models may improve the understanding of aetiological aspects of disease, could offer new strategies for treatment and prevention, and could inform public health policy.

References:
1) Barnett AG; Dobson AJ (2010) Analysing Seasonal Health Data, New York: Springer
2) Doblhammer G (2004) The late life legacy of very early life. New York: Springer
3) Pasupathy D; Wood AM; Pell JP et al (2010) Time of birth and risk of neonatal death at term: retrospective cohort study. BMJ, 341.
4) Rich C; Geraci M; Griffiths L; Sera F; Dezateux C; Cortina-Borja M (in press) Quality control methods in accelerometer data processing: defining minimum wear time. PLoS One (accepted May 2013)
5) Rigby RA; Stasinopoulos DM (2005) Generalized Additive Models for Location, Scale and Shape (with discussion), Journal of the Royal Statistical Society, Series C, 54, 507-554