Orengo Group


Giorgos Baskozos

Bioinformatics Approaches To Identify Pain Associated Genes and Non-Coding RNAs


In this project we work closely with experimental scientists of the London Pain Consortium (LPC) to build a bioinformatics pipeline for analysing next-generation sequencing datasets generated from animal tissue under neuropathic pain models. We analyse RNA-seq data to detect differentially expressed genes, splice variants and non-coding RNAs that occur in pain states and may contribute to neuropathic pain. We will also develop novel protocols for the statistical modelling and analysis of next-generation sequencing data.


Profiling technologies have been used extensively in pain research to identify differentially expressed (DE) genes in well-characterised induced pain states, usually in animal models. In addition, advances in next-generation sequencing, and RNA-sequencing in particular, have made it possible to reconstruct the whole transcriptome of animals subjected to specific pain models. We are therefore now able to detect changes in expression in both annotated and unannotated genomic regions and to identify non-coding RNAs and novel exons that may contribute to neuropathic pain.

Moreover, deep sequencing could allow us to identify splice variants that occur in pain states. Our primary aim is to identify novel pain-related genes, novel exons, long intergenic non-coding RNAs and antisense RNAs that contribute to neuropathic pain, and to analyse their biological pathways using functional genomics approaches. The project involves analysing next-generation sequencing data from rats that underwent the Spinal Nerve Transection model of neuropathic pain. We have developed a pipeline, mainly in the statistical programming environment R, and are currently applying it to datasets from the LPC labs of Professors David Bennett (Oxford University) and Stephen McMahon (King's College London). We then proceed to network analysis in order to reveal genes, functional modules and non-coding RNAs that are involved in neuropathic pain. Finally, we will integrate these functional genomics data with PainNetworks (http://www.painnetworks.org), developed by the Orengo Group, to make the data publicly available.

Our previous analyses have led us to conclude that the statistical modelling of RNA-seq data is still an open field with unresolved challenges. Most DE analysis methods make sensible but not very realistic assumptions about the distribution of RNA-seq data, whereas the biological variability of gene expression is much more complex. Furthermore, most DE analysis approaches rely on classical hypothesis testing. A Bayesian non-parametric analysis, by contrast, does not assume a fixed distribution for the underlying data but tries to infer it, and it gives a much more natural interpretation of the actual experiment. We will therefore develop a Bayesian non-parametric approach to analyse high-throughput count-based data.
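
To illustrate what a distributional assumption means for count data, the sketch below estimates how far a gene's replicate counts depart from a Poisson model, where the variance equals the mean. It is not part of the pipeline described here. The text above does not name the distribution assumed by existing methods; the negative binomial form Var = mu + phi * mu^2 is used as an assumption because it is common in count-based DE tools. The function name and the read counts are hypothetical.

```python
from statistics import mean, variance


def nb_dispersion_mom(counts):
    """Method-of-moments estimate of the dispersion phi.

    Assumes a negative binomial mean-variance relation,
    Var = mu + phi * mu**2. phi = 0 is the Poisson case.
    Requires at least two replicate counts.
    """
    mu = mean(counts)
    if mu == 0:
        return 0.0
    var = variance(counts)  # unbiased sample variance
    # Clip at zero: counts less variable than Poisson give phi = 0.
    return max(0.0, (var - mu) / mu ** 2)


# Hypothetical read counts for two genes across five replicates
# (made-up numbers, for illustration only).
stable_gene = [10, 12, 8, 11, 9]
variable_gene = [2, 30, 5, 50, 13]

print(nb_dispersion_mom(stable_gene))
print(nb_dispersion_mom(variable_gene))
```

For these made-up counts the first gene gives an estimate of 0 (no more variable than a Poisson model allows), while the second gives about 0.95, far more variability than a Poisson model can describe. A single parametric family still fixes the shape of the variability in advance, which is the limitation a non-parametric approach aims to relax.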

Research Area: Functional Genomics