UCL Cancer Institute

Analysis Pipelines



Project Leaders: Dr Tiffany Morris & Dr Emanuele Libertini



We utilize and develop computational tools for the analysis of biological data, primarily obtained from studies of DNA methylation. The data predominantly comes from either second generation sequencing platforms or methylation arrays.







DNA Sequence Analysis


Our focus is on the analysis of MeDIP-seq, though we are also experienced in the analysis of RNA-seq, ChIP-seq, bis-seq and exome sequencing.

Originally, the Batman algorithm (Down et al., 2008) was used for our MeDIP analyses, including the first cancer methylome (Feber et al., 2011). Since then we have developed the MeDUSA pipeline (Wilson et al., 2012), whose focus is to locate differentially methylated regions (DMRs) between cohorts.

MeDUSA (Methylated DNA Utility for Sequence Analysis) brings together numerous software packages to perform a full analysis of MeDIP-seq data, including sequence alignment, quality control (QC), and determination and annotation of DMRs. MeDUSA uses several applications from the USeq software suite, which in turn uses the R Bioconductor package DESeq for differential count analysis. MeDUSA also controls the other main steps: alignment (BWA), subsequent filtering (SAMtools), generation of quality control metrics (FastQC and MEDIPS) and preliminary annotation of the DMRs (using BEDTools). MeDUSA is available for download.
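As a rough illustration of what a differential count comparison between two samples involves, the sketch below bins read start positions into fixed-width windows and reports a library-size-normalised log2 fold change per window. It is not MeDUSA, USeq or DESeq code: the function names, the 500 bp window, the pseudocount and the toy read positions are all assumptions made for the example, and a plain fold change stands in for the statistical model that DESeq fits.

```python
from collections import Counter
from math import log2


def window_counts(read_starts, window=500):
    """Count reads per fixed-width window, keyed by window start (hypothetical helper)."""
    return Counter((pos // window) * window for pos in read_starts)


def log2_fold_changes(counts_a, counts_b, pseudocount=1.0):
    """Return log2(B/A) per window after scaling by each sample's total read count.

    The pseudocount avoids division by zero for empty windows; it is an
    illustrative choice, not a recommendation.
    """
    total_a = sum(counts_a.values()) or 1
    total_b = sum(counts_b.values()) or 1
    result = {}
    for start in sorted(set(counts_a) | set(counts_b)):
        a = (counts_a.get(start, 0) + pseudocount) / total_a
        b = (counts_b.get(start, 0) + pseudocount) / total_b
        result[start] = log2(b / a)
    return result


# Toy read start positions on a single chromosome (made-up numbers).
control = [10, 120, 480, 510, 530, 1200]
tumour = [15, 505, 520, 540, 560, 590, 1210]

changes = log2_fold_changes(window_counts(control), window_counts(tumour))
for start, value in changes.items():
    print(f"window {start}-{start + 499}: log2 fold change {value:+.2f}")
```

In this toy data the window starting at 500 is enriched in the tumour sample and the window starting at 0 is depleted. A real analysis would use replicates and a count-based test rather than a single ratio.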

A focus for future research within the group is the integration of disparate datasets to gain a fuller understanding of the underlying biology and thus address fundamental questions associated with epigenetic regulation of mammalian cells.




Illumina Infinium 450k Array Analysis


Another focus of our group is the analysis of Illumina's Infinium HumanMethylation450 BeadChip. This platform was designed with two different probe types (Type 1 and Type 2). Technical differences have been shown to exist between the two probe types, and normalisation methods that adjust for this Type 2 bias have been developed (Dedeurwaerder et al., 2011; Maksimovic et al., 2012; Teschendorff et al., 2013). In addition, careful study design is important to avoid issues with batch effects.
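To make the idea of a probe-type adjustment concrete, the sketch below compares the spread of methylation values from the two probe types and linearly stretches the Type 2 values so that their 5th to 95th percentile range matches that of the Type 1 values. This is a deliberately naive stand-in and not any of the published methods cited above; the function names, the percentile choice and the toy values are assumptions for illustration only.

```python
from statistics import quantiles


def percentile_range(values, low=5, high=95):
    """Return the (low, high) percentile cut points of a list of methylation values."""
    cuts = quantiles(values, n=100)  # 99 cut points; cuts[k - 1] is the k-th percentile
    return cuts[low - 1], cuts[high - 1]


def rescale_to_reference(values, reference):
    """Linearly map `values` so their percentile range matches that of `reference`.

    Results are clipped to [0, 1], the valid range for methylation fractions.
    """
    lo_v, hi_v = percentile_range(values)
    lo_r, hi_r = percentile_range(reference)
    scale = (hi_r - lo_r) / (hi_v - lo_v)
    return [min(1.0, max(0.0, lo_r + (v - lo_v) * scale)) for v in values]


# Toy methylation fractions (made-up numbers): the Type 2 values span a narrower range.
type1_betas = [0.02, 0.05, 0.10, 0.50, 0.90, 0.95, 0.98]
type2_betas = [0.10, 0.12, 0.20, 0.50, 0.80, 0.85, 0.88]

adjusted = rescale_to_reference(type2_betas, type1_betas)
print("Type 2 range before:", min(type2_betas), max(type2_betas))
print("Type 2 range after: ", round(min(adjusted), 3), round(max(adjusted), 3))
```

The published methods work on the shape of the distributions rather than on two percentiles, so this sketch should be read only as a picture of the problem being corrected.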

Our group has assembled ChAMP, the Chip Analysis Methylation Pipeline (Morris et al., 2013), which incorporates available tools for data upload, quality control and normalisation. In addition, it offers novel methods including: SVD (Singular Value Decomposition) for visualising the largest components of biological and technical variation to enable the identification of batch effects (Teschendorff et al., 2011); Probe Lasso DMR Hunter, which is based on a feature-oriented dynamic window ("lasso") that aims to capture neighbouring significant probes and bundle them into DMRs (Butcher, unpublished); and a CNA analysis module, which takes raw intensity data, normalises it and corrects for batch effects related to BeadChip, smooths and segments the data, and returns a data frame of segments that can be filtered to find significant gains and losses (Feber et al., 2014).
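The general idea of bundling neighbouring significant probes into regions can be sketched as follows. This is not the Probe Lasso algorithm, which is unpublished and uses a feature-oriented dynamic window; the sketch assumes a single fixed maximum gap instead, and the function name, thresholds and probe data are hypothetical.

```python
def bundle_probes(probes, max_gap=1000, min_probes=3, alpha=0.05):
    """Group significant probes lying within `max_gap` bp of each other.

    probes: iterable of (chromosome, position, p_value) tuples.
    Returns a list of (chromosome, start, end, n_probes) regions that
    contain at least `min_probes` significant probes.
    """
    significant = sorted((chrom, pos) for chrom, pos, p in probes if p < alpha)
    regions = []
    current = []

    def flush():
        # Keep the current run only if it holds enough probes.
        if len(current) >= min_probes:
            regions.append((current[0][0], current[0][1], current[-1][1], len(current)))

    for chrom, pos in significant:
        # Start a new run on a chromosome change or when the gap is too large.
        if current and (chrom != current[-1][0] or pos - current[-1][1] > max_gap):
            flush()
            current = []
        current.append((chrom, pos))
    flush()
    return regions


# Toy probe-level results (made-up positions and p-values).
probes = [
    ("chr1", 1000, 0.001),
    ("chr1", 1300, 0.010),
    ("chr1", 1600, 0.040),
    ("chr1", 1800, 0.500),  # not significant
    ("chr1", 9000, 0.001),  # significant but isolated
    ("chr2", 500, 0.002),
    ("chr2", 700, 0.003),   # only two probes in this run
]

regions = bundle_probes(probes)
print(regions)
```

A feature-oriented window would replace the fixed `max_gap` with a value that depends on the genomic feature each probe sits in; that mapping is left out here because the source does not describe it.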


Illumina Infinium 450k Analysis Workshop London


To aid the discussion of analysis issues and ideas related to the Illumina 450k array, we have hosted two workshops in London. In 2012, the workshop was held at UCL and focused on normalising for Type 2 bias (Morris and Lowe, 2012). In 2013, we hosted a second workshop at QMUL, where the topics had moved beyond normalisation to established pipelines and options for downstream analysis (Lowe and Morris, 2013). Registration for the 3rd workshop on 8-9 May 2014 is now open. Registration and information on previous workshops, including links to slides from the first workshop and videos from the second, are available at http://450kworkshop.info.



References


1. Down TA, Rakyan VK, Turner DJ, Flicek P, Li H, Kulesha E, Graf S, Johnson N, Herrero J, Tomazou EM, et al: A Bayesian deconvolution strategy for immunoprecipitation-based DNA methylome analysis. Nat Biotechnol 2008, 26:779-785.
2. Feber A, Wilson GA, Zhang L, Presneau N, Idowu B, Down TA, Rakyan VK, Noon LA, Lloyd AC, Stupka E, et al: Comparative methylome analysis of benign and malignant peripheral nerve sheath tumors. Genome Res 2011, 21:515-524.
3. Wilson GA, Dhami P, Feber A, Cortazar D, Suzuki Y, Schulz R, Schär P, Beck S: Resources for methylome analysis suitable for gene knockout studies of potential epigenome modifiers. GigaScience 2012, 1:3.
4. Dedeurwaerder S, Defrance M, Calonne E, Denis H, Sotiriou C, Fuks F: Evaluation of the Infinium Methylation 450K technology. Epigenomics 2011, 3:771-784.
5. Maksimovic J, Gordon L, Oshlack A: Subset quantile within-array normalization for Illumina Infinium HumanMethylation450 BeadChips, in review.
6. Teschendorff AE et al: A beta-mixture quantile normalization method for correcting probe design bias in Illumina Infinium 450k DNA methylation data. Bioinformatics 2013, 29:189-196.
7. Teschendorff AE et al: Independent surrogate variable analysis to deconvolve confounding factors in large-scale microarray profiling studies. Bioinformatics 2011, 27:1496-1505.
8. Morris T, Butcher L, Feber A, Teschendorff A, Chakravarthy AR, Wojdacz TK, Beck S: ChAMP: 450k Chip Analysis Methylation Pipeline. Bioinformatics 2013.
9. Feber A et al: CNA profiling using high density DNA methylation arrays. Genome Biology 2014.
10. Morris T and Lowe R: Report on the Infinium 450k methylation array analysis workshop: April 20, 2012 UCL, London, UK. Epigenetics 2012 Aug;7(8).
11. Lowe R and Morris T: Report on the 2nd Annual Infinium HumanMethylation450 Array Workshop 15 April 2013 QMUL London UK. Epigenetics 2013 Aug 15; 8(10).




 

Project Leaders


Tiffany Morris, PhD
Medical Genomics
UCL Cancer Institute
University College London
Paul O'Gorman Building
72 Huntley Street
London WC1E 6BT, UK
Tel: +44-20-7679-0999
tiffany.morris@ucl.ac.uk




Emanuele Libertini, PhD
Medical Genomics
UCL Cancer Institute
University College London
Paul O'Gorman Building
72 Huntley Street
London WC1E 6BT, UK
Tel: +44-20-7679-0999
emanuele.libertini@ucl.ac.uk