Lingjian Yang received his MSc in Chemical Process Engineering from the Department of Chemical Engineering, UCL, in 2011. His MSc research project focused on optimising the scheduling of power generation. In November 2011, he started his PhD on data mining using high-throughput genomic data.
Research project
Title: Disease classification using high-throughput genomic data
Microarray profiling technology enables the simultaneous examination of the expression levels of thousands of genes for each patient on a single chip, and a cohort study typically contains multiple patients. The millions of data points generated per cohort study carry rich information for studying complex diseases, such as cancer and psoriasis, at the gene level.
Classification techniques have been widely applied to gene expression data to identify the dependence between gene expression and the clinical outcomes of interest, with genes as the features, patients as the samples and the patients' clinical outcomes as the class labels. The major difficulty in classifying gene expression data lies in the inherent "large p small n" nature of high-throughput genomic data: the number of samples is usually two orders of magnitude smaller than the number of genes in a single transcriptomic profile, which makes it hard to extract reliable information.
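The "large p small n" difficulty can be illustrated with a minimal sketch on synthetic data. Everything in it is an assumption made for illustration, not the method used in this project: the sizes (20 patients, 2000 genes), the random "expression" values, the labels drawn independently of them, and the helper names `fit_min_norm_linear` and `accuracy`. Because there are far more genes than patients, a plain linear fit can reproduce the training labels exactly even though they carry no signal, while its predictions on new patients stay near chance.

```python
import numpy as np


def fit_min_norm_linear(X, y):
    """Return the minimum-norm least-squares weights for X w ~ y."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def accuracy(X, y, w):
    """Fraction of samples whose predicted sign matches the label."""
    return float(np.mean(np.sign(X @ w) == y))


rng = np.random.default_rng(0)
n_patients, n_genes = 20, 2000  # hypothetical sizes: p is 100 times n

# Synthetic "expression" values (rows = patients, columns = genes) and
# class labels that are drawn independently of them, i.e. pure noise.
X_train = rng.standard_normal((n_patients, n_genes))
y_train = rng.choice([-1.0, 1.0], size=n_patients)
X_test = rng.standard_normal((200, n_genes))
y_test = rng.choice([-1.0, 1.0], size=200)

w = fit_min_norm_linear(X_train, y_train)
train_acc = accuracy(X_train, y_train, w)
test_acc = accuracy(X_test, y_test, w)
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

A perfect training score here reflects only the excess of features over samples, which is why gene expression classifiers are usually judged on held-out patients or by cross-validation rather than on the data used to fit them.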
In this project, I aim to integrate biological knowledge as a prior, in the form of either biochemical pathways or protein interaction networks, into genomic data, and to propose biomarkers as functional gene sets that have good diagnostic or prognostic value.
Education
MSc in Chemical Process Engineering, UCL, 2011
PhD in Chemical Engineering, UCL, 2011-present