At the intersection of clinical research and computer science, the Data Lab seeks to develop data-driven computational methods and tools for exploiting electronic health records for biomedical research. The Lab is involved in the development of novel methods, tools and scripts, and preparation of data from electronic health records for research.
The Data Lab team, consisting of experienced data scientists and statisticians, has extensive expertise in the curation and analysis of large, linked health and administrative datasets. The team is involved across all stages of a project, from providing initial advice and developing disease definitions and classifications through to extracting data and releasing it onto the data safe haven.
Head of the Data Lab team
Currently leading the data workflow in the CALIBER research platform.
With a background in Applied Statistics, Epidemiology and Public Health (MSc) and Biostatistics (PhD), his areas of expertise range from disease epidemiology and healthcare utilisation in observational studies, to data methods for clinical phenotyping using electronic health data at national and international levels.
Michalis is a medical statistician working as a senior research fellow at the Institute of Health Informatics at UCL.
He holds a BHF Immediate Postdoctoral Basic Science Research Fellowship (Feb 2018 - Jan 2022) entitled “Weight change and the onset and progression of cardiovascular diseases in large scale electronic health records”.
He is particularly interested in causal inference. His intention is to bring together the worlds of epidemiology and statistical methodology and highlight the potential benefit for public health from the appropriate analysis of observational data.
Ana has a PhD in Cognitive Psychology and worked as a Marie Curie Research Fellow in Cognitive Neuroscience at UCL. She has an MSc in Data Science for Research in Health and Biomedicine and is interested in phenotyping methods using linked electronic health records. She is a Research Fellow in Health Data Science at the Institute of Health Informatics at UCL, where she is working on an industry-funded project to create and develop phenotype algorithms in the UK Biobank.
Arfeen works as a Data Manager for IHI-UCL. He is responsible for coordinating databases and for storing, organizing, securing, and accessing information. Arfeen is passionate about making data management more efficient and effective. His role is to manage the ongoing operation of the EHR databases and to analyze large volumes of clinical data to identify trends and assess the quality of the data submitted. He mostly collaborates with a support team and clinical research groups to ensure data integrity and data exchange. Arfeen is highly proficient in the use of computer programs and applications to make raw data more useful to the department and research group.
Cai holds an MSc in Health Data Science from UCL and is building on her dissertation work, which analyzed prescribing trends after COVID infection in the UK Biobank. As a research associate, she contributes to the development and analysis of phenotyping algorithms in the UK Biobank and beyond.
She is also an expert in geochemical quantification with laser-induced breakdown spectroscopy (LIBS) and works as a spectroscopy data science consultant in industry, in academia, and for national planetary organizations.
Vaclav has worked in the fields of health and neuro informatics for more than seven years and has spent the last four years as a research associate at the Institute of Health Informatics at UCL. Previously, he worked as a junior researcher at the Department of Computer Science and Engineering of the University of West Bohemia in Pilsen, Czech Republic, where he obtained a PhD with a thesis on an archetype-based approach for modelling electroencephalographic/event-related potentials data and metadata.
With a professional background as a computer scientist and software engineer, he is primarily interested in data models, database technologies (relational as well as non-relational) and linked data / semantic web.
Project management and facilitation
Natalie has more than 20 years’ experience managing large research programmes involving linked electronic health record (EHR) data. She is responsible for facilitating research collaborations for CALIBER including governance and access to data. Natalie co-leads the UCL Institute of Health Informatics (IHI) Phenomics Group and is programme manager for the Health Data Research (HDR) UK Phenomics Implementation Projects to develop the HDR UK CALIBER Phenotype Portal, an open resource for EHR users to share their methods and tools, and build the UK's natural language processing (NLP)
Vauvelle is a PhD student on the AI Enabled Healthcare Systems CDT and is sponsored by BenevolentAI. After working with the NHS and healthcare data through startups and consultancy, he found many healthcare problems that could benefit from further academic study. His research focuses on developing new machine learning methods and tools for computational phenotyping with structured EHR data.
Albert Henry is a PhD student registered with the BHF 4-year PhD in Cardiovascular Biomedicine programme at UCL Institute of Cardiovascular Science. He is a fully trained clinician (general practice) from Indonesia and holds an MSc degree in Health and Biomedical Data Science from UCL Institute of Health Informatics. His current PhD research focuses on studying the genetics of heart failure and heart failure subtypes using large-scale genomic, molecular profile, clinical assessment, and electronic health record data. Outside research, he co-leads the IHI Code Club, a volunteer-led initiative aiming to promote reproducibility and good coding practice across research communities.
Nonie is a PhD student working on clustering in EHR to find hidden subtypes of heterogeneous diseases. She has also worked on projects looking into unfair bias in health data.
Maria is a Data Scientist by training. She enjoys working in the multidisciplinary IHI environment with clinicians, epidemiologists, geneticists, and statisticians, applying traditional as well as machine learning methods to tackle research questions using electronic health records.
In 2018, Maria was awarded the Joseph Footit British Lung Foundation Grant for COPD research. Using data and tools from the CALIBER resource, she is currently investigating airway disease subtypes, with the aim of improving the quality of care and personalising it for those living with COPD, asthma and bronchiectasis.
Maria has now taken a break from full-time research to pursue a graduate entry medicine course.
Ghazaleh is a genetic epidemiologist, working at the Institute of Health Informatics. She holds an American Heart Association Research Fellowship in Health Data Science. Her fellowship is focused on using machine learning algorithms to identify and validate clinically meaningful subtypes of heart failure using linked EHR and genetics data in the UK Biobank.
Cécile is a Research Administrator on secondment at the Institute of Informatics. She was previously the administrator at the Centre for Critical Heritage Studies at the Institute of Archaeology and also worked at the Thomas Coram Research Unit (Institute of Education), supporting the ERC-funded Families and Food in Hard Times study. Prior to joining UCL, Cécile supported FP7-EC funded projects at City, University of London and LSHTM.