UCL Department of Biochemical Engineering


Introduction to Data Science and Python for Biomanufacturing

Learn data analytics tools, supported by multivariate data analysis (MVDA) and machine learning (ML) expertise, to extract useful and actionable information from complex bioprocessing data sets.

The course will provide an introduction to the most important statistics and data exploration tools necessary to evaluate challenging bioprocess data sets. In addition, it will help attendees decide which MVDA or ML algorithm to implement to better understand the cause-and-effect relationships within their data, and how to apply these tools to generate predictions of key process parameters.

This course will provide the delegate with the following skillset:

  • Practical data analytic tools in Python: Learn the basics of Python by analysing real-world data sets
  • Multivariate data analysis (MVDA) and machine learning (ML) expertise: Build and validate advanced process models on challenging bioprocessing data sets
  • “Big Data” analytics: How to deal with large manufacturing data sets and automate the model building process
  • Bioprocessing expertise and knowledge: Learn the most important statistics and data exploration tools necessary to interpret challenging biologic data sets

The aim of the course is to provide all attendees with the skill set needed to extract useful and actionable information from complicated bioprocessing data sets.

The course will use a wide range of real-life industrial data sets collected from both upstream and downstream operations and will demonstrate all the necessary data importation, pre-processing, visualisation and analysis steps to better inform bioprocess monitoring and control operations.

This MBI is recommended for:

Anyone who is keen to better understand their data, minimise repetitive visualisation and analysis tasks and learn the key fundamentals of data science. The course will be of great benefit to:  

  • Scientists
  • Engineers
  • Project managers
  • Anyone who currently analyses and visualises their data in Excel

Module Leader

Stephen Goldrick is a lecturer in Digital Bioprocess Engineering specialising in the application of advanced data analytics and mathematical modelling to the biotechnology sector.

He graduated with a BEng in Chemical and Bioprocess Engineering from University College Dublin in 2008 and continued his studies there, obtaining an MEngSc related to nanoparticle synthesis in 2010 as part of a joint venture between Rice University and University College Dublin. He completed his EngD, titled “Application of Multivariate Data Analysis and First Principle Mathematical Modelling to the Biotechnology sector”, awarded by Newcastle University in 2014. He subsequently worked as a PDRA in the UCL Biochemical Engineering department as part of the UCL-AstraZeneca Centre of Excellence, applying data analytics to extract useful insights from large, complex biomanufacturing data sets. He became a lecturer in the UCL Biochemical Engineering department in 2020, with a continued focus on Big Data analytics and machine learning applications in the biotechnology sector.

Programme Outline

This MBI will be supported by multiple industrial and academic speakers and will include expert lectures on the following topics:

  • Process Analytical Technology within USP – Opportunities and challenges
  • Missing data – best practices and common pitfalls 
  • Modelling chromatography columns – Key control objectives
  • Root cause analysis (RCA) – Exploratory analysis of key influential variables
  • Cell line optimisation – Application of artificial intelligence (AI) to automate selection of lead clones