Bootcamp: Data Science for Non-Data Scientists
Join Gavin Chait for a week-long, hands-on workshop introducing PhD students, post-doctoral researchers and all academics to core skills in data ethics, curation, analysis and presentation.

Booking Information
Spaces are limited and early registration is advised. Participants must commit to attending all five days of the bootcamp.
Data has become the most important language of our era, informing everything from intelligence in automated machines to predictive analytics in medical diagnostics. The plunging cost and easy accessibility of the raw requirements for such systems – data, software, distributed computing, and sensors – are driving the adoption and growth of data-driven decision-making.
However, the fundamental skills required to lead algorithmic decision-making are not universally taught, and are often surrounded by unnecessary complexity.
As it becomes ever easier to collect data about individuals, a diverse range of professionals who have never been trained for such requirements grapple with inadequate analytic and data management skills, as well as the ethical risks arising from personal data possession and opaque algorithmic tools.
The key to unlocking data reuse, and the new economic and social development opportunities arising from these data, relies on both data producers and data users having the technical insight necessary to manage those who work with data, and a conscious and motivated understanding of the new algorithmic tools available to us.
This bootcamp is open to all UCL PhD students, post-doctoral researchers and academics.
Workshop Structure
Data Science for Non-Data Scientists guides learners to confidence in the curation, ethics, analysis and presentation of data. Each day of the five-day course is an individual lesson guided by the following four topics:
- Ethics: determine the social and behavioural challenges posed by a research question.
- Curation: establish the research requirements for data collection and management.
- Analysis: investigate, explore and analyse research data.
- Presentation: prepare and present the results of analysis to promote a response.
Learning Outcomes
Each lesson will guide participants through the review of a question that requires a time-constrained response and involves multiple competing ethical, technical and management considerations. Each day will conclude with teams competing to persuade the class of the conclusions they have reached.
- Identify concepts in ethical reasoning which may influence our analysis and results from data.
- Understand the process of data curation, and the custodial duty of data science.
- Investigate and review data to learn its metadata, shape and robustness.
- Identify an appropriate chart and present data to illustrate its core characteristics.
- Recognise the importance of, and the process for, applying concepts of privacy and anonymity.
- Integrate methods for metadata and archival into data management.
- Investigate data distribution and confidence.
- Illustrate core analysis with histograms and box plots.
- Determine the implications of the collection, mining and recombination of open and digital data.
- Employ methods for presenting data for synthesis and use, and for maintaining data.
- Assess techniques in randomness and probability to understand distribution and likelihood.
- Investigate histograms, line charts and scatter plots to illustrate probability.
- Acknowledge the privacy and confidentiality issues in data storage and security of personal data.
- Recognise responsibilities and mechanisms for securing data-at-rest and data-in-motion.
- Consider linear and continuous sampling methods to assess normal distributions.
- Present distributions as normal histograms and continuous curves.
- Integrate the lessons learned in a live simulation to persuade others to action.
Further information
Ticketing: Open
Cost: Free
Open to: UCL staff
Availability: Yes