
UCL Centre for Digital Humanities

Automated transcription of historical documents

14 February 2025, 2:00 pm–4:00 pm

This interactive workshop introduces how to apply OCR and HTR technology in Digital Humanities projects and in the cultural heritage sector more widely.

Event Information

Open to

All

Availability

Yes

Organiser

Marco Humbel, UCLDH Associate Director (ECR)

Location

t.b.c.
Marshgate Building, 7 Sidings St,
Stratford, London
E20 2AE

Digitisation efforts within the cultural heritage sector have made an abundance of historical book and manuscript collections available online. Optical Character Recognition (OCR) and Handwritten Text Recognition (HTR) are fundamental techniques for making old scripts readable for broad audiences and for opening textual collections up to computational analysis, such as text mining or network analysis.

This interactive workshop introduces how to apply OCR and HTR technology in Digital Humanities projects and in the cultural heritage sector more widely. It uses the software Transkribus as a case study and is suitable for people without prior experience of automated transcription. The two-hour workshop will cover:

-    How OCR and HTR technologies work
-    Common tools for automated transcription
-    Quality assessment
-    Introduction to Transkribus

Attendees should have basic computer skills, and we encourage you to bring your own digitised textual documents to the workshop for automated transcription. Please bring your own laptop and ensure you have access to UCL Wi-Fi (https://www.ucl.ac.uk/isd/services/get-connected/wi-fi/uclguest).

Participation is free but registration is required: https://www.eventbrite.co.uk/e/1144065262999

The workshop is facilitated by Dr Marco Humbel (UCLDH/TU Darmstadt) and Dr Alicia Hughes (The British Museum).

Image credit: Example of the transcription process in one crowdsourced cultural heritage project. A user transcribes an early modern letter. Shakespeare’s World