Automated transcription of historical documents
14 February 2025, 2:00 pm–4:00 pm
This interactive workshop introduces how to apply OCR and HTR technology in Digital Humanities projects and in the cultural heritage sector more widely.
Event Information
Open to
- All
Availability
- Yes
Organiser
- Marco Humbel, UCLDH Associate Director (ECR)
Location
- t.b.c., Marshgate Building, 7 Sidings St, Stratford, London E20 2AE
Digitisation efforts within the cultural heritage sector have made an abundance of historical book and manuscript collections available online. Optical Character Recognition (OCR) and Handwritten Text Recognition (HTR) are fundamental techniques for making old scripts readable to broad audiences and for opening textual collections up to computational analysis, such as text mining or network analysis.
The workshop uses the software Transkribus as a case study and is suitable for people without prior experience of automated transcription. The two-hour workshop will cover:
- How OCR and HTR technology works
- Common tools for automated transcriptions
- Quality assessment
- Introduction to Transkribus
Attendees should have basic computer skills, and we encourage you to bring your own digitised textual documents to the workshop for automated transcription. Please bring your own laptop and ensure you have access to UCL Wi-Fi (https://www.ucl.ac.uk/isd/services/get-connected/wi-fi/uclguest).
Participation is free but registration is required: https://www.eventbrite.co.uk/e/1144065262999
The workshop is facilitated by Dr Marco Humbel (UCLDH/TU Darmstadt) and Dr Alicia Hughes (The British Museum).