Visualizing the Transcribe Bentham corpus
06 December 2016, 5:30 pm–6:30 pm
Event Information
Open to
- All
Organiser
- UCLDH
Location
- UCL Centre for Digital Humanities, Gower Street, London WC1E 6BT, United Kingdom
How can we gain an overview of the 17,000 pages of Bentham's manuscripts made available by Transcribe Bentham? Methods that provide such an overview may help domain experts find the areas of the corpus relevant to their research. In this work we have applied computational techniques to visualize the corpus, providing a general view of its content.
First, a lexical extraction was performed to choose the terms used to model the corpus. Then, term clusters were created based on the similarity between the terms' contexts, and visualized as corpus maps. The maps provide an overview of the corpus as a whole, as well as of the terms that are more prominent in different periods of the corpus. The issue of evaluating these corpus maps will also be discussed.
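As a rough illustration of the clustering step described above, the sketch below groups terms whose surrounding words are similar. It is not the speaker's method: the co-occurrence window, the cosine measure, the threshold-based grouping, the function names, and the toy sentences and term list are all assumptions made for this example. It does not cover the lexical extraction or the drawing of the maps.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences, terms, window=2, stopwords=frozenset()):
    """Count the words occurring within `window` tokens of each term."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token not in terms:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i and tokens[j] not in stopwords:
                    vectors[token][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_terms(vectors, threshold=0.5):
    """Merge terms whose context similarity reaches `threshold` (single link)."""
    terms = sorted(vectors)
    parent = {t: t for t in terms}

    def find(t):
        # Follow parent links to the representative of t's group.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for i, a in enumerate(terms):
        for b in terms[i + 1:]:
            if cosine(vectors[a], vectors[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = defaultdict(list)
    for t in terms:
        groups[find(t)].append(t)
    return sorted(groups.values())


# Toy data invented for this example; not taken from the Bentham corpus.
sentences = [
    "the law punishes the offender",
    "the statute punishes the offender",
    "the pleasure outweighs the pain",
    "the happiness outweighs the pain",
]
terms = {"law", "statute", "pleasure", "happiness"}

vectors = context_vectors(sentences, terms, stopwords={"the"})
clusters = cluster_terms(vectors)
print(clusters)  # [['happiness', 'pleasure'], ['law', 'statute']]
```

On a real corpus the contexts would be far larger and noisier, so the window size, the stopword list and the similarity threshold would all need tuning.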
All are welcome, and there will be drinks and discussion after the talk. Please note that registration is required.
The notes for this seminar are available below:
Speaker
Pablo Ruiz is a PhD student in Natural Language Processing for Digital Humanities at the École Normale Supérieure in Paris.