UCL Centre for Digital Humanities

Visualizing the Transcribe Bentham corpus

06 December 2016, 5:30 pm–6:30 pm

Event Information

Open to

All

Organiser

UCLDH

Location

UCL Centre for Digital Humanities
Gower Street
LONDON
WC1E 6BT
United Kingdom

How can we gain an overview of the 17,000 pages of Bentham's manuscripts made available by Transcribe Bentham? Methods that provide an overview of the corpus may help domain experts find the areas of the corpus relevant to their research. In this work we have applied computational techniques to visualize the corpus, providing a general view of its content.

First, a lexical extraction was performed to choose the terms used to model the corpus. Then, term clusters were created based on the similarity between the terms' contexts, and visualized as corpus maps. The maps provide an overview of the corpus as a whole, as well as of the terms that are more prominent in different periods of the corpus. The issue of evaluating these corpus maps will also be discussed.
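
The idea of clustering terms by the similarity of their contexts can be illustrated with a minimal sketch. This is not the speaker's pipeline, which the abstract does not describe in detail. It assumes a simple word-window context, cosine similarity, and a greedy threshold-based grouping. The function names, the stopword list, the threshold, and the toy sentences are all hypothetical and are not taken from the Bentham corpus.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, terms, window=2, stopwords=frozenset()):
    """Count the words that occur within `window` tokens of each term."""
    vectors = {term: Counter() for term in terms}
    for tokens in sentences:
        for i, token in enumerate(tokens):
            if token not in vectors:
                continue
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i and tokens[j] not in stopwords:
                    vectors[token][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_terms(vectors, threshold=0.5):
    """Greedy grouping: a term joins the first cluster that already
    contains a term whose context is similar enough (assumed rule)."""
    clusters = []
    for term in sorted(vectors):
        for cluster in clusters:
            if any(cosine(vectors[term], vectors[other]) >= threshold
                   for other in cluster):
                cluster.append(term)
                break
        else:
            clusters.append([term])
    return clusters


# Invented toy sentences, for illustration only.
sentences = [
    "the law punishes the offender".split(),
    "the statute punishes the offender".split(),
    "the prison houses the inmate".split(),
    "the panopticon houses the inmate".split(),
]
terms = ["law", "statute", "prison", "panopticon"]

vectors = context_vectors(sentences, terms, stopwords={"the"})
clusters = cluster_terms(vectors)
print(clusters)
```

In this toy setting, terms that share the same neighbouring words end up in the same group. A real corpus map would need a proper term-extraction step, a more robust clustering method, and a layout step to draw the clusters, none of which is shown here.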

All are welcome, and there will be drinks and discussion after the talk. Please note that registration is required.

The notes for this seminar are available below:

Speaker

Pablo Ruiz is a PhD student in Natural Language Processing for Digital Humanities at the École Normale Supérieure in Paris.