Archive for December, 2010

Transcribe Bentham makes the New York Times

By UCLDH, on 27 December 2010

UCLDH’s very own Transcribe Bentham project gets written up in the New York Times:

Starting this fall, the editors have leveraged, if not the wisdom of the crowd, then at least its fingers, inviting anyone — yes, that means you — to help transcribe some of the 40,000 unpublished manuscripts from University College’s collection that have been scanned and put online. In the roughly four months since this Wikipedia-style experiment began, 350 registered users have produced 435 transcripts.

These transcripts, which are reviewed and corrected by editors, will eventually be used for printed editions of the collected works of Bentham, whose preserved corpse, clothed and seated, has greeted visitors to the college since 1850.

Other initiatives have recruited volunteers online, but the Bentham Project is one of the first to try crowd-sourced transcription and to open up a traditionally rarefied scholarly endeavor to the general public, generating both excitement and questions.

Read the full article.

Claire Warwick on “culturomics”

By Rudolf Ammann, on 17 December 2010

Claire Warwick gets quoted in The Guardian on a new project in which Harvard University and Google open millions of digitised books to quantitative analysis:

Claire Warwick, director of the Centre for Digital Humanities at University College London, said that humanities researchers had been using the word-frequency techniques being described by Michel and Aiden for several decades. But the sheer size of their dataset marked it out from the usual tools. “What’s different is that this allows people to not just look at several hundred thousand words or several million words but several million books. So the overview is much bigger. That may bring out some hitherto unexpected ideas.”

The database of 500bn words is thousands of times bigger than any existing research tool, with a sequence of letters that is 1,000 times longer than the human genome. The vast majority, around 72%, is in English with small amounts in French, Spanish, German, Chinese, Russian, and Hebrew.

“In science, huge datasets which people have used super-computing on have led to some fascinating new discoveries that otherwise wouldn’t be possible,” said Warwick. “Whether that’s going to be the same in the arts and humanities, I don’t know yet.”

The scanned books can now be mined for cultural trends with very little effort using Google’s Ngram Viewer.

“One of the ways to use this is to suggest ideas,” said Warwick. “You can look at something like this and say, how fascinating that a certain term seems to occur so commonly and I wonder why that should be.”

On 17 March, Claire will deliver a public Lunch Hour Lecture at UCL on Twitter and Digital Identity.

QRator Project

By Claire Ross, on 16 December 2010

UCLDH has a new project, small, but hopefully perfectly formed.

We are working with UCL Museums and Collections and CASA on a project called QRator.

With the help of UCL’s Public Engagement Unit Innovation Seed funding, the QRator project is exploring how handheld mobile devices, QR codes and interactive digital labels can create new models for public engagement, personal meaning making and the construction of narrative opportunities inside museum spaces. The project aims to engage members of the public within the Grant Museum by allowing them to become the ‘Curators’.

The project aims to work with UCL museums to become a true forum for academic-public debate, using low cost, readily available technology, enabling the public to collaborate and discuss museum concepts and object interpretation with museum curators, academic researchers and each other.

The Grant Museum has some really brilliant specimens, a playful attitude and a refreshing outlook for pushing the boundaries of how museums should/could behave. The team behind CASA’s wicked Tales of Things project are providing the technical knowhow and the development, and UCLDH is undertaking the user evaluation.

You can find out a bit more about the QRator project at the UCL Public Engagement site.

CELM: Summary and reflections on London seminar #3

By Claire Warwick, on 6 December 2010

Despite the snow we had a remarkably good turnout for the third London Seminar in Digital Text and Scholarship, which proved thoroughly enjoyable. Thank you to everyone who got there! However, some of our usual attendees couldn’t make it but were sufficiently intrigued, when following the tweets, to want to find out more about what was said. As a result, Henry Woudhuysen has kindly agreed to produce a summary of his talk, including some thoughts on how CELM might develop in future. This may not be as good as being there, but I hope it’s the next best thing:

In 1966, Peter Beal, a graduate of the University of Leeds, started work on the Index of English Literary Manuscripts, 1450-1700 (IELM). The first volume of what was originally meant to be a one-year project appeared in 1980. Its two parts covered the years 1450 to 1625 and were followed, in 1987 and 1993, by a further volume in two parts taking the coverage up to 1700. For the first time, English Renaissance scholars had a full catalogue of the manuscripts – autograph and scribal – of the major authors of the period. The Index covered verse, prose, dramatic, and miscellaneous works, including letters, documents, books owned, presented, and annotated by the authors, and related items. In 23,000 entries, Peter Beal covered the works of 128 authors, of whom two were women; his choice was determined by a decision to base the whole project (further volumes covered the eighteenth and nineteenth centuries) around authors with entries in The Concise Cambridge Bibliography of English Literature (1974). Each author’s entry begins with a valuable introduction giving an overview of the surviving material.

The project initiated a series of investigations into what has come to be known as ‘scribal publication’, and this phenomenon in itself has contributed an important element to the study of the history of the book in Britain. Following the Catalogue’s publication, Peter Beal continued to collect material relating to the authors whose manuscripts he had already described, and by the early years of the new century he was ready to find a way of updating the Index. A proposal in 2004 to the Arts and Humanities Research Board (later Council) for a five-year project to create an enhanced digital version of the Index was successful, and the following year work began on The Catalogue of English Literary Manuscripts, 1450-1700 (CELM). Since then, Peter Beal has continued his researches, assisted by John Lavagnino of King’s College London’s Centre for Computing in the Humanities, who has acted as the project’s technical advisor, while I have acted as its general overseer, along with a distinguished international advisory panel.

CELM will cover the work of around 200 authors (60 of them women) in some 40,000 entries. The author entries range in length from having no items (Emilia Lanier and Isabella Whitney) to having only one (Thomas Deloney and Sir Thomas Elyot), to having around 4,500 (John Donne). A conference relating to the project was held at King’s in the summer of 2009, when the database was shown to a number of scholars, and the Catalogue will be launched online as an open-access resource at a larger event in the summer of 2011.

Work on CELM began with keyboarding all the entries in IELM, turning the contents of the books into a database. In many ways, CELM is instantly recognisable as a digital version of IELM. However, whereas IELM was solely based around authors, CELM has a repository view as well as an author view – both are available in longer and shorter forms. The repository view allows the user to see what is available in some 500 locations from Aberdeen University Library to the Zentralbibliothek, Zürich, Switzerland, by way of numerous Private Owners and Untraced items. Even though the repository view only contains descriptions of items by CELM’s authors, it is a major step towards producing what is in effect a short-title catalogue of English manuscripts of the period.

Much thought has been given to the question of tagging material and to the possibilities of full-text searching. For example, it will be easy to find some specific literary genres, such as verse letters or epigrams, by text searching, but other ‘hidden’ categories, such as women, scribes, compilers, owners, collectors, composers, dealers, bindings and binders, will remain elusive unless tagged. The project has enormous scope for further development. It might, for example, supply links to library home pages and their catalogues and to related digital projects such as ODNB, Perdita, and the Electronic Enlightenment. Most importantly of all, there are several areas where CELM offers a valuable starting point for further research: for example, into paper and bindings, auction and booksellers’ catalogues, the history of scribal publication, literary genres, authorship, and collecting. One obvious development would be to link entries to images of the material that is being described, while in time it is hoped that full descriptions of each manuscript referred to can be created. There is scope for more work on as yet unvisited repositories, as well as for including more authors and literary types, especially anonymous works. Some thought has already been given to how to maintain the website and how to signal the addition of new material to users.
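For readers who think in code, the difference between full-text searching and tagging, and between the author and repository views, might be sketched roughly as follows. This is only an illustration: the Entry record, its field names, the tag vocabulary and the sample data are all invented here and say nothing about how CELM itself is actually built.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Entry:
    """A hypothetical catalogue entry; not CELM's real schema."""
    author: str
    repository: str
    description: str
    tags: set = field(default_factory=set)  # assumed editorial tags


# Invented sample records for illustration only; not real CELM entries.
ENTRIES = [
    Entry("Author A", "Library X", "Verse letter, scribal copy in a miscellany", {"scribe"}),
    Entry("Author A", "Library Y", "Epigram, autograph"),
    Entry("Author B", "Library X", "Prose treatise in a contemporary calf cover", {"binding", "owner"}),
]


def group_by(entries, key):
    """Group entries by a field name, e.g. 'author' or 'repository'."""
    groups = defaultdict(list)
    for entry in entries:
        groups[getattr(entry, key)].append(entry)
    return dict(groups)


def text_search(entries, phrase):
    """Case-insensitive substring match on the description only."""
    phrase = phrase.lower()
    return [e for e in entries if phrase in e.description.lower()]


def tag_search(entries, tag):
    """Return entries that an editor has explicitly tagged."""
    return [e for e in entries if tag in e.tags]


author_view = group_by(ENTRIES, "author")
repository_view = group_by(ENTRIES, "repository")
```

In this toy data, text_search(ENTRIES, "verse letter") finds the genre because the phrase appears in the description, whereas text_search(ENTRIES, "binding") finds nothing: the description of the bound item never uses that word, and only tag_search(ENTRIES, "binding") retrieves it. The same three records also serve both views, grouped once by author and once by repository.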

What began as a simple one-year survey of what was thought to be quite a limited field has grown, through Peter Beal’s extraordinary labours, into a vast digital project that will be essential to the work of all scholars of the period.

H.R. Woudhuysen