A brief history of the Survey of English Usage

The Survey of English Usage ('the Survey') was founded in 1959 by Randolph (now Lord) Quirk. Many well-known linguists have spent time doing research at the Survey, among them Valerie Adams, John Algeo, Dwight Bolinger, Noël Burton-Roberts, David Crystal, Derek Davy, Jan Firbas, Sidney Greenbaum, Liliane Haegeman, Robert Ilson, Ruth Kempson, Geoffrey Leech, Terttu Nevalainen, Jan Rusiecki, Jan Svartvik and Joe Taglicht.

The 'Gang of Four' in the 1970s (left to right):
Quirk, Greenbaum, Svartvik and Leech.

The 'Gang of Five' in 1983 (left to right):
Svartvik, Crystal, Greenbaum, Leech and Quirk.

The 'Quirk Corpus'

The million-word Survey Corpus, now complete, samples written and spoken British English produced between c.1955 and 1985. It comprises 200 texts, each of 5,000 words. The spoken texts include both dialogue and monologue, while the written texts include not only printed and manuscript material but also examples of English read aloud, as in broadcast news and scripted speeches.

The Survey in the 70s

The Survey Corpus was originally compiled on paper, in the form of many thousands of slips, with detailed grammatical annotations. This has now been computerized and each lexical item has been automatically tagged for wordclass. (It is available on the network of computers at the Survey premises. The original sound recordings may also be consulted at the Survey.) 

Hundreds of publications have used, and continue to use, material from the Survey Corpus, either in its original slip form or in its later computerized form, where it was known more familiarly as the London-Lund Corpus (LLC).

Foster Court in the late 1940s:
the Survey premises are on the top right of the picture.

The International Corpus of English

Randolph Quirk was succeeded in 1983 by Professor Sidney Greenbaum, who was Director until 1996. The ICE project began in 1990, with the Survey responsible for the international coordination of the project and for the compilation of ICE-GB, its British component. The Survey has produced the grammatical and syntactic annotation schemes for the ICE corpora as well as numerous software packages to support their compilation.

Bas Aarts took over as Director of the Survey in January 1997. The first release of ICE-GB took place in 1998. ICE-GB was distributed with ICECUP, software for searching and exploring the parsed corpus. Release 2 of ICE-GB is now available on CD (optionally with sound files).

The Diachronic Corpus of Present-Day Spoken English

A recent project at the Survey undertook the parsing of a large (400,000-word) selection of the spoken part of the LLC in a manner directly comparable with ICE-GB, forming a new 800,000-word diachronic corpus called the Diachronic Corpus of Present-Day Spoken English (DCPSE). DCPSE has now been released and is available on CD.

For more on our current research, see here.

This page last modified 14 May, 2020 by Survey Web Administrator.