
A brief history of the Survey of English Usage
The Survey of English Usage ('the Survey') was founded in 1959
by Randolph (now Lord) Quirk. Many well-known
linguists have spent time doing research at the Survey. Among them
are Valerie Adams, John Algeo, Dwight Bolinger, Noël Burton-Roberts,
David Crystal, Derek Davy, Jan Firbas,
Sidney Greenbaum, Liliane Haegeman,
Robert Ilson, Ruth Kempson, Geoffrey Leech,
Jan Rusiecki, Jan Svartvik and Joe Taglicht.
The 'Quirk Corpus'
The million-word Survey Corpus, now complete, samples written and
spoken British English produced between c.1955 and 1985. It comprises
200 texts, each of 5,000 words. The spoken texts include both dialogue
and monologue, while the written texts include not only printed
and manuscript material but also examples of English read aloud,
as in broadcast news and scripted speeches.
The Survey Corpus was originally compiled on paper, in the form
of many thousands of slips, with detailed grammatical annotations.
The corpus has now been computerized, and each lexical item has been automatically
tagged for wordclass. (It is available on the network of computers
at the Survey premises. The original sound recordings may also be
consulted at the Survey.)
Hundreds of publications have used, and continue to use, material
from the Survey Corpus, either in its original slip form or in its
later computerized form, where it was known more familiarly as the
London-Lund Corpus (LLC).
The International Corpus of English
Randolph Quirk was succeeded in 1983 by Professor Sidney Greenbaum,
who was Director until 1996. The ICE project began in 1990, with
the Survey responsible for the international coordination of the
project and for the compilation of ICE-GB,
the British component of the project. The Survey has produced the
grammatical and syntactic annotation schemes for the ICE corpora
as well as numerous software packages to support the compilation
of the project.
Bas Aarts took over as Director of the Survey in January 1997.
The first release of ICE-GB took place in 1998. ICE-GB was distributed
with ICECUP, software for searching and exploring the parsed corpus.
Release 2 of ICE-GB is now available on CD (optionally with sound
files).
The Diachronic Corpus of Present-Day Spoken English
A recent project at the Survey undertook the parsing of a large
(400,000-word) selection of the spoken part of the LLC in a manner
directly comparable with ICE-GB, forming a new, 800,000-word diachronic
corpus, called the Diachronic Corpus of Present-Day Spoken English
(DCPSE). DCPSE has now been released
and is available on CD.
For more on our current research, see here.
This page last modified
20 October, 2008
by Survey Web Administrator.