Research Resources

The Survey of English Usage carries out research in English language Corpus Linguistics. We construct corpora, develop tools and methodologies, and carry out original research into the English language itself. Our recent and current research is summarised on our Research Projects pages. This section of our website is concerned with the dissemination of reference material and other products of our research.

Parsed Corpora

The Research Projects part of the site summarises two major parsed corpora of English.

ICE-GB is the British Component of the International Corpus of English, containing samples of written and spoken contemporary (early 1990s) English. ICE-GB is now available in a Release 2 version with updated software and, optionally, aligned digital audio.

DCPSE, the Diachronic Corpus of Present-Day Spoken English, is the new parsed corpus of spoken English, containing samples from the late 1960s to the early 1990s.

These resources are now available.

Order corpora

Grammatical Schema

The grammar employed in ICE-GB and DCPSE is summarised in the TOSCA/ICE Grammar pages, which serve as a simple glossary. For more detail, we suggest you download the full ICECUP help file from our site.

For a general introduction to English Grammar, see the Internet Grammar of English site. IGE (also available for purchase as a CD) is a complete, web-based, undergraduate-level course in English Grammar.

Corpus Research Tools

ICECUP 3.1 is our state-of-the-art corpus exploration platform, designed from the ground up for disseminating, exploiting and carrying out research with parsed corpora.

Click on a link below to download the software from our website. Both download packages come complete with 20,000-word sample corpora and on-line help.

ICE-GB R2 sample
DCPSE sample

ICECUP IV β

ICECUP IV extends ICECUP 3.1 into a corpus experimentation platform. It was developed under the Next Generation Tools project and is available as a beta release from our website.

Grammatical Query Methodologies

Fuzzy Tree Fragments are structured queries designed for parsed corpora. FTFs were initially developed in the Corpus Query project and have been extended for the release of ICECUP 3.1.

The FTF pages describe Fuzzy Tree Fragments in some detail and explain how they can be used to carry out experiments in grammar using ICECUP. Supporting linguistic experimentation in software, using FTFs, is the subject of a research project.

Sean Wallis has also published a number of articles on statistical methods for corpus linguists on his blog, corp.ling.stats.

This page last modified 12 June, 2013 by Survey Web Administrator.