Corpus Queries
Development of an effective grammatical query methodology in the
context of a parsed corpus
funded by

Ref: R 000 22 2598
Institution: University College London
Department: Department of English (Survey of English Usage)
Investigator: Sean Wallis
Period: 1 March 1998 to 31 January 1999 (leave of absence
in November 1998)
Original aims and objectives
In recent years, corpus linguistics has developed dramatically,
due to increased computing power and improvements in annotation
software. This has precipitated a growth in the scale and complexity
of corpora, including the new grammatically annotated ICE-GB corpus.
Text corpora have been used both to improve software tools, such
as grammatical parsers, and to improve our understanding of language.
The research aims to develop a linguistically plausible and transparent
method of forming queries for grammatical corpora.
The proposal is to use fragments of grammatical trees as the main
representation for queries. These "fuzzy tree fragments"
appeal because of the obvious parallel with familiar grammatical
structure. The difference is that a query must capture both what
is known and what is unknown: some components and relations may
be omitted or "fuzzy". Developing this notion of "fuzziness"
is a major part of the research.
Complex queries may then be constructed by combining these tree
fragments with sociolinguistic variables using a logical language.
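As a rough illustration of the idea, the sketch below shows one way a tree fragment with unspecified parts might be matched against a parsed tree and then combined with a sociolinguistic variable. It is not the ICECUP implementation: the class and function names (Node, Fragment, Unit, has_fragment, variable_is, all_of, negate), the "eventual" flag and the "text_category" variable are all hypothetical, and the node labels are illustrative only. For brevity the matcher ignores word order and does not require different child fragments to match different nodes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, List, Optional


@dataclass
class Node:
    """A node in a parsed tree, labelled with a function and a category."""
    function: str
    category: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class Fragment:
    """A query node. A label left as None is unspecified and matches anything."""
    function: Optional[str] = None
    category: Optional[str] = None
    children: List["Fragment"] = field(default_factory=list)
    # Hypothetical "fuzzy" relation: if True, this fragment may match any
    # descendant of its parent's match, not only an immediate child.
    eventual: bool = False


@dataclass
class Unit:
    """A parsed text unit plus sociolinguistic variables describing its source."""
    tree: Node
    variables: Dict[str, str]


def descendants(node: Node) -> Iterator[Node]:
    """Yield every node below `node`, depth first."""
    for child in node.children:
        yield child
        yield from descendants(child)


def matches(frag: Fragment, node: Node) -> bool:
    """True if the fragment, rooted at `node`, is satisfied by the tree."""
    if frag.function is not None and frag.function != node.function:
        return False
    if frag.category is not None and frag.category != node.category:
        return False
    for sub in frag.children:
        candidates = descendants(node) if sub.eventual else node.children
        if not any(matches(sub, c) for c in candidates):
            return False
    return True


# A query is any predicate over a text unit, so queries compose logically.
Query = Callable[[Unit], bool]


def has_fragment(frag: Fragment) -> Query:
    """Query that holds if the fragment matches anywhere in the unit's tree."""
    return lambda unit: any(
        matches(frag, n) for n in [unit.tree, *descendants(unit.tree)]
    )


def variable_is(name: str, value: str) -> Query:
    """Query that holds if a sociolinguistic variable has the given value."""
    return lambda unit: unit.variables.get(name) == value


def all_of(*queries: Query) -> Query:
    """Logical AND of several queries."""
    return lambda unit: all(q(unit) for q in queries)


def negate(query: Query) -> Query:
    """Logical NOT of a query."""
    return lambda unit: not query(unit)


# Illustrative tree for a clause such as "The cat sat".
tree = Node("PU", "CL", [
    Node("SU", "NP", [Node("DT", "DTP"), Node("NPHD", "N")]),
    Node("VB", "VP", [Node("MVB", "V")]),
])
unit = Unit(tree, {"text_category": "spoken"})

# "A clause containing a noun somewhere beneath it": the function labels are
# left unspecified and the parent-child relation is fuzzy.
clause_with_noun = Fragment(
    category="CL", children=[Fragment(category="N", eventual=True)]
)
query = all_of(
    has_fragment(clause_with_noun), variable_is("text_category", "spoken")
)
print(query(unit))  # True
```

Treating each query as a predicate keeps the logical layer independent of how fragments are matched, so the same combinators would apply to any other kind of test on a text unit.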
This project will run concurrently with the first release of the
ICE-GB corpus, and an early prototype of
the system will be provided at this point. Feedback from end users
will be used to aid further development.
Comment
Although this project was very modest in duration and scope, the
results proved to be extremely important and influential. The Corpus
Query project permitted the development of Fuzzy Tree Fragments
and ICECUP 3.0. The software was indeed published alongside ICE-GB
Release 1 in 1998, and has continued to improve ever since.
See also
Research Results
Fuzzy Tree Fragments
ICECUP software
This page last modified
23 October, 2009
by Survey Web Administrator.