The Survey of English Usage
Annual Report 2008

News
Research
Publications, conference presentations, etc.

The Annual Report for 2008 incorporates the newsletters published during the year.

1. News

1.1 Wikipedia

The Survey now has a Wikipedia entry which can be viewed here: en.wikipedia.org/wiki/Survey_of_English_Usage

1.2 The SEU’s fiftieth birthday celebrations

Preparations are underway to celebrate the fiftieth birthday of the Survey of English Usage on 14 July 2009 with a symposium entitled ‘Current Change in the English Verb Phrase’, organised by Bas Aarts, Jo Close and Geoffrey Leech. Speakers will include Geoffrey Leech, Stig Johansson, and Manfred Krug. The plenary lecture will be delivered by David Crystal. The symposium precedes the Third International Conference on the Linguistics of Contemporary English, from 15 to 17 July, which the Survey is organising at the Institute for English Studies, Senate House, with colleagues at Queen Mary, University of London.

The symposium and conference programme is now available.

1.3 ESRC: Next Generation Tools project rated ‘outstanding’

We are delighted to report that ESRC reviewers rated the now completed project Next generation tools for linguistic research in grammatical treebanks (R 000 23 1286) as ‘outstanding’. This means that the project “has fully met its objectives and has provided an exceptional research contribution well above average or very high in relation to the level of award”. The final report and the reviewers’ comments can be read on the project’s website.


2. Research projects

2.1 The changing verb phrase in present-day British English: progress report

This project has now been running for sixteen months with Dr Jo Close as the RA. Here we describe the progress of the research.

The work schedule, as shown in the project proposal, is as follows:

1. Preparation (4 months)

  • Read the literature on current change and become familiar with the methods and techniques of diachronic linguistics.
  • Create a bibliography of references; write a literature overview.
  • Training in the use of ICE-GB/DCPSE.
  • Set up a project website.
  • Attend courses.

2. Data collection (4 months)

  • Construct Fuzzy Tree Fragments for the various patterns and constructions.
  • Conduct searches using ICECUP.
  • Create a systematic and 'cleaned-up' database of examples (with context) for each of the research areas listed above.
  • Conduct statistical tests on the perceived changes occurring in the relevant constructions between the two components of DCPSE.

The first two phases have now been successfully completed. They provided the necessary preparation for the third phase of the project, which will consist of separate studies addressing the proposed research questions.

The project website can be viewed from our site.

The following are now available online:

  • a summary of initial results;
  • conference handouts for the first study;
  • paper submitted for publication;
  • a bibliography of references;
  • a sample set of Fuzzy Tree Fragments with a sample database of examples.

Phase 3 has now begun. The first of the proposed studies, on English auxiliary verbs, has been completed, and the results were presented at the International Conference on English Historical Linguistics (ICEHL15) in Munich in August 2008. The PI and Professor Geoffrey Leech organised a workshop entitled 'Watching English Grammar Change' at the first conference of the newly established International Society for the Linguistics of English (ISLE) in Freiburg, Germany. We also presented our first results at that workshop. A paper on this topic has been submitted for publication to the proceedings of ICEHL.

The results from this case study indicate that changes in the verb phrase within a 30-year period are clearly in progress. Specifically, we found that the decrease in must and the increase in have to are both statistically significant. The results highlight the usefulness and importance of corpus-based research because certain forms, which we might predict to be on the increase, are not as frequent as we might expect. For instance, epistemic have to and have got to are extremely rare, and even have got to with root meaning is in decline.
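The report does not say which significance test was used. As a minimal sketch, assuming a 2 x 2 chi-square test (one common choice for comparing two forms across two time periods), the calculation might look as follows. The function name and the counts are hypothetical and invented purely for illustration; they are not DCPSE figures.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]], returned together with its p-value for 1 degree
    of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts (NOT taken from DCPSE):
#                  must   have to
# earlier period    120        80
# later period       70       130
chi2, p_value = chi_square_2x2(120, 80, 70, 130)
print(f"chi-square = {chi2:.2f}, p = {p_value:.6f}")
```

With 1 degree of freedom, a chi-square value above 3.84 corresponds to p < 0.05, the conventional threshold for reporting a change as significant.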

The next study, on the English subjunctive, highlights the importance of using spoken data. Recent accounts of the English subjunctive have suggested that it is on the increase in British English. There is no evidence for this in the spoken DCPSE corpus, perhaps suggesting that the revival of the subjunctive is restricted to particular written genres. The results of this study have been written up as a paper which will be presented at the 30th annual conference of the International Computer Archive for Modern and Medieval English (ICAME) in May 2009 in Lancaster, and submitted to the conference proceedings. The next study, on verb complementation, will then begin.

As mentioned above, the PI and RA are organising a symposium with Geoffrey Leech preceding the Third International Conference on the Linguistics of Contemporary English (ICLCE3) in London in July 2009. The symposium is entitled 'Current Change in the English Verb Phrase'. We have attracted a large number of scholars to this symposium, which will enable us to exchange ideas on methodology and present the results of our work to date. We intend to publish the papers presented at the symposium in an edited volume.

2.2 Next generation tools for linguistic research in grammatical treebanks – new research directions

Alongside further software development, Sean Wallis reports that the 'next generation tools' research is leading to a number of new research directions, particularly in relation to investigating linguistic interaction in parsed corpora. He spoke about measuring linguistic interaction for sequences of decisions at an SEU seminar in the autumn term. This research concentrates less on why speakers make a given choice than on the fact that different sequential choices behave differently: 'in grammar everything seems to interact, but not everything interacts equally'. He has written this up as a paper for Language.

This question is part of some more general research Sean is undertaking into the linguistic interaction between pairs of similar grammatical choices. He is currently working on a posteriori models of case interaction and incorporating this type of model in ICECUP IV.

The idea of modelling 'case interaction' is to estimate the probability that any case taken from a passage is independent from other cases in the passage. Conventional statistics depends on the idea of a random sample, where every datapoint is taken randomly from the 'wild'. There should be no 'designed relationship' between datapoints. Once we cite examples and say they are frequent or otherwise, we are making much the same assumption. However, corpora conventionally consist of entire passages of text, and some sentences may contain multiple cases (e.g. NPs). As Church has famously pointed out (K.W. Church, 2000, 'Empirical Estimates of Adaptation: The chance of Two Noriegas is closer to p/2 than p².' Proceedings of Coling-2000. 180-186.), the chance of a second instance of the same event occurring in a passage may be much higher than its chance of occurring in general language use.
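The effect Church describes can be illustrated with a small sketch. It assumes a simple split-half estimate: each text is cut into a first and a second half, and the chance of a word appearing in the second half is compared with the same chance given that the word already appeared in the first half. The function name and the four toy texts are hypothetical and serve only to show the arithmetic.

```python
def adaptation(docs, word):
    """Return (prior, adapted) estimates for `word`.

    prior   = share of texts whose second half contains the word
    adapted = the same share, restricted to texts whose first half
              also contains the word
    """
    in_first = in_second = in_both = 0
    for tokens in docs:
        mid = len(tokens) // 2
        first, second = set(tokens[:mid]), set(tokens[mid:])
        in_first += word in first
        in_second += word in second
        in_both += (word in first) and (word in second)
    prior = in_second / len(docs)
    adapted = in_both / in_first if in_first else float("nan")
    return prior, adapted


# Toy texts, invented for illustration only.
docs = [
    "noriega said noriega would".split(),
    "the cat sat down".split(),
    "a dog ran off".split(),
    "noriega left the country today now".split(),
]
prior, adapted = adaptation(docs, "noriega")
print(prior, adapted, prior ** 2)
```

If cases were independent, the chance of seeing the word in both halves would be close to the prior squared. In the toy data the adapted estimate is far higher than the prior, which is the pattern that makes the random-sample assumption questionable for whole passages.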

An a posteriori model of case interaction can be constructed by measuring the degree by which one outcome affects another at a given 'distance' from other cases in the same sample. The question is then, what do we mean by 'distance'? In a parsed corpus, 'distance' might be lexical or grammatical, and we might count each intervening term as having the same 'size', among other options. A grammatical model of distance depends on a model of grammar.
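A minimal sketch of such a measurement follows. It is not the model planned for ICECUP IV; it assumes the simplest possible notion of distance, namely the number of cases separating two cases within the same passage, and it treats each passage as an ordered list of outcome labels. The function name and the toy passages are hypothetical.

```python
from collections import defaultdict


def interaction_by_distance(passages, outcome, max_distance=3):
    """Estimate, for each distance d, the probability that a case has
    `outcome` given that the case d positions earlier in the same
    passage also has it. Also return the overall baseline rate."""
    total = sum(len(cases) for cases in passages)
    baseline = sum(cases.count(outcome) for cases in passages) / total
    hits = defaultdict(int)
    trials = defaultdict(int)
    for cases in passages:
        for i, first in enumerate(cases):
            if first != outcome:
                continue
            for d in range(1, max_distance + 1):
                if i + d < len(cases):
                    trials[d] += 1
                    hits[d] += cases[i + d] == outcome
    conditional = {d: hits[d] / trials[d] for d in sorted(trials)}
    return baseline, conditional


# Toy passages, invented for illustration: each letter is the outcome
# of one grammatical choice, in order of occurrence.
passages = [["A", "A", "B", "B"], ["B", "B", "A", "A"]]
baseline, conditional = interaction_by_distance(passages, "A")
print(baseline, conditional)
```

Where the conditional rate at a given distance departs from the baseline, the cases are not behaving independently at that distance. A grammatical measure of distance would replace the position count used here with a count over the parse tree.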

We are increasingly of the opinion that experiments with modelling case interaction can provide new insights into how grammar works. For example, Sean has demonstrated that a particular alternation (relative vs. non-finite clauses) is influenced by the value of a second alternation approximately twice as much when the two choices are co-ordinated than when they are not.

2.3 The International Corpus of English

Sean Wallis attended an International Corpus of English workshop organised by Survey alumnus Professor Gerald Nelson at the Chinese University of Hong Kong. He met with ICE teams to discuss the process of POS-tagging ICE corpora. He demonstrated how the ICE tagger may be used to tag a corpus and how ICECUP can be used to correct the tags, and discussed the steps needed to start the effective tagging of ICE corpora. More information, including software downloads, is on our website.

Tagger (zip)
Discussion document (PDF)

2.4 ICECUP 3.1 and DCPSE

During the summer of 2008 we published the beta release of ICECUP IV, a new research platform for experimental research in parsed corpora. It is available for download from our website. ICECUP IV is compatible with ICE-GB Release 2 and DCPSE.

  • Users possessing a full CD copy of ICE-GB Release 2 or DCPSE can download the software entirely free of charge to carry out research on their corpus.
  • Anyone can download the two 20,000-word sample corpora with ICECUP IV.
  • ICECUP IV was developed under the ESRC Next Generation Tools research project. We are working towards a final release incorporating models of case interaction and additional analytical tools.


3. Publications, conference presentations, talks, theses and other studies using Survey material

Please let us know if you would like us to include your publications based on SEU material. We will appreciate it if you send us offprints of any such publications.

Aarts, Bas (2008) Approaches to the English gerund. In: Graeme Trousdale and Nikolas Gisborne (eds.) Constructional explanations in English grammar. Topics in English Linguistics 57. Berlin and New York: Mouton de Gruyter. 11-31.

Aarts, Bas (2008) Chinese translation of 'Corpus construction: a principle for qualitative data collection'. (With Martin W. Bauer.) In: Martin W. Bauer and George Gaskell (eds.) Qualitative researching: with text, image and sound. London: Sage. Wunan Book Co., Taipeh, Taiwan. [http://www.wunan.com.tw/bookdetail.asp?no=9179]

Aarts, Bas (2008) The subjunctive conundrum. Plenary lecture at the First International Conference of the International Society for the Linguistics of English (ISLE). Freiburg, Germany.

Aarts, Bas (2008, with Jo Close) Current change in the modal system of English: a case study of must, have to and have got to. Paper presented at the 15th International Conference on English Historical Linguistics. Munich, Germany.

Aarts, Bas (2008, with Jo Close) Must and its rivals in the Diachronic Corpus of Present-Day Spoken English. Paper presented at the First International Conference of the International Society for the Linguistics of English (ISLE). Freiburg, Germany.

Aijmer, Karin (2008) Well in a social and regional context. Paper presented at the First International Conference of the International Society for the Linguistics of English (ISLE). Freiburg, Germany.

Aijmer, Karin (2008) Modal adverbs in interaction - obviously and definitely in adolescent speech. In: Nevalainen et al. (2008) (eds.), 61-83.

Bogaert, Julie van (2008) The interrogative mental state predicates do you think, do you suppose and do you believe: a corpus-based synchronic study of grammaticalization. Paper presented at New Reflections on Grammaticalization 4, Leuven, Belgium.

Brook O'Donnell, Matthew (2008) "From corpus to query and back again": The Survey of English Usage and ICECUP paradigm. International Journal of Corpus Linguistics 13.3. 387-401.

Close, Jo (2008, with Bas Aarts) Current change in the modal system of English: a case study of must, have to and have got to. Paper presented at the 15th International Conference on English Historical Linguistics. Munich, Germany.

Close, Jo (2008, with Bas Aarts) Must and its rivals in the Diachronic Corpus of Present-Day Spoken English. Paper presented at the First International Conference of the International Society for the Linguistics of English (ISLE). Freiburg, Germany.

Collins, Peter (2008) The English modals and semi-modals. In: Nevalainen et al. (2008) (eds.), 129-145.

Dehé, Nicole and Anne Wichmann (2008) The prosody of initial comment clauses - evidence for main clause or discourse status. Paper presented at New Reflections on Grammaticalization 4, Leuven, Belgium.

Depraetere, Ilse and An Verhulst (2008) Source of modality: a reassessment. English Language and Linguistics 12.1, 1-25.

Diaconu, Gabriela (2008) Modality and modal verbs in the new Englishes. Paper presented at the First International Conference of the International Society for the Linguistics of English (ISLE). Freiburg, Germany.

Gries, Stefan Th. (2008) Corpora and Grammar. In: A. Lüdeling and M. Kytö, Corpus Linguistics, Handbooks of Linguistics and Communication Science (HSK Series), Berlin and New York: Mouton de Gruyter. 933-951.

Gries, Stefan Th. and Anatol Stefanowitsch (2008) Channel and constructional meaning: a collostructional case study. In: G. Kristiansen and René Dirven (eds.). Cognitive sociolinguistics. Berlin, New York: Mouton de Gruyter, 129-52.

Gries, Stefan Th. and Martin Hilpert (2008) The identification of stages in diachronic data: variability-based neighbor clustering. Corpora 3.1, 59-81.

Hoffmann, Thomas (2008) English relative clauses and construction grammar: a topic which preposition placement can shed light on? In: Graeme Trousdale and Nikolas Gisborne (eds.) Constructional explanations in English grammar. Topics in English Linguistics 57. Berlin and New York: Mouton de Gruyter. 77-112.

Kaltenböck, Gunther (2008) On the multifunctionality of parenthetical I think. Paper presented at the 29th ICAME conference, Ascona, Switzerland.

Kolbe, Daniela and Benedikt Szmrecsanyi (2008) Complementizer choice in written and spoken English. Paper presented at the First International Conference of the International Society for the Linguistics of English (ISLE). Freiburg, Germany.

Kortmann, Bernd and Benedikt Szmrecsanyi (2008) Analyticity and syntheticity in L2 varieties and learner varieties of English. Paper presented at the First International Conference of the International Society for the Linguistics of English (ISLE). Freiburg, Germany.

Leech, Geoffrey (2008, with Nicholas Smith) Changing patterns of grammatical frequency over the 20th century: evidence from comparable corpora of the Brown family 1901-1991. Paper presented at the First International Conference of the International Society for the Linguistics of English (ISLE). Freiburg, Germany.

Mair, Christian (2008) Right in the middle of the s-shaped curve: on the spread of specificational clefts in 20th century English. Paper presented at the First International Conference of the International Society for the Linguistics of English (ISLE). Freiburg, Germany.

Mukherjee, Joybrato and Marco Schilk (2008) Verb-complementation profiles across varieties of English. In: Nevalainen et al. (2008) (eds.), 163-181.

Nevalainen, Terttu, Irma Taavitsainen and Päivi Pahta (2008) Exploring the dynamics of linguistic variation through public and private corpora. In: Nevalainen et al. (2008) (eds.), 1-15.

Nevalainen, Terttu, Irma Taavitsainen, Päivi Pahta and Minna Korhonen (2008) (eds.) The dynamics of linguistic variation: corpus evidence on English past and present. Studies in Language Variation 2. Amsterdam: John Benjamins. ISBN 978 90 272 3482 7.

Pearce, Michael (2007) The Routledge dictionary of English language studies. London: Routledge.

Peters, Pam (2008) Patterns of negation. In: Nevalainen et al. (2008) (eds.), 147-162.

Sand, Andrea (2008) Angloversals? In: Nevalainen et al. (2008) (eds.), 183-202.

Schneider, Agnes (2008) The expression of past time reference in spoken L2 varieties of English. Paper presented at the First International Conference of the International Society for the Linguistics of English (ISLE). Freiburg, Germany.

Trousdale, Graeme and Thomas Hoffmann (2008) Variation, change and constructions in English: theory and method. Paper presented at the First International Conference of the International Society for the Linguistics of English (ISLE). Freiburg, Germany.

Wallis, Sean (2008) Searching treebanks and other structured corpora. In: A. Lüdeling, M. Kytö, Corpus linguistics: an international handbook, Handbooks of Linguistics and Communication Science (HSK Series), Volume 1, Berlin and New York: Mouton de Gruyter. 738-759.

Wichmann, Anne (2008) Speech corpora. In: A. Lüdeling, M. Kytö, Corpus linguistics: an international handbook, Handbooks of Linguistics and Communication Science (HSK Series), Volume X, Berlin and New York: Mouton de Gruyter. 187-207.


Bas Aarts
Director

March 2009

This page last modified 21 July, 2014 by Survey Web Administrator.