
Jan Svartvik
Serendipitously, browsing through the departmental library [at
Uppsala], my eyes one day fell on a copy of English Studies with
an article called "Relative clauses in educated spoken English"
by a certain R. Quirk. I was instantly hooked by this approach:
studying English that was contemporary - and spoken at that, based
on a corpus of audio recordings, both surreptitious and non-surreptitious - it even smelled of cloak and dagger. Armed with a British Council
scholarship, largely thanks to the support of my two Uppsala professors,
Erik Tengstrand in language and H. W. Donner in literature, I left
for north-east England to spend the 1959-1960 academic session at
the University of Durham under the guidance of R. Quirk (1957).
It's an amusing coincidence that the location of both my American
and British universities carries the same name (but of course with
different pronunciations). My private life at Durham, UK, was also
different from that of Durham, USA. I was now married, and this
created a problem for university administration. The solution was
to make me "a member of the Senior Common Room", as the term went.
Fortunately, suitable lodgings were found at Lumley Castle in Chester-le-Street,
halfway between Durham and Newcastle. My wife Gunilla (aged 21)
and I (aged 28) found ourselves the inhabitants of a medieval castle
dating back to 1348 ("that's a hundred years before Columbus", American
visitors were told), surrounded by a ha-ha and adjoining
a golf course, which led to a long but unrequited love affair with
the game of golf. Academically, on festive college occasions on
Palace Green, I experienced a port decanter circulating (clockwise of
course), and enjoyed weekly tutorials with Randolph Quirk,
an inspiring mentor with an enviable zest for work, who taught me
the virtues of close corpus observation and a broad grounding in
theoretical approaches, including British Firthian, American structuralist
(Sapir, Bloomfield, Zelig Harris, Gleason, Hockett, Pike, Archibald
Hill and James Sledd), Jan Firbas and other members of the Prague
School and, of course, the great Dane Jespersen (who lived in Helsingör,
a town which, on a clear day, I can now glimpse from the Swedish
shoreline). Leaving Durham with the embryo of a thesis which was
delivered in Uppsala in the spring of 1961, I was decorated with
a licentiate degree. A week after submitting the thesis, I received
a letter from Randolph Quirk offering me an assistantship on the
Survey of English Usage at University College London where he had
moved from Durham - an offer which I naturally couldn't refuse.
A four-year London period gave me the opportunity of working with
many other young linguists attached to the English Department: one
was Sidney Greenbaum, who later held academic posts in the United
States before returning to London as holder of the Quain Chair,
the oldest English language chair in Britain; another was Geoffrey
Leech, then a postgraduate somewhat unenthusiastically studying
the language of television commercials, who later left for Lancaster
University where he successfully contributed to making his department
one of the world-leading centres of stylistics, pragmatics and corpus
linguistics; a third was David Crystal, who later began a career
at the Universities of Bangor and Reading and, with his unique penchant
for both scholarship and popularization, ended up as an extremely
successful international speaker and writer on linguistic topics.
This was indeed a period when University College was rich in inspiring
linguists, and I was lucky to be in the right place at the right
time. Upstairs from the Survey, Michael Halliday had set up a linguistics
department with colleagues like Bob Dixon, Rodney Huddleston and
Dick Hudson, and next door was the Phonetics Department with A.
C. Gimson, J. D. ("Doc") O'Connor and John Wells. Just down the
road at the School of Oriental and African Studies, R. H. Robins
lectured on general linguistics. We could widen our horizons still
further by attending meetings of the London University Linguistics
Circle at Bedford College. I particularly remember a talk there
by John Lyons who, returning from a period at Indiana University,
declared he was "a transformational linguist" - a bold confession
to make to a largely Firth-inspired audience.
Our first computer project on the Survey was based on some of the
early collected spoken texts and written up in a paper submitted
to the Ninth International Congress of Linguists, in Cambridge,
Massachusetts (Quirk et al. 1964). It was a study of the correspondence
of prosodic to grammatical features, an exciting topic I was later
to return to in my work on spoken English, which I found strikingly
different from the written variety. This research also gave me the
first idea of how a computer's ability to store and manage information
could be put to good use for a linguist.
One day, I believe it was in 1963, Nelson Francis from Brown University
turned up at UCL, walked into the office and dumped one of those
huge computer tapes on Quirk's desk with the accompanying words
"Habeas corpus". This was the Brown Corpus which Nelson Francis
and Henry Kucera had just completed - the first large computerized
text collection of American English for linguistic analysis. Its
size of one million words and principles of text selection set a
pattern for the creation of later corpora. Over the years, the Brown
corpus has been used in English departments all over the world as
a unique source of data and a means of exploring the ways in which
computers can be employed in language research.
One of the milestones in the history of twentieth-century linguistics
was of course the publication in 1957 of Syntactic Structures
by Noam Chomsky, which imposed a constricting religiosity upon
linguistics departments, especially American ones, and set off the
decades-long worldwide dominance of transformational generative (TG) linguistics.
His view of the inadequacy of corpora and the adequacy of intuition
became the orthodoxy of succeeding generations of theoretical linguists:
Any natural corpus will be skewed. Some sentences won't occur
because they are obvious, others because they are false, still
others because they are impolite. The corpus, if natural, will
be so wildly skewed that the description would be no more than
a mere list. (Chomsky, 1962)
I have sometimes been asked why, in the unsupportive linguistic
environment at the time, I chose to become "a corpus linguist".
One reason for my choice was no doubt that, in the long Scandinavian
philological tradition in English studies, the text was central.
Another reason was of course that, to a non-native speaker of the
language, the armchair approach of introspection is effectively
ruled out. This may help to explain why certain parts outside the
Anglo-Saxon world, such as northern Europe, were early strongholds
of corpus linguistics. Yet the word "corpus" was not a common term
in the early days of the Survey of English Usage. In his plan for
the Survey, Randolph Quirk (1968) talks instead about "Descriptive
Register", "primary material" and "texts". I recall one discussion
over morning coffee in the UCL common room about the correct plural
of corpus: should it be corpuses or corpora?
The session reached an abrupt impasse when somebody suggested: "I
think it's corpi."
While we didn't buy Chomsky's ideas wholesale, they nevertheless
inspired us to undertake some related research. One basic concept
in his theory was grammaticality: "the fundamental aim" of a grammar,
he wrote, is to account for "all and only the grammatical sentences
of a language" (Chomsky, 1957). To us on the Survey, surrounded
as we were by masses of real language data, both spoken and written,
drawing the line between grammatical and ungrammatical sentences
seemed a huge problem. You will realize that our detailed analysis
of a corpus of real-life language was very much swimming against
the tide - there might indeed have been moments when you, being
named a corpus linguist, felt like discovering your name on the
passenger list for the Titanic.
The goal of the Survey of English Usage was to describe the grammatical
repertoire of adult educated native speakers of British English:
"their linguistic activity ranges from writing love letters or scientific
lectures to speaking upon a public rostrum or in the relaxed atmosphere
of a private dinner party. Since native speakers include lawyers,
journalists, gynaecologists, school teachers, engineers, and a host
of other specialists, it follows that [...] no grammarian can describe
adequately the grammatical and stylistic properties of the whole
repertoire from his own unsupplemented resources: 'introspection'
as the sole guiding star is clearly ruled out" (Svartvik & Quirk,
1980). Like the Brown corpus for American English, the Survey of
English Usage corpus for British English was to total one million
words collected from a planned programme of varieties but, unlike
Brown, it was to include both spoken and written material. There
was of course no question of attempting to match proportions with
the statistical distribution of the varieties according to normal
use: this would obviously have obliged us to assign over ninety-nine
per cent of the corpus to the preponderant variety of conversation
between people on an intimate footing. Instead, we saw the overwhelming
criterion as being the relative amount of data that would be required
to represent the grammatical/stylistic potential of a given variety.
The most difficult and time-consuming part of the work was the transcription
of the audio recordings, especially those of spontaneous, interactive
speech. Yet any transcription, however "delicate", can be no more
than a rough representation of its original spoken performance.
Since we believed that prosody is part of grammar, the decision was
taken to include a transcription which was sensitive to a wide range
of prosodic and paralinguistic features in the spoken realization
as heard on the recording. This system was documented by David Crystal
and Randolph Quirk (1964) and further elaborated by Crystal in his
PhD thesis (1969).
It was of course never envisaged that any corpus, necessarily
finite (at least in those pre-web days), would itself be adequate
for a comprehensive description of English grammar. From the outset,
elicitation tests with native subjects were envisaged as an essential
tool for enlarging on corpus-derived information and for investigating
features perhaps not found in the corpus at all. We undertook informant-based
studies trying to devise a technique for establishing degrees and
kinds of acceptability of different English sentences (Quirk &
Svartvik, 1966; Quirk & Greenbaum, 1970). We had found that
direct questioning (such as "Is this a grammatical sentence?") was
the least reliable technique. Our way of improving on the direct
question technique (called "judgement test") was to present informants
with sentences on which they were required to carry out one of several
operations (hence called "operation tests") which were easy to understand
and to perform. An example would be to turn the verb in the present
tense of a sentence into the past tense, and it would be left to
informants to make any consequential changes they then deemed necessary.
For instance, when asked to turn the verb in They don't want some
cake into the past tense, 24 of the 76 informants replaced some
with any, and several others showed obvious discomfort
over some, with hesitations and deletions. The results indicated
that clear-cut categorization is futile and may actually inhibit
our understanding of the nature of linguistic acceptability. In
fact, the judgements of informant groups occurred anywhere throughout
the entire range between unanimous acceptance and unanimous rejection.
Testing acceptability in this way is not basic corpus linguistics
but rather an extension of it, so as to investigate not only linguistic
performance but also linguistic attitudes, and both techniques were
part of the original Survey plan.
Most of my work consisted of grammatical analysis of texts by
marking paper slips which were placed in various grammatical categories
and stored in filing cabinets. In those days computers were rare,
expensive, unreliable and not readily accessible to ordinary folk,
but located inside glass doors and operated by engineers in white
coats. In the company of Henry Carvell, a Cambridge mathematician
turned programmer on joining the Survey, I spent many late nights
in Gordon Square to get inexpensive off-peak access to the Atlas
machine, programmed by punched paper tape. When the tape broke,
which happened not infrequently, we had to start punching a new
tape all over again! The topic was the pursuit of suitable computational
methods of analyzing real-language data, where we used a program
primarily intended for the classification of bacteria - looking
in the rear-view mirror, a pretty bold undertaking, considering the
negative attitude to taxonomy in the dominant linguistic climate
at the time. We found the concept of gradience to be true in the
data: the results gave us a few fairly distinct classes with some
partially overlapping classes (Carvell & Svartvik, 1969).
Also the topic of my (1966) PhD thesis, On Voice in the English
Verb, can be said to have been inspired by TG. In his early
theory, Chomsky derived passive sentences from kernel active sentences,
claiming that every active sentence with a transitive verb can be
transformed into a corresponding passive sentence. The idea of representing
the active-passive relation in terms of transformations was not
new - Jespersen talked about the "turning" and Poutsma of the "conversion"
of the verb form from one voice to another, but it was only in TG
theory that the use of transformation was extended, formalized and
systematically incorporated into a unified grammatical framework.
The validity of this huge claim for active-passive transformation
seemed worth investigating. After all, a pair of active-passive
sentences like We play football and Football is played
by us is enough to make any native speaker dubious. Also considerations
other than linguistic may influence informant judgements: in one
informant test, English students were asked to mark in an answer
sheet the acceptability of, among others, the pair I have a black
Bentley and A black Bentley is had by me. One student,
in addition to rejecting the passive submitted to him by a threadbare
postgraduate student speaking in a sing-song accent, wrote this
marginal comment "And I don't think you can afford a black Bentley
in the first place" - which was of course a correct observation.
Over three hundred thousand words in some coexisting varieties of
present-day English, spoken and written, were subjected to a variety
of analyses which indicated that syntactic relationships can, and
should, be expected to be multidimensional rather than binary and,
in order to find this network of relations, it was best to cast
the net wide. The conclusions state that there is in fact "a passive
scale" with a number of passive clause classes that have different
affinities with each other and with actives, including both transformational
and serial relations.
A Grammar of Contemporary English (Quirk et al., 1972) was
written by a foursome (by one reviewer referred to as "the Gang
of Four"). When work on this grammar began, all four collaborators
were on the staff of the English Department, University College
London. This association happily survived a dispersal which put
considerable distances between us (at the extreme, the 5000 miles
between Wisconsin and Lund). In those days with no personal computers,
email or faxes (in fact, even electric typewriters were thin on
the ground), physical separation made collaboration arduous and
time-consuming. Still, the book appeared in 1972. The original plan
was to write a rather unpretentious undergraduate handbook but,
with ambitions soaring as the job got underway, the printed book
came to a total of 1120 pages. No wonder our Longman editor Peggy
Drinkwater, equally professional and patient, used to refer to the
ever-growing manuscript as "the pregnant brick". So the publishers
were keen to have two smaller, more marketable grammars. As a result
the foursome split up into twosomes with the idea of writing grammars
also for language learners (Greenbaum & Quirk, 1973; Leech &
Svartvik, 1975).
Our big grammar was well received but, in the early eighties,
we felt it was time to embark on an updated edition: this culmination
of our joint work resulted in a grammar that is considerably larger
and richer, A Comprehensive Grammar of the English Language
(Quirk et al., 1985). Contacts with international scholars have always
been important to us, and in the preparation of this book we enjoyed
welcome expert advice from, among others, two prominent American
linguists: Dwight Bolinger and John Algeo. Apart from good reviews
from colleagues, the grammar earned the distinction of being awarded
"First Prize in the English-Speaking Union's Duke of Edinburgh English
Language Competition", which all four of us received at Buckingham
Palace from the hands of HRH Prince Philip. An amusing episode occurred
after the photo session at the Palace. A couple of weeks later,
the publishers wrote to say that no picture had emerged since the photographer
had somehow managed to insert the film incorrectly into the camera.
However, Prince Philip, finding a free spot on a December morning
in his diary, had kindly agreed to pose with the Gang of Four for
a second photo op so, Longman asked, "Could I manage to fly to London
on that date?" Yes, I could. What our host said to the photographer
is not fit to print in the European English Messenger.
The grammarian is beset with a number of problems. One is the
question of descriptive adequacy, as indicated in the opening lines
of a review which appeared in The Times:
Writing a grammar of a living language is as muddy an undertaking
as mapping a river. By the time you have finished, the rain has
fallen, the water has moved on, the banks have crumbled, the silt
has risen. With English having become the world language, in silt
and spate with hundreds of different grammars, the project of making
a comprehensive grammar of it is as Quixotic as trying to chart
the Atlantic precisely.
It's typical for some reviewers to focus on the changing language
rather than general descriptive problems. It's not the grammar of
English that has changed a lot - for instance, pronouns have largely
stayed the same for over four hundred years. The grammarian's real
problem is rather choosing adequate descriptive categories, finding
data and presenting them in an appropriate form of organization.
Another problem for the grammarian is of course to find an audience
for the book. The Times reviewer, Philip Howard (1985), concludes
by saying:
It is a prodigious undertaking. It is just a bit difficult to
see who it is for, other than other writers of English grammars.
You would be ill-advised to try to learn English from it.
Here Philip Howard was right. A pedagogical grammar has to be different
from a descriptive grammar.
Public attitudes to grammar are interesting and unpredictable.
One particularly spontaneous and illuminating reaction to grammar
from a native speaker occurred when I was drafting my part of the
Communicative Grammar. Being a keen sailor and boat lover,
I found myself sitting on the deck of the Queen Elizabeth en
route from Southampton to New York. Difficult as it was in the rough
weather, I was trying to keep my yellow notepad sheets from falling
overboard. An American lady sitting in a deck chair next to me,
clutching a highball (as those tall drinks used to be called in
those days), said:
"May I ask what are you writing, I'm just dying to find out?"
"A grammar" I said.
No other reply could probably have startled my companion as much.
After taking a big gulp from her highball she asked:
"A grammar - you mean to say you are a grammarian?"
"Yes, ma'am" I truthfully replied, realizing that, at this stage
in our encounter, it was too late to cover up what was after all
my academic and economic lifeline. I can still remember the exact
words of her succinct comment, drowning both the din of the engine
and the roar of the Atlantic:
"Gee, you've made my day, I've met a grammarian!"
Even now, thirty years after the event, I'm still not sure how
to interpret this reaction, but I fear it was not meant to be a
particularly flattering remark. I suspect that, to my deck chair
companion, "grammar" was a manual giving advice on how to avoid
"bad language": not mixing up imply and infer, disinterested
and uninterested, who and whom, and of course
how to avoid, at all cost, the imminent dangers of the passive voice,
the dangling participle and the split infinitive.
This little story reflects, I think, the discrepancy between the
popular and scholarly notions of what "grammar" is, or should be,
and the widespread distrust of professional statements based on
documented language use. While there is considerable public interest
in questions of usage, it seems hard work for linguists to convince
the public of the validity of their advice, even when supported
by actual usage, and to bring home the notion that grammar is not
synonymous with "linguistic etiquette". Also, many of the practitioners
who give advice on usage lack both linguistic competence and real
data, as Dwight Bolinger (1980) has pointed out:
In language there are no licensed practitioners, but the woods
are full of midwives, herbalists, colonic irrigationists, bonesetters,
and general-purpose witch doctors, some abysmally ignorant, others
with a rich fund of practical knowledge ... They require our attention
not only because they fill a lack but because they are almost
the only people who make the news when language begins to cause
trouble and someone must answer the cry for help. Sometimes their
advice is sound. Sometimes it is worthless, but still it is sought
because no one knows where else to turn.
In the mid-seventies, the London-Lund Corpus project was launched,
with the aim of making the spoken part of the Survey of English
Usage corpus available in electronic form. Thanks to a generous
grant from the Bank of Sweden Tercentenary Foundation it was possible
to employ a group of postgraduate students and get secretarial help
to transfer the spoken material typed on paper slips in London to
electronic form in Lund - a vast undertaking, considering the size
of the material and the problem of finding ways of representing
the detailed prosodic transcription in digital form. In 1980 we
published a printed book, A Corpus of English Conversation,
including thirty-four spoken texts from the magnetic tape (Svartvik
& Quirk, 1980). Later, the corpus became available on CD-ROM - still one of the largest and most widely used corpora of spoken
English, not least because it's prosodically annotated. The detailed
annotation has facilitated numerous studies of lexis, grammar and,
especially, the unique structure of spoken discourse. Under the
present director of the Survey of English Usage, Bas Aarts, the
corpus has recently been enhanced by the addition of wordclass tags
using the ICE-GB scheme. In addition, the Survey has plans to digitize
the original sound recordings to be supplied as a new resource.
Backtracking to the mid-seventies when the London-Lund project
was launched, I had three main reasons for opting for research in
spoken language. First, the Brown Corpus was a resource of machine-readable
text exclusively for the medium of written language, and this was
also the case with the on-going Lancaster/Oslo-Bergen Corpus (LOB)
project, which was the first of the Brown Corpus clones and designed
to be a British counterpart of the American corpus. Furthermore,
then available grammatical descriptions of English were almost exclusively
based on written language. Yet the vast majority of English language
use takes place in the spoken channel. Second, it seemed a pity
that the unique prosodic transcriptions of the Survey of English
Usage should be restricted to the small number of research scholars
who had physical access to the filing cabinets at University College
London. Third, in the 1970s, computers were becoming more widespread
and efficient, opening up new exciting approaches to corpus-driven
research in spoken English. Today anybody anywhere in the world
with a laptop, a CD unit and some off-the-shelf software can study
selected aspects of spoken English. The final product, the London-Lund
corpus, offered at cost to all interested colleagues in all parts
of the world, was the result of research, recording, analysis and
compilation extending over many years and involving a great number
of colleagues on the Survey of English Usage at University College
London and on the Survey of Spoken English at Lund University. But
for the dedication and arduous teamwork by our students there would
be no London-Lund Corpus. Many Lund colleagues contributed to the
corpus and made extensive use of it - including Karin Aijmer (now
professor at Göteborg University), Bengt Altenberg (later professor
at Lund), Anna Brita Stenström (later professor at Bergen University),
and Gunnel Tottie (later professor at the University of Zurich).
References
Dwight Bolinger, Language - The Loaded Weapon, The Use & Abuse
of Language Today. London: Longman (1980).
Henry T. Carvell & Jan Svartvik, Computational Experiments in
Grammatical Classification. Janua Linguarum, Series Minor 63.
The Hague: Mouton (1969).
Noam Chomsky, paper given at Third Texas Conference on
Problems of Linguistic Analysis in English, 1958, p. 159. Austin:
University of Texas (1962).
Noam Chomsky, Syntactic Structures. The Hague: Mouton (1957).
David Crystal, Prosodic Systems and Intonation in English.
Cambridge: Cambridge University Press (1969).
David Crystal & Randolph Quirk, Systems of Prosodic and Paralinguistic
Features in English. The Hague: Mouton (1964).
Sidney Greenbaum & Randolph Quirk, A Student's Grammar of the
English Language. Longman (1973); in the US: A Concise Grammar
of Contemporary English. New York: Harcourt Brace Jovanovich.
Philip Howard, Review of CGEL in The Times (23 May 1985).
Geoffrey Leech & Jan Svartvik, A Communicative Grammar of English.
London: Longman (1975).
Randolph Quirk, "Relative clauses in educated spoken English",
English Studies 38, pp. 97-109 (1957).
Randolph Quirk, "Towards a description of English usage". Transactions
of the Philological Society, pp. 40-61 (1960); reprinted
as "The survey of English usage", pp. 70-87 in Essays on the
English Language, Medieval and Modern. London: Longman (1968).
Randolph Quirk, A. P. Duckworth, Jan P. L. Rusiecki, Jan Svartvik
& A. J. T. Colin, "Studies in the correspondence of prosodic to
grammatical features in English", Proceedings of the Ninth International
Congress of Linguists, pp. 679-691. The Hague: Mouton (1964).
Randolph Quirk & Sidney Greenbaum, Elicitation Experiments
in English: Linguistic Studies in Use and Attitude. London:
Longman (1970).
Randolph Quirk, Sidney Greenbaum, Geoffrey Leech & Jan Svartvik,
A Grammar of Contemporary English. London: Longman (1972).
Randolph Quirk, Sidney Greenbaum, Geoffrey Leech & Jan Svartvik,
A Comprehensive Grammar of the English Language. London: Longman
(1985).
Randolph Quirk & Jan Svartvik, Investigating Linguistic
Acceptability. Janua Linguarum, Series Minor 54. The Hague:
Mouton (1966).
Jan Svartvik, On Voice in the English Verb, Janua Linguarum,
Series Practica 63. The Hague: Mouton (1966).
Jan Svartvik & Randolph Quirk, A Corpus of English Conversation,
Lund Studies in English 56. Lund: Lund University Press (1980).
©Jan Svartvik. Edited extract from 'A Life in Linguistics' which
first appeared in The European English Messenger, volume
14.1, 2005, 34-44. Reprinted by kind permission of the author.