The Summer School in English Corpus Linguistics is a three-day introduction to corpus linguistics.
You'll gain experience with a state-of-the-art corpus and an understanding of basic statistical ideas.
It's aimed at students of language and linguistics and teachers of English.
You'll need a basic knowledge of English linguistics and grammar.
This course is taught by staff at the Survey of English Usage at UCL.
Over the three days, you'll learn about:
- the scope of corpus linguistics, and how we can use it to study the English language
- key issues in corpus linguistics methodology
- how to use corpora to analyse issues in syntax and semantics
- basic elements of statistics
- how to navigate large and small corpora, particularly ICE-GB and DCPSE
Who this course is for
The summer school is for students and teachers of the English language in colleges and universities who want to acquire a knowledge of basic concepts and methodologies used in English corpus linguistics.
You'll be expected to have a basic knowledge of English linguistics and grammar at undergraduate level.
Before the course begins, you'll be given access to a reading list and set of materials on UCL's Moodle site.
Structure and teaching
This is a three-day course in which:
- mornings consist of a 'theory lecture' and a 'practical lecture'
- afternoons consist of a practical session where you're able to make the most of what you've learned
The theory session on the first day covers English grammar, the second covers corpus linguistics methodologies, and the third introduces the basic principles of statistics.
The course is practical and hands-on.
A certificate of attendance will be issued on request.
The corpora studied
You'll also learn about a wide variety of corpora. Most of the practical teaching focuses on two particular corpora, both developed at UCL. These are the British component of the International Corpus of English (ICE-GB), and the Diachronic Corpus of Present-day Spoken English (DCPSE).
These corpora consist of authentic samples of written and spoken English and are unusual in that they are fully parsed, i.e. they contain a complete grammatical tree analysis for every sentence. You'll use the state-of-the-art software developed for research with grammatical treebanks - ICECUP - to explore these rich resources.
You'll be taught statistics fundamentals from the ground up, from probability theory to distributions, confidence intervals and statistical tests. No prior knowledge of statistics is assumed.
At the end of the course, you'll have:
- acquired a basic but solid knowledge of the terminology, concepts and methodologies used in English corpus linguistics
- had practical experience working with two state-of-the-art corpora and a corpus exploration tool (ICECUP)
- gained an understanding of the breadth of corpus linguistics and its potential applications in your own projects
- learned about the fundamental concepts of inferential statistics and their practical application to corpus linguistics
Benefits of taking this course
After attending the course you'll be able to:
- confidently use the ICE-GB and DCPSE corpora in your research
- apply basic statistical procedures and, most importantly, understand the results
- understand core concepts of analysis that apply beyond the study of language, including research methods and principles of statistical inference
As a teacher, you'll be able to design a course in English corpus linguistics for use in your own institution. You'll gain an understanding of the broad range of possibilities that corpus linguistics has to offer.
You'll also have an opportunity to meet other students and teachers to discuss ideas and issues, as well as having the chance to stay in London and explore its museums and theatres.
Full programme information
View the full programme for more detailed information.
Professor Bas Aarts
Bas teaches English linguistics to undergraduate and postgraduate students at UCL. Since January 1997 he's been the Director of the Survey of English Usage (SEU) at UCL, an internationally recognised centre of excellence for research in English language and linguistics. Drawing on this research, he and his team have developed 'Englicious', an extensive online platform of original English language teaching resources closely tailored to the new 2014 UK National Curriculum, including professional development materials for teachers.
Sean Wallis
Sean is a Senior Research Fellow in corpus linguistics at the Survey of English Usage at UCL. He's the developer of the ICECUP research software, oversaw the completion of ICE-GB and DCPSE, and has written on many aspects of corpus linguistics methodology and statistics. He runs a blog on statistics, corp.ling.stats (http://corplingstats.wordpress.com), which discusses how statistics should be used for research in corpus linguistics.
Rachele de Felice
Rachele is a Senior Teaching Fellow in English linguistics in the Department of English Language and Literature at UCL. Her research focuses on corpus pragmatics, which looks at how the use of corpora can further our understanding of pragmatics and communication.
Previous participants have said:
“I am amazed by your availability and helpfulness throughout the summer school. 4 tutors for 21 students - that is a support ratio I have never had in my (university) education before. Maybe I should go to more summer schools.”
“[Deciding what was most useful for me was a] difficult choice because each session offered a link to the next (admirably cohesive) and all were very useful - perhaps I'll opt for what I was least comfortable with, statistics.”
“I'm studying statistics in my university. So I think this lecture is very useful and I want to study hard.”