Book a place
The Summer School in English Corpus Linguistics is a three-day online introduction to corpus linguistics.
You'll gain experience with a state-of-the-art corpus and an understanding of basic statistical ideas.
It's aimed at students of language and linguistics and teachers of English. You'll need a basic knowledge of English linguistics and grammar.
The course will be taught on Zoom over 3 half-days (9am to 1pm UK summer time, GMT+1), to make it accessible to students from across Asian time zones who might wish to attend.
This course is taught by staff at the Survey of English Usage at UCL.
Over the three days, you'll learn about:
- the scope of corpus linguistics, and how we can use it to study the English language
- key issues in corpus linguistics methodology
- how to use corpora to analyse issues in syntax and semantics
- basic elements of statistics
- how to navigate large and small corpora, particularly ICE-GB and DCPSE
Who this course is for
The summer school is for students and teachers of the English language in colleges and universities who want to acquire a knowledge of basic concepts and methodologies used in English corpus linguistics.
You'll be expected to have a basic knowledge of English linguistics and grammar at undergraduate level.
Before the course begins, you'll be given access to a reading list and set of materials on UCL's Moodle site.
Structure and teaching
The course will be taught on Zoom, from 9am to 1pm (UK summer time, GMT+1), over 3 consecutive days.
There will be a theory session and a practical session each day. The theory session on the first day covers English grammar, the second day's covers corpus linguistics methodologies, and the third day's introduces the basic principles of statistics.
The course is practical and hands-on.
A certificate of attendance will be issued on request.
The corpora studied
You'll also learn about a wide variety of corpora. Most of the practical teaching focuses on two particular corpora, both developed at UCL. These are the British component of the International Corpus of English (ICE-GB), and the Diachronic Corpus of Present-day Spoken English (DCPSE).
These corpora consist of authentic samples of written and spoken English and are unusual in that they are fully parsed, i.e. they contain a complete grammatical tree analysis for every sentence. You'll use the state-of-the-art software developed for research with grammatical treebanks - ICECUP - to explore these rich resources.
You'll be taught statistics fundamentals from the ground up, from probability theory to distributions, confidence intervals and statistical tests. No prior knowledge of statistics is assumed.
At the end of the course, you'll have:
- acquired a basic but solid knowledge of the terminology, concepts and methodologies used in English corpus linguistics
- had practical experience working with two state-of-the-art corpora and a corpus exploration tool (ICECUP)
- gained an understanding of the breadth of corpus linguistics and its potential applications for projects
- learned about the fundamental concepts of inferential statistics and their practical application to corpus linguistics
Benefits of taking this course
After attending the course you'll be able to:
- confidently use the ICE-GB and DCPSE corpora in your research
- apply basic statistical procedures and, most importantly, understand the results
- understand core concepts in analysis that apply beyond the study of language, including research methods and principles of statistical inference
As a teacher, you'll be able to design a course in English corpus linguistics for use in your own institution. You'll gain an understanding of the broad range of possibilities that corpus linguistics has to offer.
Costs and concessions
The standard fee is £150, or £125 for bookings made by 14 May 2023.
Full programme information
View the full programme for more detailed information.
Professor Bas Aarts
Bas teaches English linguistics to undergraduate and postgraduate students at UCL. Since January 1997 he's been the Director of the Survey of English Usage (SEU) at UCL - an internationally recognised and highly regarded centre of excellence for research in the area of English Language and Linguistics. From this research he and his team have developed 'Englicious', an extensive online platform containing original English language teaching resources closely tailored to the New 2014 UK National Curriculum, which includes professional development materials for teachers.
Sean is a Principal Research Fellow in corpus linguistics at the Survey of English Usage at UCL. He's the developer of the ICECUP research software, oversaw the completion of ICE-GB and DCPSE, and has written on many aspects of corpus linguistics methodology and statistics. He runs a blog on statistics, corp.ling.stats, which discusses how statistics should be used for research in corpus linguistics.
Beth is a Lecturer in English Linguistics, and was an editor of the recent Routledge volume Introducing Linguistics. Beth’s principal areas of research interest are sociolinguistics, including sociohistorical linguistics, discourse, and corpus methodologies. She is interested in synergising quantitative and qualitative corpus methodologies, both in terms of researching discursive constructions using corpora and in exploring language variation and change using corpus approaches.
"I am amazed by your availability and helpfulness throughout the summer school. 4 tutors for 21 students - that is a support ratio I have never had in my (university) education before. Maybe I should go to more summer schools."
"[Deciding what was most useful for me was a] difficult choice because each session offered a link to the next (admirably cohesive) and all were very useful - perhaps I'll opt for what I was least comfortable with, statistics."
"I'm studying statistics in my university. So I think this lecture is very useful and I want to study hard."
Course information last modified: 3 Apr 2023, 11:48