30 May 2024, 1:00 pm–3:00 pm

Linguistics seminar

Entity tracking in Language Models - EVENT POSTPONED

Alina Konradt


4th floor Seminar Room
66-72 Gower Street
United Kingdom

Title: Entity tracking in Language Models

Abstract: Keeping track of how states and relations of entities change as a text or dialog unfolds is a key prerequisite to discourse understanding as well as other AI tasks such as planning, and yet it remains unclear to what extent pretrained language models systematically exhibit this capability. In my talk, I will first discuss the challenges that come with evaluating such general abilities in LMs, and then I will present a new evaluation task for assessing entity tracking abilities in LMs. I will then present results on GPT-3/3.5/4, Flan-T5, and Llama 2 models and discuss how pretraining on code influences entity tracking abilities. I will also show that smaller models can learn to track entities but their generalization abilities are still quite limited, and present some preliminary results from a mechanistic interpretability study on identifying the algorithm that the model implements for solving this task.

Sebastian Schuster

at UCL Linguistics

