
Principles of Natural Language, Logic and Statistics

We conduct research on mathematical models of natural language that take both logical structure and statistical data into account. The models are applied to textual understanding in a range of domains from human parsing processes to media recommendations.

[Image: two trees resembling faces, with birds flying towards each other; an abstract representation of language]

Research activities

Like nature, natural language abides by a set of logical rules, and its behaviour can be described statistically. We work on unified models that bring the two together. Our aim is to lead the development of next-generation Artificial Intelligence systems that are compositional and transparent.

Historically, logic and statistics have been studied in different areas of mathematics, pure and applied respectively. The logical models we consider lend themselves to statistical representations. Prominent examples are substructural logics and their computational variants, such as the Lambek Calculus, the modal Lambek Calculus and Combinatory Categorial Grammar (CCG). We also consider higher-order sheaf-theoretic semantics of these logics.
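As an illustration of the kind of typed analysis these calculi support (our own example, not drawn from the lab's materials), the Lambek Calculus assigns directional types to words and derives the sentence type s by eliminating slashes; for a transitive sentence such as “Alice likes Bob”:

```latex
% Lambek Calculus typing of "Alice likes Bob":
%   Alice : np,   likes : (np\s)/np,   Bob : np
\[
  np,\; (np\backslash s)/np,\; np \;\vdash\; s
\]
% The verb first consumes its object on the right, then its subject on the left:
\[
  \big((np\backslash s)/np\big)\cdot np \;\to\; np\backslash s,
  \qquad
  np\cdot(np\backslash s) \;\to\; s.
\]
```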

The statistical models we consider use vectors, matrices and tensors as learning vessels. Matrices and tensors are natural inhabitants of quantum mechanics. The parameters of our models can thus also be learnt via quantum simulations and on quantum computing devices.
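To make the tensor picture concrete, here is a minimal sketch (toy dimensions and random values, not the lab's code) of how a transitive sentence is composed in compositional distributional semantics: nouns are vectors, a transitive verb is a third-order tensor, and the sentence meaning is obtained by tensor contraction.

```python
import numpy as np

rng = np.random.default_rng(0)

N, S = 4, 3  # toy dimensions for the noun space and the sentence space

# Toy word representations: nouns are vectors in the noun space, and a
# transitive verb is a third-order tensor in N (x) S (x) N.
alice = rng.random(N)
bob = rng.random(N)
likes = rng.random((N, S, N))

# Compose "Alice likes Bob": contract the verb tensor with the subject on
# its first index and with the object on its last index, leaving a vector
# in the sentence space.
sentence = np.einsum("i,isj,j->s", alice, likes, bob)

print(sentence.shape)  # (3,): the sentence meaning as a vector in S
```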

We model a variety of real-world problems, from lexical semantics, to paraphrasing and inference, to human parsing processes and textual understanding. Some of the challenges we have considered are garden path phenomena in human parsing (e.g. “The horse raced past the barn fell”), the Winograd Schema Challenge as a test of machine intelligence, and media recommendations.

The Principles of Natural Language, Logic and Statistics Lab is supported by a Royal Academy of Engineering Research Chair on “Engineered mathematics for modelling typed structures”, awarded to Professor Mehrnoosh Sadrzadeh in March 2022.

People and Projects

  • Compositional distributional audio representations, PhD student: Saba Nazir
  • Multimodal similarity-based media recommendations, PhD students: Taner Cagali and Saba Nazir
  • Contextuality and sheaf theory in quantum mechanics and in natural language, PhD students: Lo Ian Kin and Daphne Wang
  • Large-scale quantum variational learning for pronoun resolution, PhD student: Hadi Wazni
  • Vector space semantics for the modal Lambek Calculus and applications to discourse, PhD student: Lachlan McPheat

Collaborations

Events

Projects for prospective MRes and PhD students

Logic

  • Complete vector space and probabilistic semantics (see the interpretation sketch after this list) for:
    • Lambek Calculus
    • Modal Lambek Calculus
    • Fuzzy logic
    • Separation logic
  • Weighted relational semantics for modal Lambek Calculus
  • Relational algebras for modal Lambek Calculus
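For orientation on the first item, a minimal sketch of the standard vector space interpretation of Lambek types, as it appears in the compositional distributional semantics literature (our illustration, not a specification of the intended semantics):

```latex
% Atomic types are interpreted as finite-dimensional vector spaces,
% and the two slash types as tensor products of the interpretations.
\[
  \llbracket np \rrbracket = N, \qquad \llbracket s \rrbracket = S,
\]
\[
  \llbracket A \backslash B \rrbracket \;=\; \llbracket A \rrbracket \otimes \llbracket B \rrbracket,
  \qquad
  \llbracket B / A \rrbracket \;=\; \llbracket B \rrbracket \otimes \llbracket A \rrbracket.
\]
% A transitive verb of type (np\s)/np is therefore a tensor in N (x) S (x) N,
% matching the third-order verb tensor in the composition sketch above.
```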

Quantum

  • Circuit-based text learning
  • Grover’s algorithm for recommendation generation
  • Encoding classical word embeddings into quantum circuits (a minimal encoding sketch follows this list)
  • Non-local games and linguistic protocols
  • Quantum contextuality in large language models
  • Classical vs quantum compositional distributional semantics (DisCoCat)
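As a concrete starting point for the embedding-encoding project above, here is a minimal, framework-free sketch of amplitude encoding (our own NumPy illustration; a real project would prepare the state with a quantum SDK and parametrised circuits), one common way to load a classical word embedding into the state of a quantum register:

```python
import numpy as np

def amplitude_encode(embedding: np.ndarray) -> np.ndarray:
    """Encode a classical word embedding as the amplitudes of a quantum state.

    The embedding is padded to the next power of two and L2-normalised, so
    the squared amplitudes form a probability distribution over the
    computational-basis states of a register of log2(dim) qubits.
    """
    dim = 1 << (len(embedding) - 1).bit_length()  # next power of two
    state = np.zeros(dim)
    state[: len(embedding)] = embedding
    norm = np.linalg.norm(state)
    if norm == 0:
        raise ValueError("cannot encode the zero vector")
    return state / norm

# Toy 300-dimensional "embedding"; in practice this would come from a
# pre-trained model such as word2vec or GloVe.
vec = np.random.default_rng(1).standard_normal(300)
state = amplitude_encode(vec)

n_qubits = int(np.log2(len(state)))
print(n_qubits, np.isclose(np.sum(state ** 2), 1.0))  # 9 qubits, normalised
```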

Events and Sheaves

Sheaf-theoretic models of natural language

  • Lexical ambiguities, polysemy/homonymy
  • Discourse ambiguities
  • Grammatical ambiguities
    • Local ambiguities in garden path sentences
    • Global spurious ambiguities
  • Combinations of the above