Principles of Natural Language, Logic and Statistics
Research activities
Like nature, natural language abides by a set of logical rules, and its behaviour can be described by statistics. We work on unified models that bring the two together. Our aim is to lead the development of next-generation Artificial Intelligence systems that are compositional and transparent.
Historically, logic and statistics have been studied in different areas of mathematics, known as pure and applied. The logical models we consider lend themselves to statistical representations. Predominant examples are substructural logics and their computational variants, such as the Lambek Calculus, the modal Lambek Calculus and Combinatory Categorial Grammar (CCG). We also consider higher-order, sheaf-theoretic semantics of these systems.
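To give a flavour of the type-logical analysis involved, here is a textbook Lambek Calculus typing of a simple transitive sentence; the example sentence and the atomic types np (noun phrase) and s (sentence) are the standard illustrative ones, not tied to any particular project of ours:

```latex
\[
  \text{Alice} : np \qquad \text{likes} : (np\backslash s)/np \qquad \text{Bob} : np
\]
\[
  np \otimes \bigl((np\backslash s)/np\bigr) \otimes np \;\vdash\; s,
  \quad\text{via}\quad
  \bigl((np\backslash s)/np\bigr) \otimes np \vdash np\backslash s
  \ \text{ and }\
  np \otimes (np\backslash s) \vdash s .
\]
```

The directional implications \ and / record word order, which is exactly the substructural, resource-sensitive behaviour that distinguishes these calculi from classical logic.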
The statistical models we consider use vectors, matrices and tensors as learning vessels. Matrices and tensors are natural inhabitants of quantum mechanics, so the parameters of our models can also be learnt via quantum simulations and on quantum computing devices.
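As a minimal illustration of the tensor-based recipe, the following sketch composes a subject, a transitive verb and an object into a sentence vector; the dimensions are toy-sized and the random arrays stand in for parameters that would in practice be learnt from data:

```python
import numpy as np

d = 4                                  # toy dimension for the noun space N
rng = np.random.default_rng(0)

# Nouns live in N; a transitive verb is a tensor in N (x) S (x) N,
# with the sentence space S taken to have the same dimension d for simplicity.
alice = rng.random(d)                  # subject vector
bob   = rng.random(d)                  # object vector
likes = rng.random((d, d, d))          # verb tensor, indices (subject, sentence, object)

# Composition: contract the verb tensor with the subject and object vectors,
# mirroring the grammatical reduction np . (np\s)/np . np |- s above.
sentence = np.einsum('i,isj,j->s', alice, likes, bob)
print(sentence)                        # a vector in the sentence space S
```

The contraction pattern is dictated by the grammatical type of the verb; replacing the random arrays with learnt ones is what turns this sketch into a model.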
We model a variety of real-world problems, from lexical semantics, to paraphrasing and inference, to human parsing processes and textual understanding. Among the challenges we have considered are garden-path phenomena in human parsing, the Winograd Schema Challenge as a test of machine intelligence, and media recommendations.
The Principles of Natural Language, Logic and Statistics Lab is supported by a Research Chair of the Royal Academy of Engineering on “Engineered mathematics for modelling typed structures”, awarded to Professor Mehrnoosh Sadrzadeh in March 2022.
People and Projects
- Compositional distributional audio representations, PhD student: Saba Nazir
- Multimodal similarity-based media recommendations, PhD students: Taner Cagali, Saba Nazir
- Contextuality and sheaf theory in quantum mechanics and in natural language, PhD students: Lo Ian Kin and Daphne Wang
- Large-scale quantum variational learning for pronoun resolution, PhD student: Hadi Wazni
- Vector space semantics for modal Lambek Calculus and applications to discourse, PhD student: Lachlan McPheat
Collaborations
- Quantum natural language processing, with Quantinuum (Hadi Wazni)
- Quantum contextuality in natural language data with Quandela
- Similarity-based recommendations with the BBC
- Health applications with Cavenwell Industrial AI
- Modal substructural logics and applications, Horizon 2020 MOSAIC-RISE with Michael Moortgat (Utrecht) and Gijs Wijnholds (Leiden)
- Linguistic Matrix Theory, with Sanjaye Ramgoolam (QMUL)
Events
- Weekly Seminars, since November 2022
- WiL Workshop@CADE, Rome, July 2023
- DCM Workshop@CADE, Rome, July 2023
- QPL Conference, Paris, July 2023
- MOSAIC Workshop, Vienna, September 2023
- Pros&Comps Workshop@ESSLLI, Ljubljana, August 2023
- AMSLO Workshop@ESSLLI, Ljubljana, August 2023
- Natural Language Syntax & Statistical Semantics Course@ESSLLI, Ljubljana, July 2023
- Hopper 2023, London, May 2023
- QNLP 2023, Gothenburg, May 2023
Projects for prospective MRes and PhD students
Logic
- Complete vector space and probabilistic semantics for:
- Lambek Calculus
- Modal Lambek Calculus (residuation laws recalled after this list)
- Fuzzy logic
- Separation logic
- Weighted relational semantics for modal Lambek Calculus
- Relational algebras for modal Lambek Calculus
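For reference, the Lambek Calculus and its modal extension are governed by residuation laws, which any of the semantics proposed above would be expected to validate; the presentation below is the standard one, not tied to a particular project:

```latex
\[
  A \otimes B \vdash C
  \;\Longleftrightarrow\;
  A \vdash C / B
  \;\Longleftrightarrow\;
  B \vdash A \backslash C
  \qquad \text{(binary residuation)}
\]
\[
  \Diamond A \vdash B
  \;\Longleftrightarrow\;
  A \vdash \Box B
  \qquad \text{(unary modalities of the modal extension)}
\]
```

Completeness, as asked for in the first item of this list, then means that the semantics validates exactly the derivable sequents.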
Quantum
- Circuit-based text learning
- Grover’s algorithm for recommendation generation
- Encoding classical word embeddings into quantum circuits (see the sketch after this list)
- Non-local games and linguistic protocols
- Quantum contextuality in large language models
- Classical vs quantum compositional distributional semantics (DisCoCat)
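A common entry point to the embedding-encoding project above is angle encoding, where each component of a classical embedding parameterises a single-qubit rotation. The sketch below simulates this state preparation with plain numpy; the function names, the choice of RY rotations and the scaling are illustrative assumptions rather than a fixed recipe:

```python
import numpy as np

def ry(theta):
    """Single-qubit Y-rotation gate RY(theta)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s],
                     [s,  c]])

def angle_encode(embedding):
    """Encode a real vector (x1, ..., xn) as the n-qubit state
    RY(x1)|0> (x) ... (x) RY(xn)|0>, returned as its amplitude vector."""
    state = np.array([1.0])
    for x in embedding:
        qubit = ry(x) @ np.array([1.0, 0.0])   # rotate |0> by the component
        state = np.kron(state, qubit)          # tensor the qubits together
    return state

# Toy 3-dimensional "word embedding", rescaled into rotation angles
word_vec = np.array([0.2, 1.1, 0.7]) * np.pi
print(angle_encode(word_vec))                  # amplitude vector of length 2**3
```

On hardware or a circuit simulator the same rotations would be laid out as gates and typically followed by entangling layers; the simulation above only fixes the state-preparation step.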
Events and Sheaves
Sheaf-theoretic models of natural language (a minimal contextuality check is sketched after this list):
- Lexical ambiguities, polysemy/homonymy
- Discourse ambiguities
- Grammatical ambiguities
- Local ambiguities in garden path sentences
- Global spurious ambiguities
- Combinations of the above
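Across all of the items above, the recurring sheaf-theoretic question is whether data gathered over overlapping contexts (readings of an ambiguous word, alternative parses, discourse referents) can be glued into a single global distribution. The following is a small feasibility check in the Abramsky–Brandenburger style; the variable names, the Bell-type context structure and the toy data are purely illustrative:

```python
import numpy as np
from itertools import product
from scipy.optimize import linprog

# A toy measurement scenario: four binary "questions" (e.g. readings of two
# ambiguous words, each probed in two sentential contexts), grouped into
# Bell-type contexts of jointly observable questions.
variables = ['a0', 'a1', 'b0', 'b1']
contexts = [('a0', 'b0'), ('a0', 'b1'), ('a1', 'b0'), ('a1', 'b1')]

def is_noncontextual(empirical):
    """empirical[ctx] maps outcome tuples (one 0/1 value per variable in ctx)
    to probabilities.  Returns True iff some global distribution over
    assignments to all variables marginalises to every context's data."""
    global_assignments = list(product([0, 1], repeat=len(variables)))
    rows, rhs = [], []
    for ctx in contexts:
        idx = [variables.index(v) for v in ctx]
        for outcome in product([0, 1], repeat=len(ctx)):
            # P(global assignment restricts to `outcome` on ctx) must match the data
            rows.append([1.0 if tuple(g[i] for i in idx) == outcome else 0.0
                         for g in global_assignments])
            rhs.append(empirical[ctx].get(outcome, 0.0))
    res = linprog(c=np.zeros(len(global_assignments)),
                  A_eq=np.array(rows), b_eq=np.array(rhs),
                  bounds=(0, 1), method='highs')
    return res.success

# Perfectly correlated data in every context: reproducible by a global
# distribution (half "all zeros", half "all ones"), hence non-contextual.
perfectly_correlated = {ctx: {(0, 0): 0.5, (1, 1): 0.5} for ctx in contexts}
print(is_noncontextual(perfectly_correlated))   # True
```

An empirical model for which this linear program is infeasible is contextual; whether naturally occurring ambiguity data exhibits such contextuality is the kind of question studied in these projects.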