Statistical Science


Episode 4 Transcript

Brieuc Lehmann  0:11  
You're listening to the Sample Space podcast from the Department of Statistical Science at University College London. My name is Brieuc Lehmann, and today it's my great pleasure to be talking to Professor Chris Holmes from the University of Oxford. Welcome, Chris.

Chris Holmes  0:23  
Hi. Pleasure to be here.

Brieuc Lehmann  0:26  
So full disclosure for the listeners: before I joined the department here at the start of the year, Chris was, in fact, my boss at Oxford. And actually, I still have the pleasure of working with Chris as part of a COVID statistical modelling lab that Chris heads up, which I'm hoping we'll have the time to talk about later. Despite this conflict of interest, however, I'll do my best to grill Chris with some hard-hitting questions. The main reason we're here today, though, is to talk about the fantastic seminar that Chris gave to the department about a month ago on Bayesian predictive inference. If you missed it, you can find a recording of the seminar on our YouTube channel. Before we get going on that, though, I want to address an important issue. If you don't know what Chris Holmes looks like, and you search for him on Wikipedia, you'll be confronted with a heavily tattooed gentleman holding an electric guitar and generally looking pretty gnarly. So Chris, do you in fact moonlight as the lead guitarist of the heavy metal band W.A.S.P.?

Chris Holmes  1:25  
No, unfortunately not. Might be more exciting.

Brieuc Lehmann  1:30  
Ah, that's a shame. So you're, in fact, the Professor of Biostatistics at Oxford, and you also lead the health programme at the Alan Turing Institute, and the list goes on. But if you hadn't been a statistician, what would you have been?

Chris Holmes  1:47  
Yeah, that's a good question. I don't quite know. I mean, outside of academia, I'd say I'd have done something in the creative industries. That's what I'd like to think I would do. Maybe, you know, a heavy metal rock band; you'd have to have a pseudonym, so you didn't mix me up with the other Chris Holmes. Actually, maybe a teacher. In the public sector, definitely. I think that would be something.

Brieuc Lehmann  2:17  
Sounds good. But you are, in fact, a statistician. Could you tell us a bit about your academic journey to where you are today?

Chris Holmes  2:26  
Yeah. So I guess what's of interest is that I was working in industry, after I did an MSc in artificial intelligence back in the first wave, or the second wave, when it was super exciting. There were expert systems and, you know, lots of grand challenges. So I did a master's in both natural and artificial intelligence, and that just got me super interested in this idea of learning from data. And then I worked in industry, in scientific computing, in a small software house just outside Bristol that wrote the control systems for the satellite for the European Space Programme. They also wrote the SCADA (supervisory control and data acquisition) systems for major utilities, and I was working on updating algorithms, you know, in terms of nowcasting. And that just got me interested in Bayesian inference, actually, because it's a very natural framework to update as new information becomes available. From that, I just got more and more interested in the topic, so I went to do my PhD at Imperial College London. I think I was 28 or maybe 29 when I started my PhD, and I just loved it. I never looked back.

Brieuc Lehmann  3:54  
Who was that with?

Chris Holmes  3:55  
So with Bani Mallick, who was my first supervisor, but he left after that, hopefully not due to me. He left maybe nine months, or six or seven months, into my PhD, but I was pretty self-sufficient, being the age I was and also given that I'd been working in industry. So I kind of charted my course from there. But it was a fantastic environment at that time. There were people like Adrian Smith in the group, and David Stephens, who's now at McGill, and Stephen Walker, and I've kept in touch with them. Adrian went on to be my postdoctoral supervisor, because after finishing my PhD at Imperial I stayed on as a postdoc, and he's now my boss at the Alan Turing Institute again, so things go around. So I stayed at Imperial, did my lectureship, and then moved to Oxford 17 years ago. It was Peter Donnelly, really, who instigated that move.

Brieuc Lehmann  5:05  
Did you know that Adrian Smith is also the lead guitarist of Iron Maiden?

Chris Holmes  5:11  
I can definitely see there's a theme. What are the chances of that?

Brieuc Lehmann  5:17  
I googled it earlier. I googled Adrian Smith and Chris Holmes, and there's actually a picture of the two guitarists that we could recreate with the statisticians, Adrian Smith and Chris Holmes. So you mentioned Stephen Walker, who's one of the co-authors on the martingale posterior paper, the paper behind the seminar that you gave at UCL about a month ago. Could you talk us through the main ideas around the paper and how it came about?

Chris Holmes  5:50  
Yeah, sure. So I've been a long-term collaborator and friend of Stephen's. The background to the martingale posterior paper and predictive inference was that Stephen and myself and students had been working on this idea of general Bayesian updating, which allows one to replace the likelihood function with a loss function targeting a particular parameter of interest. And through that we got very interested in the Bayesian bootstrap, originally as a way to calibrate general Bayesian updating. Because the problem, as soon as you move away from a likelihood function, is that you don't know how much to update. I've got this new piece of information; it's a datum. And what Bayes' theorem tells you, which is beautiful, is exactly how much information is contained in this new datum relative to what you've got. That's the coherency property of having a joint probabilistic structure. Now, if you forsake that and I get a new datum, how much information does that new datum carry relative to what I already know? You don't have that if you go outside of full probability modelling. And so we were thinking carefully about how to do calibration, and the Bayesian bootstrap was a wonderful tool for us, because it allowed us to do these kinds of randomised loss functions. We were exploring hierarchical Bayesian bootstraps, actually, at the time. And we were struggling a bit, you know, with this randomised-weight viewpoint, which is how I think most people see the Bayesian bootstrap, as randomised weights. And it was really when we sat back a little bit and looked at the Pólya urn representation of the Bayesian bootstrap.
And it's unfortunate that that's not better known, because suddenly this thing which looks rather odd, these randomised weights, these uniform Dirichlet weights, becomes really intuitive, which is the Bayesian bootstrap as a reinforced Pólya urn representation. You start off with the empirical CDF, the empirical cumulative distribution function, and you just predict a new point and add it back into the pot to update, and then you predict another point and add it back into the pot, or the urn, to update once more. And in the limit, you get back the randomised weights. Once we were there, we were almost at the martingale posterior, which was: hold on a second, this is kind of an interesting Bayesian approach, which starts off with a predictive, the empirical CDF, and just updates. And you run that through to a limit and you get something really interesting, because in the limit, once I've got a near-infinite number of data that have been drawn from the urn, you can calculate any statistic. And that's the generality of the bootstrap: for any target of interest, you can just bootstrap it; you don't need a full probabilistic model. And I guess the breakthrough came as well from Stephen, who noted this connection between Doob's paper on the application of the theory of martingales and a conventional Bayesian approach. What Doob shows is that this thing we were looking at in the bootstrap, where you start off with a predictive model given your data, you make a prediction, you put that back into the pot and you update, and you keep putting that back into the pot, has a precise Bayesian representation in conventional inference. And then we were away.
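
A minimal sketch of the urn scheme described in this answer, assuming NumPy and toy data; the function names and the settings `n_forward` and `n_draws` are illustrative choices for this sketch, not taken from the paper. Each draw starts from the observed sample, repeatedly predicts a point from the current urn and adds it back, then evaluates a statistic on the completed urn. For the mean, the spread of these draws can be compared with the randomised Dirichlet-weight view of the Bayesian bootstrap.

```python
import numpy as np


def polya_urn_forward(data, n_forward, rng):
    """Extend the observed sample by n_forward draws from a Polya urn.

    Each step predicts a new point by drawing uniformly from the current urn
    (the empirical distribution so far) and then adds that point back in.
    """
    urn = list(data)
    for _ in range(n_forward):
        new_point = urn[rng.integers(len(urn))]  # predict from the current urn
        urn.append(new_point)                    # add it back to update
    return np.asarray(urn)


def urn_posterior_draws(data, statistic, n_draws=200, n_forward=1000, seed=0):
    """Repeat the forward simulation and evaluate `statistic` on each completed urn."""
    rng = np.random.default_rng(seed)
    return np.array(
        [statistic(polya_urn_forward(data, n_forward, rng)) for _ in range(n_draws)]
    )


def dirichlet_weight_draws(data, n_draws=200, seed=0):
    """Randomised-weight view for the mean: Dirichlet(1, ..., 1) weights on the data."""
    rng = np.random.default_rng(seed)
    weights = rng.dirichlet(np.ones(len(data)), size=n_draws)
    return weights @ np.asarray(data, dtype=float)


data = np.arange(1.0, 21.0)  # toy data, purely illustrative
urn_means = urn_posterior_draws(data, np.mean)
weight_means = dirichlet_weight_draws(data)
urn_medians = urn_posterior_draws(data, np.median)  # any statistic works the same way

# The two spreads should be close when n_forward is large relative to len(data).
print(urn_means.std(), weight_means.std())
```

The forward simulation is truncated at a finite `n_forward`, so the urn draws only approximate the limiting randomised weights; increasing `n_forward` tightens the agreement at the cost of run time.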

Brieuc Lehmann  9:45  
It's a fascinating background, right, because it draws from a variety of fundamental papers over the course of, you know, almost the last 100 years, I guess, going back to, like you say, Doob. But I think there are some important ideas from de Finetti in there as well. And so it's interesting how it brings together these different ideas that have independently played very important roles in Bayesian statistics. What do you see as the challenges ahead? Are there any, you know, outstanding problems around the martingale posterior paper that you'd be keen to explore a bit further?

Chris Holmes  10:26  
Yeah, I mean, at the heart of that paper is this idea that you forsake the safety of Bayesian inference, which gives you, you know, a lot of inbuilt coherency. If you set out a Bayesian model with a likelihood and a prior, you're really protected at that point, because you know that you're going to be coherent, you know that the update is going to be valid, and your predictive inferences are internally coherent. When you step out of that, I mean, again, at the heart of the martingale posterior there's a couple of things which are unusual for Bayesians. First of all, you build your model given all available information at the time of modelling, and so that includes the data that you've got, which means you don't start off with a prior and then find you're locked into that model forever. Now, of course, you know, Bayesians, quite rightly, and I would argue that you should, go through careful model criticism. But for martingale posterior predictive inference, you start from the current data point, and you just need to build a predictive and an update, and then keep running that forward. So you keep simulating and then conditioning on the output of the simulation. So it's this recursive feedback into the model. What that means is that there's two things. First of all, you need to ensure that if I adopt a predictive model, you don't somehow add information artificially as you go forward. And that's the martingale condition, essentially: it says that you shouldn't be able to create information out of nothing. So checking that you've got a martingale condition on your predictive is important, and that has to be done at the moment on a case-by-case basis.
The second thing is how much information is in the update, this problem that I've alluded to that we were tackling right at the beginning. If I've got a predictive model, and you want to go outside of Bayes, because that's why you might want to use this approach, then if I give you a new datum, how should that adjust your beliefs? And so that's, again, I'd say, a generic challenge

Brieuc Lehmann  12:48  
Around how to kind of calibrate the updates.

Chris Holmes  12:52  
Yeah, exactly. This new data that you've simulated from the model: how much did that move you, or change your beliefs?
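
For the Pólya urn (Bayesian bootstrap) predictive described earlier, both points can be written down explicitly. This is a sketch for that one special case, with notation that is assumed here rather than stated in the conversation: $P_n$ is the empirical distribution of the $n$ points currently in the urn, $Y_{n+1}$ is the next simulated point, $\delta_y$ is a point mass at $y$, and $A$ is any event.

```latex
\begin{align*}
Y_{n+1} \mid y_{1:n} &\sim P_n, \qquad P_n = \frac{1}{n}\sum_{i=1}^{n} \delta_{y_i}, \\
P_{n+1} &= \frac{n}{n+1}\,P_n + \frac{1}{n+1}\,\delta_{Y_{n+1}}, \\
\mathbb{E}\bigl[P_{n+1}(A) \mid y_{1:n}\bigr]
  &= \frac{n}{n+1}\,P_n(A) + \frac{1}{n+1}\,P_n(A) = P_n(A).
\end{align*}
```

The last line is the martingale condition in this case: on average, simulating forward adds no information. The weight $1/(n+1)$ in the second line is what fixes how far a single new point moves the predictive; for other predictive models that weight has to be chosen and checked separately.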

Brieuc Lehmann  13:01  
It's a challenging problem, and I know lots of people are thinking about that, or at least there have been lots of people thinking about that in terms of generalised Bayesian updating, right? I guess it's a similar problem for the martingale posterior. You just touched on it, but there's an interesting distinction between statistical inference based on predictive distributions, as in the martingale posterior, as opposed to, I guess, the more traditional approach, which is to focus on the data-generating process. I wonder if you see any particular applications or fields where you feel one is more appropriate than the other.

Chris Holmes  13:42  
I mean, again, the wonderful thing with Bayesian inference, and it's both a strength and a weakness, is that it separates out the modelling part from the decision analysis. What that means is that if I'm taking a fully Bayesian approach, I build my probabilistic model, and then, formally, what questions you want to ask of that model is immaterial; any question can be asked of that model. So you just model nature's generating process and off you go. That's great, because it means that you can ask multiple questions of the model, and you can ask repeated questions of the model, and you retain coherency because you've got this joint probabilistic structure. However, the cons against that, or some of the issues with that, which people have highlighted before, people like Vapnik in statistical learning theory, is that Bayesians solve a more general problem. You know, if I'm interested in the median, say, of a population, as a non-Bayesian you can just target that: I just write down a loss function or an estimating equation, and I solve the problem that I've been posed with. Whereas the Bayesians have to solve a much more general problem; we need a full generative process for the data. And from the outside you say, well, hold on, why are you solving a more general problem? All I want to know is, what's the estimate of the median? So now the predictive approach is quite nice, because the starting point from the prediction is: given everything we know now, what's the missing information that you need to answer the question? And so again, it takes a targeted approach. And if you say to me, look, parts of the data, aspects of the data, are not of interest to answering the question, then I just have to build a predictive model for the missing information needed, and then you effectively impute the missing information.
So it's a much more targeted approach to inference. And that's where I would say there's probably the big contrast in strengths and weaknesses.
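
The remark about targeting the median directly with a loss function can be illustrated with a small sketch, assuming NumPy and made-up numbers; the helper names are hypothetical. The sample median minimises the total absolute loss, so it can be found without writing down any generative model for the data.

```python
import numpy as np


def absolute_loss(m, y):
    """Total absolute deviation of the sample y from a candidate value m."""
    return np.abs(y - m).sum()


def median_by_loss(y):
    """Return the observed value that minimises the absolute loss.

    Searching over the observed values is enough, because the minimiser of the
    absolute loss is always attained at one of them.
    """
    y = np.asarray(y, dtype=float)
    losses = [absolute_loss(m, y) for m in y]
    return y[int(np.argmin(losses))]


y = np.array([2.0, 9.0, 4.0, 7.0, 100.0])  # toy data with one outlier
print(median_by_loss(y), np.median(y))      # both give 7.0
```

Passing `np.median` as the statistic on data completed by a predictive model is then the "impute the missing information" step applied to this same target.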

Brieuc Lehmann  15:53  
That's very interesting. I'm going to make a segue now. It's, I think, just over two years to the day since the UK went into the first lockdown. And I know you've been heavily involved with a number of initiatives supporting the government's response to COVID through statistical modelling. I was wondering if you could talk for a couple of minutes about, I guess, the story of your pandemic, from an academic point of view, or not necessarily an academic research point of view.

Chris Holmes  16:24  
It's been busy. That's my number one emotion, looking back. I remember at the start, you know, obviously it was a terrible situation, but I actually thought I'd probably get a lot of reading done, because, you know, things were closing down. But it turned out to be far, far away from that. So, wishful thinking. Yeah, it was very wishful thinking. We got involved early on in two major applications through the Turing Institute. One was a project called DECOVID, which was looking to pull together electronic health records from secondary care hospitals, two major NHS hospital trusts, integrate the data using a common data dictionary, something called OMOP, and then do almost real-time analytics. That was the goal at the time. And that involved a huge amount of setting up: getting the information governance right, the trusted research environment, et cetera, and all the data wrangling. Alongside that, we also had a request from NHSX to come and help on the NHS app, which has been launched, around this digital contact tracing and what the inference issues were. So we were running those two things, you know, in parallel, and in some sense with very different outcomes. The DECOVID project proved to be much more challenging than we expected. We had a target to bring data together to start being able to do statistical inference on it. The target was about three months; I think it took us 11 months or so to actually integrate that data. The NHSX side moved at speed, and we wrote the algorithms in order to work out the distances. So effectively, we were both helping on the inference (what do you do when you get the data, and how should you set thresholds?) but also, you know, how can you calculate the distance between two mobile phones that are pinging Bluetooth signals?
So yeah, it was an interesting time. But the issue at the beginning, and there's a great article I read in The Atlantic, which is, you know, obviously a US newspaper, on the fog of the pandemic, as it was called, and it was about, you know, analogies with the fog of war, which is that the data was just patchy, and trying to get an understanding of what was happening on the ground was incredibly difficult. But, you know, we kind of got there in the end, though it did take time.

Brieuc Lehmann  19:12  
Yeah, I think it's been interesting. I guess I've been working with you on some of these projects, and something which has struck me is the difference in pace of work between, I guess, doing statistical modelling to analyse COVID data as opposed to purely academic research. In a sense, it's not really cutting corners, but you have to make decisions based on, I guess, less data than you would normally be happy to make decisions on. I think there's some interesting work coming from the Turing-RSS lab on interoperability and how we can create these new models at speed. Do you see these ideas becoming more mainstream, or do you think they're only appropriate in the context of a pandemic?

Chris Holmes  20:07  
Yeah, thank you. I think, you know, you're absolutely right. And I take my hat off to the analysts who are really working at the front line in UKHSA (it was the JBC, the Joint Biosecurity Centre, at the time), who were having to provide analysis to ministers with a turnaround of, like, a day or two days for really important decisions. And that's incredibly challenging, because you've got a deadline. That's unusual for academics; normally, you know, our research just runs at the pace that it runs, and then once it's packaged, and we're happy with it and have carefully checked it, we put it out. But here, they're asked for a very fast turnaround. Turing, in collaboration with the Royal Statistical Society, as you know, because you're a member of this, set up a research lab embedded within UKHSA, the Health Security Agency as it now is, where we were onboarded onto their systems, which is great, because we've got all the data that they have. Again, it's different to the usual academic cadence of, you know, months, possibly years: how can we turn things around in weeks, and try to bring some statistical rigour and innovation to the analysis? And it struck me, I mean, this is a well-known phrase, but it was something that Peter Donnelly used to say as well, about the great being the enemy of the good. Of course, there's a perfect, or near-perfect, statistical analysis out there that you would like to do; you simply don't have the time, and you have diminishing returns, because one striking feature of the pandemic was how quickly the questions changed. Which is how this notion of interoperability arose: UKHSA were asking us questions, and by the time we'd solved them, the questions had moved on. They were like, no, no, we don't need to know about that.
What we really want to understand is this new variant that's sweeping through, or, you know, what's the effect of schools opening up? And what we found in the lab is that we were almost chasing our tails. So it gave us a chance to step back a little bit and say, what are the principles here? And the principles are: you're asking lots of questions of the same process. There's a single process, the pandemic, and there's lots of statistical questions being asked of that process. And so that got us thinking about how we can improve interoperability and the recyclability, if that's a word, of our models. You know, what are the common data formats that are going into these models? There was data coming out of test and trace, data coming out of, you know, wastewater, and then out of the hospital records. So let's try to standardise the data feeds, think about the core components of the statistical models that we're building, and try to build recyclable components of analysis that can be integrated in a probabilistic framework. That was how that notion of interoperability came about.

Brieuc Lehmann
I think it's really interesting and important work. And I think there's a preprint on arXiv which we can stick in the podcast notes for people to have a look at if they're interested in any more. I think we're out of time, sadly. Thank you so much for taking the time to chat today, Chris. It's been honestly fascinating. I know your group has a brand spanking new website at WWW dot Chris Holmes lab dot com. There's just a plug for you there. Is that the best place for people to read more about your research, or is there anywhere else?

Chris Holmes
Yeah, I guess. I'm not very good at that aspect. Yeah, that, or Google Scholar.

Brieuc Lehmann  23:49  
Okay. So we'll add that into the meeting notes as well. Great. Okay, well, we'll stop there. Thank you so much, Chris. And thank you to everyone for listening. See you next time.

Chris Holmes  24:00  
Thank you.

Unknown Speaker  24:02  
UCL Minds brings together the knowledge, insights and ideas of our community through a wide range of events and activities that are open to everyone.