UCL Centre for Artificial Intelligence


Episode 2: Lewis Griffin

Will Large Language Models (LLMs) allow states to wage propaganda campaigns of unprecedented scale and persuasiveness? Or is this just another moral panic about new technology?

Listen to episode

https://mediacentral.ucl.ac.uk/Player/b2acGdB7

About the speaker


Lewis Griffin is Professor of Computer Vision at UCL.

He has recently found some evidence that humans and LLMs can be persuaded in similar ways, and warns that it may soon be possible for states to optimise the persuasiveness of propaganda by testing it on LLMs, just like we test drugs on mice.



Lewis Griffin  0:01  
It raises this concern that malicious people might use it to learn how to persuade us. But it also provides a potential kind of defence. I think of them as automated weather gauges. So the idea instead would be to say, could you have a large language model which is consuming social media.

And then it's being regularly probed for its beliefs. Going even further, you can imagine, rather than having a static media profile that just sits there inert, reading its wall, it's instead interacting with posts, liking them and viewing them, things like that, giving something back to the recommendation algorithm so that it drives what appears on its feed. We could actually have a cycle where its views are being changed by what it's reading, which is causing it to change what it likes.

And you could imagine, maybe you could get it so you could watch one of these things being led down a conspiracy rabbit hole.

Reuben Adams  1:08  
Hello, and welcome to Steering AI, where we talk about how AI will shape the future and how we can steer it along the way.

I'm your host, Reuben Adams, PhD student in AI at UCL. Today we're talking about language models, disinformation and political influence. Will LLMs allow states to wage propaganda campaigns of unprecedented scale and persuasiveness? Or is this just another moral panic about new technology?

Our guest is the discerning and laconic Lewis Griffin, Professor of computational vision at UCL. Lewis Griffin has recently found some evidence that humans and LLMs can be persuaded in similar ways, and warns that it may soon be possible for states to optimise the persuasiveness of propaganda by testing it on LLMs, just like we test drugs on mice. Professor Lewis Griffin, welcome to Steering AI.

Lewis Griffin  2:00  

Reuben Adams  2:02  
Usually when we talk about large language models, and disinformation or propaganda, we're talking about the large language models writing the propaganda for other people to read. But you've lately been researching large language models reading the propaganda. What's the motivation behind that?

Lewis Griffin  2:16  
That's right, I'm turning it round. The motivation, let me tell you where the idea for looking into this came from. I was fretting about what may have happened in the Brexit referendum, the use of targeted, possibly even hyper-targeted, advertising.

And then I was speculating about how AI could enhance operations like that. You know, the obvious route was about training LLMs to work out who to target and how. And it struck me that the weak link in that, if you wanted a completely closed system writing this stuff, was that they still wouldn't know what to write.

And it would still take human authors to write, as they did, five candidate messages, and then work out some rules about which message to send to whom. So I thought, thankfully, the weak point is that we don't know how to automate the writing.

And so it occurred to me that if you were trying to do this automatically, you would want to be able to test out what messages worked on different people. And actually sending messages to people and looking at clicks was a relatively slow way to generate that feedback.

What you'd really like to do is have some kind of tool where you could try out a message and get some instant feedback. And so that then occurred to me as a hypothesis: could large language models play that role, where one tried out persuasive messages on them, measured how much a message worked, and then refined that message?

Not because I wanted to do it, nor because I welcomed anyone else doing it. But it struck me that this is something that someone could do. So it was a possibility that needed to be checked, to be ready for it.

Reuben Adams  4:26  
Right. So the idea is you might have multiple different excerpts, like tweets or blog posts or news articles, that a large language model has written or a human has written.

And then you could send these out into the world and gather data about how people respond to them, do they click on them? Do they share them? But this is time consuming.

And if you could instead have a large language model acting as a model of these humans, and send the messages to that and get a response back, the response would be almost immediate. And you could have this much tighter feedback loop.

Lewis Griffin  4:56  
That's right. You know, as long as you're relying on getting the feedback from people, you're going to be limited to tens or hundreds of points of feedback. Whereas if you could (a big if, obviously) put that process into silicon, then you could be looking at thousands, tens of thousands, hundreds of thousands of points of feedback, and you could optimise it so much more.

Reuben Adams  5:25  
So different arguments might appeal more or less to different demographics or people. How would you work that into the optimisation process?

Lewis Griffin  5:33  
So again, that's, you know, that's another question, isn't it? First question is, do they respond to attempts to influence at all? And then a second question is who are they kind of responding as, are they responding as a typical person or particular person?

And can we modulate who they respond as? One of the papers that had a big influence on me was what was then a preprint, called "Out of One, Many", by American political scientists.

And they already showed, I thought pretty convincingly, that within a single large language model, you could get it to adopt different demographic personas, and answer as if being different members of the population.

It didn't mean that that would also extend to being persuaded in different ways. But it was an important inspiration.

Reuben Adams  6:33  
So I think we're used to the idea now that language models can roleplay these different members of society. What is the evidence that they can actually be persuaded in the same ways as those different members of society?

Lewis Griffin  6:45  
Well, I mean, I'll start with the evidence that they can be persuaded like a person at all, and then perhaps we can think about whether they can react differently, depending on who they are. Right. So then, you know, I had to craft some experimental investigation of persuasion.

So I kind of mugged up on my persuasion theory a bit and identified two phenomena that are kind of points within the space of persuasion methods. So the first phenomenon is what I would call a kind of rhetorical mode of persuasion, meaning by that that it's about presentation rather than content.

And that phenomenon is the illusory truth effect. And the illusory truth effect is the phenomenon that if you experience a particular statement, read it, heard it, even without that really providing any evidence for its validity, that earlier exposure causes you later to rate it as more truthful.

So this has been studied for quite a long while. There's also been a big flurry of research on it in the last decade, because certain politicians, Donald Trump for instance, would appear to use this strategy quite a lot, just repeating a lie. And, not that I'm particularly drawing a comparison, but it occurs in some of the profiles that were written about the approach Hitler used to propaganda, that this was a mode of persuasion that he was familiar with.

Reuben Adams  8:32  
It's sometimes described as the effect that if you say something enough times people will believe it. Absolutely. Yeah, this sounds a bit naive, like, surely people are more discerning than this.

Lewis Griffin  8:41  
Well, you know, you would think so, but you test it, and it works. Certain types of statements are going to be much harder to move, you know, things which are logically incoherent, or, you know, are extremely strongly held initially to be true or false.

But it seems that even those things move a little bit. And we're not talking about changing people's belief in a statement from definitely false to definitely true. We're talking about, you know, on a five point scale, say, moving it up half a point, maybe one point, on that scale. So it's not transformative, but there is a shift in it.

Reuben Adams  9:23  
Okay. So it's not about convincing people the earth is flat, just by repeating it, it's more of a subtle influence on statements that people are more unsure about.

Lewis Griffin  9:32  
Yes, although, you know, there is evidence that there is a cumulative effect: say it lots and lots of times, and you can keep on moving it further and further.

So that's this illusory truth effect, which I've characterised as a rhetorical mode of persuasion. And then the other was framing of arguments. So it's known that you can present a piece of information and put some kind of context around it, which shouldn't really have any direct implication as to the truth of the thing.

But that framing, particularly if you say it's the fault of party X or party Y, can affect how that message is received.

Reuben Adams  10:20  
So let's start with the illusory truth effect. How do you test for the illusory truth effect in humans?

Lewis Griffin  10:26  
So okay, in humans, the typical way to do it is to say, you know, you've got a bunch of candidate statements that you want to expose them to. People typically tend to use something that has the appearance of fact and is plausibly something, you know, they might think: that's plausibly something which I haven't heard, but it's not impossible that I haven't heard of it. So I'm quite unsure about whether it's true or not.

Reuben Adams  10:54  
But they're statements that most people won't really know whether they're true or false.

Lewis Griffin  10:57  
Yeah, you don't want to start with statements where most people have a really confident belief that they're true or false.

Reuben Adams  11:03  
Yeah, I think you had one that was, the Philippines has a tricameral legislature, or the Ohio Penguins are a baseball team.

Lewis Griffin  11:09  
Exactly. So you know, so you get your list of statements, and then you have an exposure phase, you need to make them have some kind of engaged exposure to them. So the normal way you make sure they have an engaged exposure to them, is you get them to rate them for something but not truth.

You get them to rate them for how interesting they are, or how important they are, and things like that. And so in some earlier phase, you expose them to a bunch of sentences.

And then there's an interval, which might be, you know, five minutes, or it might be two weeks, and you can study the effect of that interval. And then you have a test phase, where they have a whole bunch of sentences, some of which are the ones that they've seen before, and you're getting them to rate them for truthfulness.

And then for a particular sentence, say the Philippines tricameral legislature one, you look at the average truthfulness score that people who did see it before give, and the average truthfulness score that people who didn't see it before give, and you see that the former group gives slightly higher scores than the latter group.

So that's the test in humans. So I mean, I should say the normal explanation that's given is that this illusory truth effect is due to some kind of fluency of processing. So you're familiar with this sentence, or the constituent parts of this statement, which causes you when you read it to process it more fluently. And then, and then there's a mysterious bit, but somehow, you know, we use as a heuristic, maybe that fluent processing is a hallmark of truth.

Reuben Adams  12:49  
Right, so we're using this kind of bad proxy of if it sounds familiar, it's more likely to be true.

Lewis Griffin  12:53  
Yeah, I mean, it may not be a bad heuristic to use.

Reuben Adams  12:57  
Provided there's no malicious actors.

Lewis Griffin  12:59  
Provided there's no malicious actors. And it may be that, you know, generally, if the true things that people say outweigh the false things, it's probably a pretty good proxy.

Reuben Adams  13:11  
So just to recap, you've got a sentence like the Ohio Penguins are a baseball team, and you ask some people to rate how likely they think that is to be true, and say the average rating is 60%.

And then you get a second group of people who have already seen that statement in some other context, like rating how important it is or how interesting it is. And those people will give a slightly higher truthfulness rating; they will estimate it to be slightly more likely to be true.

Lewis Griffin  13:37  
That's right.
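
The comparison recapped above can be sketched in a few lines. This is only an illustration: the 1 to 5 rating scale, the numbers and the function name `illusory_truth_effect` are assumptions made for the example, not details taken from the study.

```python
from statistics import mean

def illusory_truth_effect(exposed_ratings, unexposed_ratings):
    """Mean truth rating from people who saw the statement earlier,
    minus the mean truth rating from people seeing it for the first time."""
    return mean(exposed_ratings) - mean(unexposed_ratings)

# Hypothetical truthfulness ratings (1 = definitely false, 5 = definitely true)
# for a single statement, one list per group of participants.
seen_before = [4, 3, 4, 3, 4]
not_seen = [3, 3, 4, 3, 3]

effect = illusory_truth_effect(seen_before, not_seen)
print(f"Boost in mean truth rating: {effect:.2f}")
```

A positive value means the previously exposed group rated the statement as more truthful; a real analysis would also need many statements and a significance test, which this sketch leaves out.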

Reuben Adams  13:38  
Okay. And then, so you can run this exact experiment with language models.

Lewis Griffin  13:42  
Yes, you can run it pretty much exactly with language models. And so we did that. So we set up an experiment, we constructed a novel set of statements, we didn't want things which might accidentally be in its training data and things like that.

And then we constructed a prompt, we were using GPT-3.5. And constructed a prompt where we said, right, here's 40 statements, I want you to rate them in different ways. We used four different rating scales: truthfulness, interest, importance and sentiment.

So it goes: right, that is interest six, and that's sentiment two, and that's truthfulness four, and so on. Right, very good. Now, here's another block of statements, I want you to rate these ones. And it does, it rates them all. And it's getting all this history within its prompt window.

Reuben Adams  14:38  
It's all in the same context window.

Lewis Griffin  14:39  
It's all in the same context window. We're not retraining it. And then in that second block, some of the statements, 25% say, are ones that occurred earlier, in the first block. And we did all the different combinations of the four scales: you know, rated first for interest then truth, or truth then interest, or interest twice, and things like that.

And you look at all those combinations. So we did that experiment on LLMs. And in parallel, we did a human experiment using exactly the same format and exactly the same random selection of statements. We tested 1000 people, and then did exactly the same with 1000 simulated people using a large language model.

Reuben Adams  15:25  
Okay. And these language models are all prompted in the same way?

Lewis Griffin  15:28  
Yeah, I mean, each one is getting a different set of statements, order of those statements, and rating scales. So we randomised that to produce 1000 random, balanced sets. We did that on people, and exactly the same ones were done on the large language model. Yeah.
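
One way to picture the randomised two-block design described above is the sketch below. It is a rough illustration under stated assumptions: the second block is assumed to be the same size as the first, scales are assigned uniformly at random rather than balanced, and `make_trial` and its parameters are hypothetical names, not the study's code.

```python
import random

# The four rating scales mentioned in the conversation.
SCALES = ["truthfulness", "interest", "importance", "sentiment"]

def make_trial(statements, block_size=40, repeat_fraction=0.25, rng=None):
    """Build one participant's two blocks of (statement, scale) pairs.

    A fraction of the second block repeats statements from the first block;
    the rest of the second block is new. Block size for the second block and
    uniform scale assignment are assumptions made for this sketch.
    """
    if rng is None:
        rng = random.Random()
    n_repeats = int(block_size * repeat_fraction)
    n_needed = 2 * block_size - n_repeats
    if len(statements) < n_needed:
        raise ValueError("not enough statements for one trial")

    pool = rng.sample(statements, n_needed)   # distinct statements, random order
    block1 = pool[:block_size]
    fresh = pool[block_size:]                 # never shown in the first block
    repeats = rng.sample(block1, n_repeats)   # shown again in the second block
    block2 = repeats + fresh
    rng.shuffle(block2)

    first = [(s, rng.choice(SCALES)) for s in block1]
    second = [(s, rng.choice(SCALES)) for s in block2]
    return first, second, set(repeats)

# Usage with placeholder statements and a fixed seed for reproducibility.
statements = [f"statement {i}" for i in range(100)]
first, second, repeated = make_trial(statements, rng=random.Random(0))
print(len(first), len(second), len(repeated))
```

Generating one such trial per participant, with a different seed each time, gives every human and every simulated participant their own selection, order and scales, and the same generated trials can be given to both groups.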

Reuben Adams  15:45  
Okay. So you're not just testing whether the language model is susceptible to the illusory truth effect, you're testing humans at the same time so you have a direct comparison?

Lewis Griffin  15:55  
Exactly, exactly. So I could have just done it on LLMs and, let's say I got a positive result, an illusory truth effect, I could say: yeah, we know about this in humans, LLMs are doing it. But I thought my evidence would be stronger if I did precisely the same experiment in humans and LLMs. So that's what we did.

Reuben Adams  16:15  
So what did you find?

Lewis Griffin  16:16  
So we found that the match between humans and LLMs was really very good indeed. You see the illusory truth effect, you see it for humans and LLMs. So if you've previously exposed it by getting it to rate something other than truth, so interest, importance or sentiment, then when they rate it for truth, they rate it higher. If you first ask it for truth, and then you later ask it for truth, you don't get a boost to truthfulness.

Reuben Adams  16:51  
And that's the same in humans.

Lewis Griffin  16:52  
And that was a known thing in humans. And we, you know, saw it again, in our human data and saw it with the large language models. And we also saw that none of the other scales were affected. So if you previously asked them what the sentiment of a statement is, and then you ask them what the interest is, their interest doesn't go up or down because of previously being asked. And that allows us to say this is not an illusory rating effect.

Reuben Adams  17:24  
So it's not the case that just being exposed before makes all the ratings on all dimensions go up, it's specifically truth.

Lewis Griffin  17:32  
That's right, it's specifically truth. So, very good agreement. There was pretty good agreement in some of the details about how much boost there is. It was also a little bit more variable in LLMs.

Whereas with humans, it's a bit more of a uniform effect across statements. So there were, you know, some differences in the detail. But the broad sort of effect is the same.

Reuben Adams  18:02  
So the boost is there in truthfulness, the magnitude of the boost is roughly the same. But it's worth saying that the actual ratings that the language model was giving were quite different from the ratings that the humans were giving.

Lewis Griffin  18:15  
They are correlated between the humans and the LLMs. But it's not a super high correlation, yeah.

Reuben Adams  18:23  
So one of the explanations for why we are subject to the illusory truth effect is this fluency bias. Why would a language model be subject to this?

Lewis Griffin  18:33  
Well, you know, million dollar question, or probably not a million dollar. I don't know. So the fluency explanation is rather hard to countenance. So I find the fluency thing a little hard to believe. I think it reflects something which is true in all the data it's read, that people take as true things which have been heard before. It's some regularity.

Reuben Adams  19:08  
I suppose one explanation might be that most of the training data is sort of coherent text, text that doesn't flip-flop in its opinion as you go along. And so to, like, maintain this consistency, it might agree with things that are already in the context window.

Lewis Griffin  19:27  
Yeah, I mean, I think that's right. But what's curious is that, you know, if you look through text, and you look at how often the truthfulness of statements is asserted, rather than just asserting them, it's really small. You know, we don't do it. Most of the time, we just say, you know, X. We don't do a lot of saying X is true. But yeah, so I don't know.

Reuben Adams  19:51  
Interesting, interesting. So let's move on to the populist framing experiment. So what's the actual theory there?

Lewis Griffin  19:59  
This was a piece of work from a couple of years ago, three years ago, by some political scientists who were studying this phenomenon of political framing. And their hypothesis, which originated from a thing called social identity theory, is about blame. So, you know, they say everyone divides the world into the ingroup and outgroup.

And social identity theory says blaming the outgroup is a great thing: people believe it, and it makes arguments stronger. The reason they're investigating this at all is because of concerns about the rise of populist rhetoric in political discourse, where the populist politician sets himself up as belonging to an ingroup that he shares with the population he's trying to persuade, and everyone else, the experts and so on, are the enemy.

They wanted to understand what's going on there. And so the experiment that they set up involved showing people a mocked-up article from an online news site, and then asking them questions about the news content of that article they've just read.

And using, you know, getting them to rate statements, to see how persuaded they were of the news content, and how mobilised they were to take action on the basis of it, you know, tell a friend. And then so that's the sort of, that's the main kind of experiment.

And then you introduce variation by having different subgroups of your participants see different versions of this mocked-up news article. The baseline article says there's a think tank, and they've produced a report warning of a future decline in spending power, because of the currency.

And so a quarter of them see that article, and then another quarter of them see that article, but it says the blame for this should be put on the shoulders of politicians. And then another quarter of the testing population see a version where the framing is to blame immigrants.

And then the final quarter see a version which blames both groups. So these political scientists, they're studying this phenomenon of blaming the outgroup, and whether it makes things more persuasive or not, because they see that's what politicians are doing.

Their surprising result was that blaming the political class did make the news content more persuasive. And it did make the readers more motivated to take action on the basis of it.

But blaming immigrants had the opposite effect. It sort of backfired. It made it less persuasive, and they were less willing to do anything about it.

So they got a big surprise. And that was interesting. So with the illusory truth effect, we had to do our own human study. Here was a nice human study that was all done online through questionnaires. We could just do it on LLMs. So that's what we did.

Reuben Adams  23:33  
Like exactly the same questionnaire?

Lewis Griffin  23:36  
Yeah, exactly the same, exactly the same. Well, a couple of things we compromised on. So let me just explain. So they studied about 7000 people from across 15 European countries like big, big old study, it must have cost them a lot of money. And when these people do a questionnaire, there's first a few demographic questions, you know, gender, age, nationality, political leaning on a scale, how politically engaged are you?

And then some questions to probe what's called relative deprivation, which means basically, how much you feel screwed by the state. And then they read the text of the news article, and then they answer these questions about how persuaded they were. So to do that, with an LLM, we didn't get it to fill in the demographics.

We gave it the demographics. So I would generate a random individual, they're German, they're male, they're 27, they're six points on the political spectrum, and then give it the article to read as it were, and then ask it those questions.

So I would run it lots of times, simulating different people according to the statistics of what those answers were as provided in the paper, and then get it to answer those questions. So one thing which we compromised on is we did all the questionnaires in English. But apart from that, it's basically the same thing.
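
A minimal sketch of generating simulated respondents in the way just described might look like this. Everything numeric here is a placeholder: the country list, the weights, the age range and the 0 to 10 and 1 to 5 scales are assumptions for illustration, not the statistics from the original paper, and `sample_persona` and `describe` are hypothetical helper names.

```python
import random

# Hypothetical marginal distributions: (possible values, weights).
MARGINALS = {
    "nationality": (["German", "French", "Spanish"], [0.4, 0.35, 0.25]),
    "gender": (["male", "female"], [0.5, 0.5]),
}

# The four versions of the mocked-up article.
CONDITIONS = ["no blame", "blames politicians", "blames immigrants", "blames both"]

def sample_persona(rng):
    """Draw one simulated respondent, each attribute independently.

    Independent draws reproduce each attribute's own distribution but
    ignore any correlation between attributes.
    """
    persona = {
        name: rng.choices(values, weights=weights, k=1)[0]
        for name, (values, weights) in MARGINALS.items()
    }
    persona["age"] = rng.randint(18, 80)                # assumed age range
    persona["political_leaning"] = rng.randint(0, 10)   # assumed 0-10 scale
    persona["relative_deprivation"] = rng.randint(1, 5) # assumed 1-5 scale
    persona["condition"] = rng.choice(CONDITIONS)       # which article version they read
    return persona

def describe(persona):
    """Turn a persona into text that could precede the article in a prompt."""
    return (
        f"You are {persona['nationality']}, {persona['gender']}, "
        f"{persona['age']} years old, at {persona['political_leaning']} on a "
        f"0-10 political scale, with relative deprivation "
        f"{persona['relative_deprivation']} out of 5."
    )

rng = random.Random(42)
personas = [sample_persona(rng) for _ in range(5)]
for p in personas:
    print(p["condition"], "|", describe(p))
```

Each description would be followed by the article version named in `condition` and then the questionnaire items; the call to an actual language model is deliberately left out of the sketch.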

Reuben Adams  25:08  
I see. So you've got this sort of base news article on the economy, kind of a bland article. And then they put these three different framings on, blaming different outgroups. And then the theory is that if you blame an outgroup, people rate it as more persuasive.

And they will say they're more likely to share it on social media. And if they rate themselves as being more economically deprived, they will be even more motivated by it and even more persuaded.

Lewis Griffin  25:35  
So that was their prediction. They thought blaming either out group would work and make the article persuasive and mobilising. And then they also thought that that framing would be even more effective, if the person was what's called relatively deprived, which doesn't mean poor, it sounds like poor, relatively deprived means that you feel that other people get priority over you and things like that.

Reuben Adams  26:03  
And it's a subjective thing.

Lewis Griffin  26:04  
It's a subjective thing. Yeah. So those were their two hypotheses. And then what they found was that their first hypothesis, that both types of framing would have this positive effect, was wrong. They found that actually anti-politician framing, yes, but anti-immigrant framing worked the other way. And they also got some evidence for this relative deprivation effect, that the effectiveness of framing would be enhanced if you were in a position of relative deprivation.

Reuben Adams  26:43  
Okay, and so how does the language model compare?

Lewis Griffin  26:47  
So the language model reproduces some but not all of that. It reproduces that blaming politicians makes it more persuasive and more mobilising. It reproduces that blaming immigrants makes it less persuasive and less mobilising.

It didn't reproduce this subtle, modulatory effect that the relative deprivation should have a sort of, you know, boosting effect on the framing. It didn't do that.

Now, interestingly, it wasn't that it ignored relative deprivation. Because remember, we tell it how relatively deprived the current subject is. It didn't ignore that, because we do see that relative deprivation generally has a persuasive and mobilising effect, that if you're relatively deprived, you get a bit angrier about this news. But it doesn't interact with the framing.

Reuben Adams  27:51  
Interesting. Okay.

Lewis Griffin  27:53  
So you know, I don't know, I don't know what the reason is there. There's a possible explanation, which is that our simulation of the population was not quite nuanced enough in the sense that we knew what the distribution of each of their answers was, you know, age and relative deprivation and political leaning, and things like that.

But we didn't model the covariances between those because they weren't available to us. So the explanation may lie there. But it did model the main effect, which was that framing works. And it modelled the thing that surprised the original people, which is that it doesn't always work like you think it works. Sometimes it goes the other way. So it was a B, not an A, let's say.

Reuben Adams  28:51  
Okay, so we've looked at these two studies, the illusory truth effect and the populist framing of news. And this does seem to be fairly good evidence that they follow the same kind of persuadable trajectories that humans can follow.

Could we solve this problem by encouraging AI labs to make language models less good at following the same kind of persuadability traps that humans suffer from?

Lewis Griffin  29:16  
Well, maybe. We don't know how difficult that might be. We don't know. They might lose something by doing it. So the illusory truth effect, it may be a really great heuristic to use, that things you've heard before should be considered a bit more true.

So you might give up something in your ability to reason if you did that. But recall that the reason why I did this work isn't so much about saying, hey, look, large language models are persuadable, and that might be problematic because people might go around, you know, convincing them to do things.

What I'm saying is the fact that they are persuadable is a useful tool that people could use to get good at persuasion. So even if someone says, right, we've fiddled around with our large language model so it's really hard to persuade, it doesn't stop someone else saying, well, let's make sure we use one where that hasn't been fixed. We need one that is persuadable like a person. That's what we're going to study.

Reuben Adams  30:29  
So they just build their own.

Lewis Griffin  30:30  
So they just build their own, especially if, if that's what naturally comes out of the box that it's persuadable.

Reuben Adams  30:36  
So the converse question would be, could malicious actors purposefully design their large language models to be an even closer match of human persuadability?

Lewis Griffin  30:48  
You know, I wouldn't rule it out. I don't know how to go about that. To a certain extent, I'm still trying to get over the fact that they're like people at all, that seems to me quite remarkable.

And I don't know how to make them more like people. My guess is that as we study this further, we will discover ways in which they're not like people. They're not going to have a fear response, in the same kind of somatic way that we do, they might mimic it to a certain extent, because they've sort of read fear responses, so they might play along.

But at some point, I wouldn't be at all surprised if we start to uncover some gaps that they have. They're not fully human, they don't have a human experience, and some of the things which are about us being persuaded are about, you know, some very human urges.

And then having discovered those, maybe it'll be possible to make a system which can mimic those better by giving it artificial emotions. I don't know. And that's all very kind of speculative. But maybe, I mean, that's gonna be hard, isn't it?

Reuben Adams  32:04  
Yeah. Artificial emotions might be a bit of a long way round, you could as well, if it's just text based, you could as well just fine tune it on examples of these kinds of responses, fear responses, or whatever you want to bake in, and just get it to mimic them.

Lewis Griffin  32:20  
That might work. How well it'll generalise those particular kinds of phenomena from its training examples, if it doesn't really get it, I don't know. Maybe, yeah, maybe.

Reuben Adams  32:35  
So one upside of all this that you've mentioned in your paper, is that we could deploy large language models on the internet to sort of absorb all the content that's floating around, and act as a sort of bellwether.

So you could periodically test these models' beliefs, in inverted commas, and values and knowledge and behaviour and whatnot, and see if they change in some unexpected way, if the model becomes radicalised in whatever direction.

And for this reason, it would actually be very useful to have a language model that's persuaded in very similar ways to humans. So there's an upside here.

Lewis Griffin  33:10  
There is. That's right. And so I kind of think this piece of work, it raises this concern that malicious people might use it to learn how to persuade us. But also, as you say, the same phenomenon provides a potential kind of defence.

I think of them as sort of automated weather gauges. So people already do, you know, have these fake social media profiles. There's the excellent person on BBC Radio 4, Marianna Spring, who runs this little farm of American social media profiles inhabiting different demographics.

And so they get fed different social media, and she reports on what they're all seeing.

Reuben Adams  33:58  
This is an AI researcher?

Lewis Griffin  34:00  
No, she's a BBC journalist, and there's no AI in it. So there's just static profiles saying, you know, Mary Lou, from Michigan, 24, hairdresser, and she looks at what appears on its feed and comments on it.

Reuben Adams  34:13  
But she's emulating these different characters.

Lewis Griffin  34:15  
Yeah, but the emulation is not rich, in the sense that they're just static profiles. They're not actually actively engaging with the media. And of course, she reads what this person is reading, but she has to guess how it's affecting them.

So the idea instead would be to say, could you have a large language model which is consuming social media, and then it's being regularly probed for its beliefs, to see if they're moving over time, how they're moving and so on.

And then going even further, you can imagine, rather than having a static media profile that just sits there inert, reading its wall, it's instead interacting with posts, liking them and viewing them and things like that, giving something back to the recommendation algorithm, so that it drives what appears on its feed. And if the LLM was reading this stuff, then we could actually have a cycle where its views are being changed by what it's reading, which is causing it to change what it likes.

And you could imagine, maybe you could get it so you could watch one of these things being led down a conspiracy rabbit hole.
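
[Editorial illustration] As a rough aid to following the loop just described, the Python sketch below is a toy simulation only; it is not taken from the interview or the paper. Every name in it (`Post`, `GaugeAgent`, `recommend`, `run_gauge`) is hypothetical, a single number per topic stands in for an LLM's "belief", and the recommender is an assumed similarity ranking rather than any real platform's algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    topic: str
    stance: float  # assumed scale: -1.0 (against) to 1.0 (for)


@dataclass
class GaugeAgent:
    """Toy stand-in for an LLM-backed profile (hypothetical)."""
    beliefs: dict = field(default_factory=dict)
    learning_rate: float = 0.2  # assumed: how far one post moves a belief

    def probe(self, topic):
        # "Regularly probed for its beliefs": read out the current value.
        return self.beliefs.get(topic, 0.0)

    def read(self, post):
        # Reading a post nudges the belief part of the way towards its stance.
        current = self.probe(post.topic)
        self.beliefs[post.topic] = current + self.learning_rate * (post.stance - current)

    def likes(self, post):
        # The agent likes posts that sit close to what it currently believes.
        return abs(post.stance - self.probe(post.topic)) < 0.5


def recommend(pool, liked, k=2):
    """Assumed recommender: show the k posts closest to the mean liked stance."""
    if not liked:
        return pool[:k]
    target = sum(p.stance for p in liked) / len(liked)
    return sorted(pool, key=lambda p: abs(p.stance - target))[:k]


def run_gauge(agent, pool, topic, steps=5):
    """Run the read -> like -> recommend cycle, probing after every step."""
    history = [agent.probe(topic)]
    liked = []
    for _ in range(steps):
        for post in recommend(pool, liked):
            agent.read(post)
            if agent.likes(post):
                liked.append(post)  # feedback to the recommender
        history.append(agent.probe(topic))
    return history


pool = [Post("x", 0.2), Post("x", 0.9), Post("x", -0.8), Post("x", 0.6)]
history = run_gauge(GaugeAgent(), pool, "x", steps=5)
print(history)  # the belief trajectory an observer would monitor
```

In this toy version, the list returned by `run_gauge` plays the role of the gauge reading: one probe result per cycle, which an observer could watch for drift.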

Reuben Adams  35:33  
So we could have this kind of mirror population of lots of people, different digital, artificial people with different demographics, each with their own news feed and interacting, and we're just sort of watching to see how their opinions develop.

Lewis Griffin  35:46  
Yeah. And, you know, it's a sort of silicon version of what the pollsters do already, like the political attitudes survey, where they've got this balanced group of voters, and they are regularly sending them questionnaires and getting them to do them. But of course, with those, you can't bother them every day, you can only go through a questionnaire every so often, they won't do a very long questionnaire, and things like that.

So it's the same kind of idea, but maybe through a silicon solution, doing it in much richer detail, and at much higher temporal frequency, and things like that.

Reuben Adams  36:30  
So the people who respond to these political surveys, in a sense, have more democratic influence than other members of the population, because politicians are taking their views seriously, and they're being expressed more. Could this have the danger that we end up with large language models, and the beliefs and opinions that they have having influence on politicians, if they're being used in this way?

Lewis Griffin  36:51  
Well, you know, the whole thing could go terribly, couldn't it, if they started believing all this stuff, and the whole LLM work wasn't being done well. And it was then, you know, feeding back false intelligence to decision makers, saying people are starting to believe X and Y and Z, and they're not really. You could, yeah.

Reuben Adams  37:16  
Right. Because I imagine having this kind of bellwether of foreign influence operations to measure them and detect them sounds like a great idea. But it also sounds like something a government could use as a kind of backdoor to set up a kind of digital mirror population that it can then use for its own influence.

Lewis Griffin  37:34  
Absolutely. And you know, a lot of this stuff that I'm talking about, you could sell it as a defensive measure. But you could also just sell it as a measure for domestic control. It's a horrible thought, in fact, if it's abused, you know.

If I was in charge of some country where I was worried about the free expression of certain types of radical ideas, then I could monitor what they were reading and seeing and discover those things which are pushing people's views in a way that I don't like, and, you know, it would assist me in manipulating.

If I wanted to, it could really realise a kind of Orwellian, 1984 kind of model, in a horrible way.

Reuben Adams  38:24  
So in this report, you broke down the different kinds of roles that large language models can play in influence operations. We hear a lot about how they could be an author of disinformation.

You've researched how they can be subjects of disinformation, or like testbeds. And also how they could be gauges, which looks kind of like a bellwether.

But you've also picked up two other roles that they could take of a vector and a target. Could you explain what you mean by those?

Lewis Griffin  38:52  
Yeah, so the vector idea. Well, in the same way that, say, the BBC World Service has long been recognised as a tool for projection of soft power over the world.

So you make this thing available to people, they like to consume it because it's fun, and it's useful, but it causes us to project certain values and ideas, you know, like liberal democracy and so on, out to the world.

Now, you can imagine a large language model can play the same role in the future. Because large language models are going to be this source of knowledge about the world that's going to be incorporated into all sorts of products and chatbots and things like that. And so, you know, you can introduce beliefs into people by introducing them into the large language model.

Now, as with everything, we may first see it in the commercial world, through a sort of advertising. If GPT-4 or 5, whatever is being used by everyone, is present everywhere, and GPT-5 has a very good opinion of Ford motor vehicles, a sort of new form of advertising will be to steer the attitudes of these things.

And that could be done in malevolent ways, is the point. And I think the easiest way is for domestic control. So, again, if I'm the leadership of some rather autocratic kind of country where I want to encourage certain types of good belief and discourage certain types of bad belief, and I know that this is the new Google, so rather than people looking things up, they're getting their answers to things from large language models, and they're using them as reasoning agents, things like that.

If I make sure those things have all the right views, and none of the wrong views, that's going to be really helpful.

Whereas previously, you know, I had to put up an internet firewall and prevent them getting certain information, I might also have to get in there and alter what the large language model thinks about the state of the world.

A more kind of extreme scenario, which is maybe, frankly, a bit unlikely, is: can you do that to an enemy? Can you get into an adversary's large language model, at some point during training or later? Can you infiltrate views into there, which will then leak out to the population?

Reuben Adams  39:26  
Right, so this first role is as a vector of influence where the state creates a language model, that's just so good to use, it provides so many services and it's fun and whatnot. But then also, it has this subtle influence, subtly portrays things in a more positive or more negative light, and has this sort of slow effect on the population.

Lewis Griffin  42:13  
Yep, that could be done either sort of direct on your population, "everyone should use this large language model, it knows everything good". Or it could be done in a kind of Trojan horse, BBC World Service, gift to the world kind of thing. "Hey, we're letting everyone use this. It's great." And yeah, of course, it projects our values, you don't even need to hide it, you know, of course, the BBC World Service projected British values sort of thing. But you know, you're projecting power through that. Yeah.

Reuben Adams  42:44  
And then the second route, which you've called a target of influence, is where you're targeting someone else's language model, to try and influence it to express things in the kind of way that you want it to.

Lewis Griffin  42:56  
That's right. It's pretty speculative. But how would that work? Well, you could try to get data into its training sets, you could try and alter its weights, that's pretty difficult. Or, people are probably going to want large language models to stop being static, and saying, "Oh, sorry, I've got this 2017 cut off, I don't know about the world after that." They're gonna want them to be updated, reading new stuff. And maybe there's a chance to get things into them then, which steers their view in a particular way.

Reuben Adams  43:36  
So you would sort of write a whole bunch of articles and comments and blog posts or whatever, expressing your kind of views. And then you would just hope that this gets scooped up by whoever's building their language model in the training data.

Lewis Griffin  43:47  
That's right. That's right. Either, it could just be, you know, well, the more of this poison text we put in, the better. Or it could be, we also know, and I'm not a great expert on this, but we also know that these things have a kind of vulnerability to adversarial type of, you know, clever manipulations. 

It may be that you can infiltrate texts into it, which have disproportionate effect, because they've been, you know, they've been crafted thinking about weights and gradients and things like that. So maybe that as well.

Reuben Adams  44:24  
I think this is sometimes called gradient hacking, that's when the model does it on itself, but this would be an adversary, sort of optimising the gradient step with perhaps some like completely meaningless text that just happens to nudge the language model in the direction that's desired.

So if a language model was a target of influence in this way, wouldn't the creators notice, in their fine-tuning or RLHF stage, or even at the deployment stage, that their model was expressing bizarre views, out of step?

Lewis Griffin  44:57  
Yeah, maybe some of these vulnerabilities occur for other types of system, you know, just image recognition systems and things like that. And I think it's not, it's not immediately obvious that there's a simple way of detecting whether such things like that have been infiltrated into a model, they're too big and too complicated.

There may be a fix for some of the more sort of extreme adversarial manipulations. But if you're just saying, large language model, you've got to keep fresh with the world, make sure you read the last 100 gigabytes of text that have come out in the world. And you're making sure that that's full of your views. Then I don't know, you know.

Reuben Adams  45:39  
So, if we move a bit more towards language models as an author of disinformation, then it's been argued that this isn't really something we should worry about, because there's only a small proportion of the population that sort of reads this stuff. And it's easy enough to write; anyone can write it.

And so automating the writing of it is not really going to make more people believe it. The bottleneck's not on the production side, the bottleneck is on the dissemination side, and the part where you try to get people to actually believe your stuff.

Lewis Griffin  46:13  
That may be a reasonable view. I mean, I'm not an expert on dissemination, I don't really know much more than the person in the street about recommendation algorithms.

But the sense one gets is that, you know, these algorithms do a very good job of bringing stuff to your attention that you want to see sort of thing. And to a certain extent, you know, if you're doing disinformation, you just have to kind of release it out there. And the algorithms will draw people to it. So I don't really think dissemination is too bad.

Dissemination to the population is not too bad. If you're trying to craft an individual message to an individual person, well, I mean, it can be sent directly to them, emailed or posted onto their feed. I don't think dissemination is really the bottleneck, because I think we have amazingly good technology nowadays, which makes it really easy to get individual messages to individual people.

I mean it's what advertisers do and it's, you know, all the tools are there to do it.

Reuben Adams  47:26  
Yeah, I think one problem I have with the argument that, oh, it's the demand that's limited, not the supply, is that demand follows quality. And if large language models can produce higher quality disinformation that's more persuasive, or more exciting to read or whatever, then the demand will go up.

Lewis Griffin  47:42  
Yeah, I think that's right. When we think about the large language models writing disinformation, there are two possible effects. There's a pretty straightforward one about volume, you know, amount of volume per dollar spent, which is clearly, you know, three orders of magnitude sort of improved for large language models.

And then there's a separate issue about can they produce high quality stuff that really works? And I think probably I say, I think the quality one will end up being the more kind of dramatic effect.

And we do know that disinformation, large scale disinformation, has been a thing which, you know, requires a large operation to do it, or a state led operation. You know, the Russian operation funded by, what's his name, who died?

Reuben Adams  48:43  
Prigozhin. This was the Internet Research Agency.

Lewis Griffin  48:47  
They had an office block, you know. It's not just a few guys in a room, it's...

Reuben Adams  48:53  
I think it was 1000 people at one stage.

Lewis Griffin  48:55  
Okay, so yes, pretty big operation. And so, if you can cut the bill there to 10 people, I mean, the Russians can afford 1000 people, no big deal. But if you can go down to 10 people, then all sorts of people can have a disinformation operation at that scale.

I think it lowers the barrier of entry to all sorts of players. So I think the scale, you know, cost reduction in costs, scale thing is important. And then another thing that's going to be important with disinformation is that if you do it automated, your reaction time goes down massively.

Okay, so your ability to say events have happened, let's get something out. Which if you're writing individually crafted messages, you know, you might and you got lots of them, you know, you're gonna respond in days, weeks or whatever. Whereas you might be responding in seconds, you know,

Reuben Adams  49:00  
In multiple different languages.

Lewis Griffin  49:07  
In multiple different languages. And if you read people, you know, so Dominic Cummings, in his writings about the Brexit campaign, thinks that the key advantage they had was that they had faster OODA loops than the opposition.

So the usual OODA loop is, you know, about what is your reaction time cycle? And he said that they were faster. But I think if you've got these automated tools writing your messages, it could come down massively. And then the third aspect to it, as you say, is the quality. Can it write more persuasive, or more engaging messages?

I think those two things go together. And I think there's good reason, I think it's kind of still a bit unproven, but there's good reason to believe that the large language models with a bit of help could be really good at writing engaging, persuasive stuff, better than people. At least as good as really good people, and maybe better than them.

Reuben Adams  51:07  
And this is something they're presumably just going to get better at.

Lewis Griffin  51:11  
Yeah, and it may be not that difficult. I mean, there's some papers I've seen about companies running chatbots which, you know, people engage with just for recreation.

And they did some fairly simple things to get those chatbots to learn how to keep people talking, how to keep them coming back. And they didn't have to use really complicated algorithms, and their ability to retain interaction improved substantially. And they weren't doing anything algorithmically clever. Probably, if you are algorithmically clever, you can do even better: keep people engaged, give them the kind of interaction they want.

So that's about keeping them engaged. On the persuasiveness front, I think this is an under researched thing, which people are starting to look at. But what I've seen so far is that if you get large language models, GPT-4 say, and you get it to write messages about why you should take your medicine, and they do a test, they get them to write, you know, "take your vaccine" messages, and they get other people to score how good the messages are.

And they score as well as or higher than the messages written by experts. So they're already good at writing persuasive copy. And I think there's a lot more that can be done quite easily on getting them to be persuasive.

Reuben Adams  52:51  
I suppose you could maybe say that this is just a two way thing, we're going to have language models arguing one side and language models arguing the other side. And maybe this will actually produce a better epistemic environment, where instead of seeing hot takes or bad quality arguments on Twitter, we instead see very persuasive back and forths between different language models online. And we can therefore come to better conclusions about things.

Lewis Griffin  53:18  
And the humans just stand back and let them argue it out? I don't know, yeah maybe.

Reuben Adams  53:28  
All right. So there have been open source language models for quite a long time now, around a year, there have been decently good models, models that could probably start doing this kind of thing already. So why haven't we heard anything about these kinds of influence operations already?

Lewis Griffin  53:46  
It's a good question. So you know, what's the explanation? It could be that they're happening, but they're good at it, and we don't notice it. It could be that one is just wrong about the need for them.

And so there's no urgency to do them. And the same old kind of thing's going on, and this whole idea that you need this kind of efficiency advance for them, it's just nonsense.

Or it could be that we haven't seen them yet, but we'll see them quite soon. And I don't know what it is. I think we will see them, but how it's gonna play out? I don't know. You know, I don't think anyone can know. You know, let's say there's some at the moment, and there's a trickle, and it's, you know, one in 10,000 or 100,000 messages which are being generated like this.

And it goes up and it goes up and it goes up, and we get up to such a stage where we're kind of drowning in it. What will happen then? People will change their behaviour in some way and we'll reach some new equilibrium of behaviour. And I don't know what that is. And so how this is going to play out I don't know, but I think we will see it.

I mean, there's not even that much going on in terms of this kind of deep fake type of stuff. It's mostly warnings about possibilities. There is some going on. But it's not in great volume.

But I haven't seen the text generation, but it may be there. Because, you know, when you're interacting with Twitter, maybe a fraction of this stuff is being spewed out? I don't know, probably not. But it might be just that adding to the volume, I don't know.

Reuben Adams  55:36  
So we've already got a lot of evidence that the Internet Research Agency has conducted influence operations in the US, for example, like organising protests for people and making Facebook groups like "support LGBTQ rights", and that kind of thing. And people join these groups, and then they set up protests and campaigns, sometimes for two opposite groups on the same day to try and maximise tensions.

Reuben Adams  56:01  
It's sort of hard to believe that language models would have such a big influence on the economy, and yet have absolutely no effect on propaganda efforts and influence operations.

Lewis Griffin  56:12  
I think that's right. I mean, I would be amazed if this doesn't have impacts. But I'm also not confident that it will. It's not like I think this is a slam dunk, because it's such a complex phenomenon, which no one understands, you know, propagation of truth and how people decide what is true and false. No one understands.

And it's a changed landscape. So I don't have any confidence in it. I mean, the examples you were just talking about there, you know, about organising citizens, "I'm going to talk to this group, and I'm going to get them to demonstrate this day at this location, I'm going to talk to this group", you know, that's pretty crafty stuff.

And that may be where the real action is, in terms of achieving effect, is through this very, very kind of clever, precise manipulation, which really understands your population and really understands events that are happening, does things like that. And the large language models, you know, that may be a whole way off before they can contribute there.

It might be that just simply putting out more kind of polluting copy, saying there's these people who say this, mimicking all these false voices, it might be that that's not a big deal. I don't know.

Reuben Adams  57:35  
So if we came back in 10 years, and this was all fine. What do you think's the most likely explanation, the most likely way in which this could just be a big fuss over nothing?

Lewis Griffin  57:47  
It could be that this discussion is sort of founded on a false assumption about how influential stuff that people read is, and actually things are much more driven by events and personal interactions. I'm not saying that will be the explanation. But if it did turn out nothing happened, you know, that could be the reason.

Reuben Adams  58:15  
Maybe the opposite question. Could you imagine this going so far that some groups end up sort of isolated in their own bubble of what they think is going on in the world, that's just completely disconnected from what everybody else thinks is going on?

Lewis Griffin  58:28  
I think with all these threats with AI, the point is that AI is also the best tool for fixing things. So not only will, you know, AI fragment us into our different bubbles, it also could be, more obviously, good at breaking bubbles, and reconnecting people and things like that.

So kind of, I think the way it will go ultimately, is it probably, things might look neither worse nor better, because there's bad forces, and then there's counteracting forces from other AI things.

You know, AI should be very good at discovering bits of the network which are becoming isolated from each other and inventing ways to give them common ground and things of shared interest to try and pull things back together. And that doesn't sound very nice either.

But, you know, for every kind of negative possibility with AI, there are also really positive things it could do. And probably it will end up that the positive stuff ends up with slightly greater mass than the negative stuff.

That's my guess, generally. None of these things, I think, are so clearly kind of heading towards some kind of catastrophic failure where I can't see any fix for it.

The disinformation thing, you know, AI should also become a really great tool in helping us see through the deluge of information coming, and help us organise it and spot things which don't have the hallmark of truth and things like that.

Reuben Adams  1:00:18  
Are you imagining sort of personalised models that help you navigate social media?

Lewis Griffin  1:00:24  
You can imagine something like that, yeah.

Reuben Adams  1:00:32  
Wouldn't people prefer a model that tells them what they believe is true, and what they don't believe is false?

Lewis Griffin  1:00:37  
Yeah, you can see that it could sort of embody their prejudices even more so. But then the potential is there in the technology, you know, people are going to have to be imaginative about ways of using it to counter the negative possibilities, but there is the potential there.

Reuben Adams  1:01:00  
I think one way of seeing this, which does sound kind of dystopic is that previously, we had these sort of variables, which are public opinion on various different things.

And we sort of didn't really understand how they developed. We could influence them by sort of word of mouth and like the press and publishing things that way. But we didn't really get so far, we couldn't make radical changes in populations' beliefs.

And then along come these AI technologies that help us do this kind of thing of doing mass influence operations and massively swaying populations in the directions that a particular state or actor wants.

And then the counter to this is that you have, oh, no, we have the good guys with their AI models, and they start pushing back against all of these influence operations.

And so now you have these two massive actors or more, all pushing strongly on the beliefs and values of the population. This doesn't sound like a good new equilibrium. This sounds like an environment where belief formation has been completely overtaken by influence operations on both sides.

Lewis Griffin  1:02:03  
Yeah, I mean, we've previously approached the world, you know, or I certainly would approach the world, with a reasonably sort of optimistic view on the stuff that I'm reading, that, you know, it might be merely wrong, but it's not malevolent, most of it.

And, you know, I'm aware that there are things which are deliberately trying to get me to believe false things. But it's quite different from, you know, this vision you're painting, where the truthful bit becomes a kind of drop in the information of this manipulation.

Again, if we ever got to that, it would provoke such an enormous behaviour change in us, that who knows how it would play out?

Reuben Adams  1:03:06  
Hopefully, we've got defence mechanisms against that, that are latent at the moment. So maybe this is speculating a bit too far. And we can pull back a little bit and think, how do you personally think that you could be affected by influence operations in the next few years? What kind of points of attack do you personally have?

Lewis Griffin  1:03:27  
Um, it's a good question. I mean, you know, the current events in Israel Palestine, as I read about those, I'm thinking about these issues. I'm wondering how I'm being manipulated in what I read at the moment.

I'm not saying I am, but one is certainly reading radically different accounts of the same events, both sort of over historical periods and very local periods. What happened at that hospital the night before?

And the fact that I know that the people involved could be, I mean, clearly, you know, the hospital event, one of them isn't true. But I also know that the people involved know about telling lies, and could be, I don't think they are using technology, but could be using technology.

And so I'm already feeling a kind of dizziness. Really, you know, resolving what is going on by reading the things I'm reading doesn't seem to give me much of a way out here, and I fall back on the sort of news sources that I would normally trust and think, well, I'm gonna listen to those, and they describe their evidence and their reasoning.

But I'm also being onslaughted by a whole bunch of voices which are trying to undermine the credibility of those. I'm thinking particularly about the BBC here. And I kind of resist believing those, but I also feel that it's having a slight effect on me.

So the hospital one was very interesting, you know, and that moved so quickly. This claim that 470 people have been killed, and it was a rocket, people up in arms, and then the next day these pictures emerge, and then various experts come in and say, well, this picture doesn't look like it's accounted for by a rocket, it looks just like a fire.

And so then there were these kind of tos and fros on it. And for me, the evidence seems sort of to allow me to make up my mind. For other people, it didn't seem to make any difference.

And in that particular instance, I was thinking, well, actually, part of what's made up my mind is I've believed this photographic evidence here of this car park, where it doesn't look like a bomb, there isn't really a crater, it looks like a fire. But maybe that was fake.

These people are pretty clever. If I was anti them, as a starting point, I would very easily believe "oh, it's one of these pictures they've made." So this is something I kind of realised quite a while back, that a lot of the consequence of the possibility of deep fakes and disinformation may not be the lies, but the undermining of certainty about the truth.

Truth decay is the nice term that has been coined for this, that it undermines the status of evidence.

Reuben Adams  1:03:39  
I think this is also called the liar's dividend, that it sort of levels the playing field between truth tellers and liars.

Lewis Griffin  1:07:10  
Yeah. And that's, I think that's really concerning.

Reuben Adams  1:07:14  
Could you imagine being influenced by bot accounts on Twitter?

Lewis Griffin  1:07:19  
Well, for sure! I've, you know, I've read some. Yeah, for sure. I could imagine maybe I have been.

Reuben Adams  1:07:30  
So you started out with this worry that language models might be used to test out propaganda campaigns. And to sort of show that this is a legitimate worry, you did some experiments to test for evidence of whether language models can actually be persuaded in a human like way. Is this just giving people ideas?

Lewis Griffin  1:07:51  
Yeah, that's a fair point to raise. I mean, me and my co-authors did consider this. And we've taken the normal kind of route, to say that if the possibility has occurred to us, it will occur to other people, so, you know, the cat will come out of the bag one way or another. And it's better that it comes out before we see the harm from it.

So I don't feel that I've kind of pushed on this harm further, faster than it would have gone. It's better that people who might need to defend against it or be aware of the threat become aware of it sooner rather than later.

Reuben Adams  1:08:42  
In order to tell whether this is a legitimate threat, I think we would need even more evidence to tell whether they can be persuaded.

Lewis Griffin  1:08:48  
I haven't published, you know, a recipe for a bomb. I've published an initial study, which on its own is not sufficient to prove this case, it's just suggestive of a possibility. And it invites follow up research to substantiate it. It's not a blueprint for a bomb.

Reuben Adams  1:09:10  
But maybe we would only know whether it's a legitimate threat once we came almost to the point where we had a blueprint.

Lewis Griffin  1:09:18  
I guess then one needs to keep asking the question at each stage of that research about, you know, are you making it better or worse by continuing to research it. I think my assumption is that adversarial nations who decide this sounds like a thing they want to start doing will be able to progress it faster in their secret laboratories than, you know, the underfunded efforts of academics.

Reuben Adams  1:10:07  
Nevertheless, I suppose the audience really for a paper like this is partly academics but partly governments, as a kind of warning. So should this research be shared just with whatever government you trust?

Lewis Griffin  1:10:24  
I mean, that could have been the way we did it. I mean, I did the research sort of in parallel to doing a piece for the government about this possibility. So I could have just reported it up to the people who could do something about it. I don't think that would be a very reliable way of getting the message out there.

Reuben Adams  1:10:53  
You think they wouldn't listen?

Lewis Griffin  1:10:55  
No, I don't think that. It's not that they wouldn't listen, but they've got a lot of different things to worry about. Because it's not as clear-cut a case as, you know, "we could build this big bomb", me reporting about some possibility doesn't mean it would go any further.

Reuben Adams  1:11:13  
Okay, for the final round, let's do a few predictions. Or best guesses, let's call them. Let's start with what's probably an easy one. What do you guess is the likelihood that within 10 years, some state actor will have used, er

Lewis Griffin  1:11:32  
achieved some significant influence effect making use of AI, which they couldn't have achieved the same effect without AI?

Reuben Adams  1:11:45  
Yeah, perfect. But so, like, an influence operation that's been heavily influenced by, or that's heavily dependent on

Lewis Griffin  1:11:51  
heavily enhanced.

Reuben Adams  1:11:52  
But specifically on language models.

Lewis Griffin  1:11:54  
I would be very surprised to not see that in the next 10 years, I'd be astounded.

Reuben Adams  1:12:03  
Right, even with just using language models? Suppose all the image generation and deepfakes were out the window, and I could just use language.

Lewis Griffin  1:12:10  
Yeah, even that I'd be astounded if it didn't pan out at all.

Reuben Adams  1:12:16  
Right. Yeah. And what kind of scale of influence do you think would be possible?

Lewis Griffin  1:12:20  
Well, you know, election-pivotal. I mean, influence on pivotal moments is the obvious thing. Elections, referenda, things like that. And that's where I'd expect to see it, because that's where it makes sense to sort of apply the effort.

The sort of longer-term, attitude-shaping aspect of influence is, you know, you've got to keep doing it for years and hope for a payoff, sort of thing. But I think it's the nature of the way governments work: they're more interested in a sharp point of impact at a particular moment. So that's where I'd expect to see it, in some poll manipulation and some election manipulation.

Reuben Adams  1:13:09  
You think this is sufficient to flip an election?

Lewis Griffin  1:13:13  
Yeah, yeah, of course. Yeah. Yeah, I do, yeah.

Reuben Adams  1:13:18  
Okay. Do you think it's more likely that the attacker has the advantage, ultimately, that we're going to be sort of battered around in social media and online by influence operations?

Or do you think the good guys in inverted commas are going to put up a sufficient fight to sort of bring us back to roughly where we are now?

Lewis Griffin  1:13:37  
I can't predict how this sort of future history will run. But, you know, my inclination would be to be optimistic about that, and assume that the motivations to give us a fix, to prevent us sort of spiralling into a cesspool of disinformation, will be so great that we will develop things that counter it, prevent it, detect it, or we will somehow shift to a different paradigm, because it's just unacceptable, sort of thing. How this will play out, I don't know, but my predisposition, because I am an optimist, will be to be optimistic about it.

Reuben Adams  1:14:21  
Okay. All right. I'll just do a few quick questions at the end, if you're up for that? So can humans detect deepfaked audio?

Lewis Griffin  1:14:35  
I think not. I think not, I mean, we tested voice generation, which was not state of the art. And if it was short clips, then people were very bad at detecting it. I know that the quality of it has improved dramatically and things I've listened to even of a reasonable length, I thought I wouldn't be able to detect that.

So, yeah, I'm sure. I think we're already there: if you use the best tools and take some care, it's producing perfect stuff. I think within fake audio, it's worth distinguishing between faking, you know, a human voice versus faking a voice that you know,

Reuben Adams  1:15:38  
A specific person.

Lewis Griffin  1:15:39  
a specific person, versus faking a voice you know incredibly well, like a family member. I think all those get progressively more difficult. And I don't know, I'm really interested to know whether, you know, faking a child's or parent's voice is possible. I'd suspect that it's a bit harder, but I don't know.

Reuben Adams  1:16:00  
Mm hmm. So you studied philosophy alongside mathematics for your undergrad. Do you think philosophy and AI make good bedfellows? Or does philosophy just get in the way?

Lewis Griffin  1:16:11  
No, for sure, they make good bedfellows. And, you know, these suddenly really live questions we're having about navigating what truth is, these were dry things that I had to write essays about.

And suddenly now they're really exciting, important topics. Yeah, they're great bedfellows. And I never expected that those things that we were thinking about in philosophy of mind and epistemology and things would become such, you know, impactful issues, rather than just conversation topics. So they are.

Reuben Adams  1:17:06  
I can see that there's an overlap now in the topic. But what's philosophy actually done to push forward this conversation?

Lewis Griffin  1:17:14  
Well, I don't really care about whether it's done that. It's a profession; you know, charting the impact of things like that all the way through to some product that people use is extremely hard, but it's just been part of our culture.

And that's affected belief across centuries. You know, people have read Wittgenstein, and Wittgenstein's thoughts have leaked into the world and appear in all sorts of ways, and maybe that influences some thinker later on. The chain back to, you know, the person who had those ideas, it's not clear, but it's just part of our cultural development. You can't directly

Reuben Adams  1:18:00  
We're swimming in it.

Lewis Griffin  1:18:01  
there's not a citation, you know, it's not like that.

Reuben Adams  1:18:05  
Right, I see. And what about these thought experiments around sort of consciousness and, like, you've got the Chinese Room thought experiment.

Lewis Griffin  1:18:12  
I think, again, those are really, you know, suddenly they're really meaningful, real scenarios. The Chinese Room, you know, we've got this thing, and this thing is like a Chinese Room, a large language model. You know, the scenario in that example hasn't really changed, and everything about it is still kind of correct.

But it's just much more immediate and relevant and fresher now, because you look inside this thing, you chat away and think, "Bloody hell, it seems like it's really understanding things. What's going on inside? Oh, matrix multiplication."

And it's exactly the same thing that is raised in that problem. And it wasn't resolved when the problem was raised. I don't know if anyone's really ever resolved it through all the conversations that have been had since, but they're raising the right puzzles, aren't they?

Reuben Adams  1:19:17  
Yeah for sure. We have these like thought experiments ready to go now that we actually have the systems.

Lewis Griffin  1:19:22  
Yeah. Yeah. I mean, it doesn't provide you with a guide to how to answer them.

Reuben Adams  1:19:29  
No, no. And I think a lot of times people have sort of been led down the garden path with philosophy of AI, and sort of made conclusions that just turn out to be wrong, and they just get falsified by empirical evidence.

Lewis Griffin  1:19:41  
Yeah, I think that's right. That the existence of these technologies may. Yeah, yeah, maybe.

Reuben Adams  1:19:54  
So you say you are an optimist. Are you optimistic about AI in general?

Lewis Griffin  1:20:01  
Oh, I don't know!

Reuben Adams  1:20:05  
Are you excited for the future?

Lewis Griffin  1:20:07  
Oh, I don't know about that either. You know, generally I am optimistic, but with this AI thing, I have many more kind of concerns than I've ever had for anything before. But maybe it's just being a bit older and a bit grumpy, thinking "well, things weren't what they used to be."

But I am concerned about its impact. I'm not particularly concerned about this influence thing. I kind of think it will all muddle through, and there'll be problems and solutions and things like that.

And I also do this work on crime with AI. And again, I think we'll kind of muddle through that. And there'll be problems and solutions. What I am worried about is its kind of effect on us as humans, the sort of reduction in status: for the things that we thought we were good at, that gave us a purpose, there are these things which are better than us.

And we don't have this purpose, and what will that do to us? And what would it be like growing up in this world thinking, what am I going to do with my life when there are these things? That worries me. And that's what makes me think, I hope this is a good idea, because I'm not convinced it is.

Reuben Adams  1:21:38  
Well, it sounds like we've got a lot to talk about if you come on the show again.

Lewis Griffin  1:21:41  

Reuben Adams  1:21:42  
My guest today has been Professor Lewis Griffin, thank you for coming on.

Lewis Griffin  1:21:45  
Thank you. Pleasure.

Reuben Adams  1:21:49  
You've been listening to Steering AI from the Department of Computer Science at UCL. Subscribe for future episodes, join the conversation on Twitter with the hashtag steering AI, and follow us on the account @UCLCS for announcements of future guests and your chance to suggest questions. Thank you for listening, and bye for now.