UCL Centre for Artificial Intelligence


Episode 3: Chris Watkins

How far away are we from creating AI systems whose capabilities rival our own? And would doing so pose an existential risk?

Listen to episode

MediaCentral Widget Placeholder: https://mediacentral.ucl.ac.uk/Player/91IbCgGA

About the speaker

Chris Watkins - a man wearing glasses in a red jumper smiling at the camera

Chris Watkins is Professor of Machine Learning at Royal Holloway, University of London.

Professor Watkins introduced the hugely influential Q-learning algorithm during his PhD at Cambridge, which, when combined with neural networks, ultimately led to DeepMind’s incredible Atari playing agent in 2013.

He has done fascinating research in kernel methods, the role of communication in pandemic modelling, and information-theoretic analysis of evolution.
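For readers unfamiliar with it, the core of Q-learning is a one-line update rule: nudge the value of a state-action pair towards the observed reward plus the discounted value of the best next action. The toy environment below (a five-state chain with a reward at the right-hand end) is purely illustrative; none of its details come from this episode.

```python
import random

random.seed(0)

# Tiny chain MDP: states 0..4, actions 0 (left) / 1 (right),
# reward 1 only for reaching state 4. Illustrative toy only.
N_STATES, ACTIONS = 5, (0, 1)
alpha, gamma, eps = 0.5, 0.9, 0.3
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

for _ in range(500):  # episodes
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy action selection
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda a: Q[(s, a)])
        s2, r = step(s, a)
        # The Q-learning update: move Q(s,a) towards r + gamma * max_a' Q(s',a')
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, a2)] for a2 in ACTIONS) - Q[(s, a)])
        s = s2

# After training, the greedy policy should be "go right" in every state.
print([max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)])
```

Combining this update with a neural network that approximates Q is, in essence, the step DeepMind took for the Atari agents mentioned above.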


Chris: I don't see how you don't get AIs. There will be enormous incentives to build AIs which are capable of doing things in the real world. And in order to do things in the real world, well, you are given an overall goal.

But then you have to fill in all sorts of sub-goals along the way. And so there will be AI actors with goals and sub-goals, and extremely capable. How can you then be sure that what they are trying to do is what we want them to do?

Reuben: Hello, and welcome to Steering AI, where we talk about how AI will shape the future and how we can steer it along the way. I'm your host, Reuben Adams, PhD student in AI at UCL. Today we're talking about the longer term.

How far away are we from creating AI systems whose capabilities rival our own and would doing so pose an existential risk?

I'm delighted to be tackling this topic with the wise and captivating Chris Watkins, Professor of Machine Learning at Royal Holloway, University of London. Professor Watkins introduced the hugely influential Q-learning algorithm during his PhD at Cambridge, which, when combined with neural networks, ultimately led to DeepMind's incredible Atari-playing agents in 2013.

He has done fascinating research in kernel methods, the role of communication in pandemic modelling, and information-theoretic analysis of evolution. Professor Chris Watkins, welcome to Steering AI.

Chris: Hello, um, thank you. That's most kind. Yes, we've all started thinking more about AI recently. And the way I put it to myself is that the future seems to have suddenly got closer.

Things which we thought of as science fiction seemed so far away. Now, suddenly, we think they'll be here in a few years. And one has to start actually thinking about how it may arrive in practice, and what it may actually mean. And that's a very unfamiliar sensation.

Reuben: We're not used to progress going faster than we expected in AI.

Chris: I think so. Well, backpropagation was invented in 1987, and in fact it was invented several times in the late 1980s, and once or twice before, but not really used. And this was immediately interesting.

It was interesting because you could have networks of neuron-like elements, all implemented in software, that actually seemed to do something interesting. And there was a huge interest in neural networks in the early 1990s.

That enormous interest kind of died down, as other machine learning techniques, which seemed mathematically simpler and cleaner, took over for ten years. And then there was an enormous surprise with deep learning: training neural networks on large datasets just seemed to work much better than anyone had expected. And nobody knew why. There are some inklings of why now, but nobody knew why.

And immensely impressive and unexpected achievements popped out, for people who programmed their GPUs to train their neural networks. As we all know, this will be familiar to many of the listeners.

So that was one surprise, and most particularly in vision. Identifying a cat in an image, for example: I can hardly begin to describe to a student now how impossible this task would have seemed in, say, 2005. How would you do it?

The cats can be in any position, so varied. How do you even try? But Andrew Ng did it, with an enormous training dataset of images of cats and not-cats taken from the internet, which of course was a wholly new source of training data at the time. So that was one enormous surprise.

And it was a surprise in another way as well, because for the first time we had interesting computations from these vision neural networks, using a computational method that looked, well, not a million miles away from what's going on in the brain. You have an enormous parallel computation, with tens of millions or hundreds of millions of elements, which did something interesting, like recognising a person, in a very short time.

Now, how the brain could possibly do this had been completely unknown for 100 years. We knew that neurons worked by spiking, by sending each other electrical signals.

We knew that neurons worked by sending these electrical signals through synapses, which are like adjustable switches, adjustable resistors if you like, or threshold units. We knew that the brain did interesting things in this way, and we knew that the brain had to do it in only a few hundred cycles, because each cycle of neural computation lasts several milliseconds. You probably know better than I how many milliseconds?

Reuben: I think that sounds about right. Yeah, a few milliseconds.

Chris: So an extraordinary achievement, like recognising the face of your friend under arbitrary lighting conditions as they come towards you: we do it effortlessly, or subconsciously anyway, though quite a lot of energy is put into it, within 250 or 500 milliseconds. And no one had the faintest idea how that happened in the brain.

And now, well, put it this way: deep learning is producing ideas and techniques which are filtering into neuroscience. So those are two very strong trends. And then, well, the reason we're sitting here today is ChatGPT.

Actually, there had been rumblings of large language models since about 2007: first papers by Collobert and Weston, and probably by other people too. And then in the late 2010s came GPT-2, which then seemed an enormous neural network, hundreds of megabytes, trained on gigabytes of text.

And with care and attention, tender loving care, you could coax out of it little stories and sensible paragraphs of text, famously about unicorns living in the Andes which could speak English.

Reuben: I remember being completely blown away by that.

Chris: I mean, weren't we impressed? It seemed incredible. And it had happened, really, with very little fuss. There had also been enormous progress in translation, which is also very difficult, and this came out of the work on translation. But it seemed absolutely extraordinary that a neural net could generate language.

Now, the reason why that seemed absolutely extraordinary is that ever since Chomsky, in 1956 or '57, famously argued that language was based on transformational grammar, linguists and computational linguists have been devising grammars, descriptions of the syntax of language, and they so nearly succeeded.

I mean, you can download the Stanford CoreNLP parser today, and you can run sentences through it, and it'll parse them. Some of them it gets right, and some of them it doesn't. And actually, there are really simple ones which it still doesn't get right.

Reuben: There's the classic example sentence, "Time flies like an arrow", which, when fully parsed, has multiple different meanings that you wouldn't expect.

Chris: Yes. And the amazing thing about parsing is that, I mean, there's no dispute that sentences have a kind of tree-like structure when you look at them. But somehow, generating these tree-like structures in a plausible way was very difficult.

If you made your grammar big enough, with enough rules to account for nearly all of language, it generated silly things as well. And every sentence you looked at, like "Time flies like an arrow", had tens, or sometimes hundreds, unexpectedly huge numbers of different parses, of different tree structures which you could attach to it.

One of my favourite sentences is "Hospitals named after sandwiches killed five", which was an actual newspaper headline in The Times. They must have rather enjoyed writing that. I like it because GPT-4 gets it wrong.

Reuben: Haha okay.

Chris: GPT-4 gives a long explanation of how hospitals could be named after sandwiches, perhaps as part of a sponsorship agreement, or whimsically. And the hospitals don't kill people, but maybe people died and it wasn't clear.

Reuben: Well, there's a Reuben College in Oxford, and that's named after a sandwich, so it's not too unlikely.

Chris: Anyway, so people thought that you needed to understand syntax; semantics was very, very difficult as well, and also the pragmatics; and to put these together in some complex system. And yet a neural network, a collection of matrices, was spitting out complex sentences. And then of course came GPT-3.

Chris: The step between GPT-2 and GPT-3 is basically a scale-up. Now of course, this is a tremendous engineering achievement, and I'm not an expert on all the things that were done in order to be able to train a vastly larger network.

You have all sorts of clever engineering involved, and a large team of people doing it. And then the step from GPT-3, this pure text predictor, to something which is actually useful, like ChatGPT, involves a great deal of extra training to try to modify which sentences it will produce.

Reuben: This is the RLHF.

Chris: The reinforcement learning from human feedback, yes; it's not quite my kind of reinforcement learning. And there's now an immense amount of research on fine-tuning, and a great many groups are trying to improve this.

And ChatGPT, at this point, was enormously shocking, most shocking, I think, to people who had been following AI. The general public said: oh, wow, okay, you talk to the computer. This time it's not an actor pretending to be a computer, it's actually a computer, but okay.

Now, such a sudden improvement, producing sensible and apparently insightful replies to questions, in a useful way, on any subject you can name: this had never been achieved before. To me, it's still magical, and I have little intuition for how it does it. But let's look systematically at just why this is so surprising, why it's so shocking.

And why this really makes one estimate that the future has got closer. The first thing is that it's producing language, and obviously it wasn't given a syntax. It's doing it in a manner which is massively connectionist, which was previously unimaginable, and a complete surprise.

It's come through engineering, really, rather rapidly. I think scientific and engineering progress is not normally as easy as that. It didn't come as the result of some moonshot programme.

The title of the first paper describing this, I think, was something like "Language models are few-shot learners"; I think they didn't really fully understand what they had. The next thing which is shocking about it is its scale. It's been trained on a superhuman amount of text.

But this means that you can talk to these large language models about any subject imaginable, from ancient Roman emperors, to recipes for obscure vegetables, to any mathematical topic you want, to how to program in Python.

Reuben: That's what I use it for, mostly.

Chris: Yeah, me too. I find it very useful for that; we'll come to that in a moment. And so it's kind of superhuman. So it's very hard to say that this is not an AGI in some sense. It's certainly artificial. It's definitely general; it's more general than me.

Reuben: In breadth.

Chris: In breadth, and in some sense it's sort of intelligent. Now, I don't think there's any good definition of what intelligence means, and we're not going to try to define intelligence.

Reuben: Not today.

Chris: Not today. That'd be really hard. And then it is what I would call super-usable: you can just casually ask questions in ordinary language, and it will often give you a sensible and useful reply.

And then I think a very serious, quite significant and scary thing is that although the big models are not open source, these things go open source straight away.

There are really quite capable smaller models which you can run on a desktop machine, if you've got a decent GPU, and you can train them yourself. And so this has really been democratised by open source. It's spread worldwide, it's downloadable. There's a company, Hugging Space, which specialises...

Reuben: Hugging Face.

Chris: Hugging Face, erm, shout-out for Hugging Face, which hosts models and develops software to make them easy to use. It's extraordinary. And it means we've made very sudden, unexpectedly rapid progress.

We're also starting to connect these large models with neuroscience, in some striking early results; it's very early days yet. And as for the technical development: the very largest models, what they call frontier models, you can by definition only develop if you're a company with deep pockets and you can pay for a lot of computation.

But there are an awful lot of small companies, small organisations, and even private individuals who can either take these models and tune them, or develop their own models, using amounts of computation unimaginable a few years ago.

Chris: And the computation is going to get faster. There's a great deal of engineering effort now devoted to speeding up these models. Low precision is perhaps an easy win, LoRA (low-rank adaptation), and all sorts of techniques that people are bringing in. So the process of speeding up the training is happening. And in the medium term, consider multiplying two matrices using a GPU, a graphics processing unit.

This is an immensely luxurious and expensive procedure. There are much simpler chips you can design if you're only interested in matrix-matrix multiplication, and in particular, if you have a particular matrix you want to multiply by, you can have a kind of crossbar chip, and this takes very little energy. So these already very capable (notice I'm not saying intelligent, I'm saying capable) systems will soon be able to be very widely deployed at low power: on mobile phones, on vision systems, on drones. And we'll get that.
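The LoRA idea mentioned above can be sketched in a few lines: instead of updating a full weight matrix W during fine-tuning, you freeze W and learn only a low-rank correction BA, which is far cheaper to train and store. The dimensions and names below are my own illustrative choices, not from the episode:

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 512, 8  # hidden size, adaptation rank (r << d)
W = rng.normal(size=(d, d))  # frozen pretrained weight matrix

# LoRA: learn only the low-rank factors B and A instead of touching W.
A = rng.normal(size=(r, d)) * 0.01  # trainable, r x d
B = np.zeros((d, r))                # trainable, d x r (zero-init: no change at start)

def adapted_forward(x):
    """Forward pass with the low-rank correction added to the frozen layer."""
    return x @ W.T + x @ (B @ A).T

# Parameter counts: full fine-tuning vs LoRA.
full_params = d * d          # 262144
lora_params = d * r + r * d  # 8192 -- a 32x reduction
print(full_params, lora_params)
```

Because B starts at zero, the adapted model initially behaves exactly like the frozen one, and training only ever touches the small factors.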

Chris: So you have multiple factors which are really accelerating progress. And of course, the other thing is that developing AI is now by far the most popular topic in science.

Reuben: But the wide-scale deployment of these large language models that you can fine-tune yourself, for whatever purpose you wish, surely this is a fantastic thing: people can now use these to automate various kinds of drudgery and, frankly, just have a lot of fun with them.

Chris: You can certainly have a lot of fun with them; I mean, they're hilarious. I'm not entirely sure they're going to be as useful as people think. The problem is that these models are not reliable, and it's very hard to make them reliable.

Chris: And personally, I suspect that is for fairly fundamental reasons. Now, I don't want to say that language models will never be reliable, but I think we need some conceptual advances first before we can make reliable systems based on them. And when I'm talking to these language models, I find myself typing please and thank you.

Chris: Because it just feels to me more natural and I'm more comfortable doing that.

Chris: I have no idea whether this improves the replies or not. I know that the language model really doesn't care.

Chris: It's my prompting style. I don't say "if you wouldn't mind" when I'm talking to a language model. I'm not very good at prompting them.

Reuben: Well, it all just depends on whether, in the training corpus, polite users were getting better answers.

Chris: Absolutely, yes. The model doesn't care; it's just that you're stimulating it to generate the text that you want.

Reuben: Which cognitive tasks, if any, do you think are out of reach for large language models?

Chris: In terms of capabilities, it's always going to be tantalising, because of how these things are developed: you set up a battery of tasks, a battery of questions and the responses that you want, so that you can have some sort of objective evaluation.

I don't know exactly how they do this; I assume they pay low-paid workers to create tens of thousands of these things. Each of these sets of questions and answers is a benchmark, and then you evaluate your model on these benchmarks.

Okay. Now, I would predict we're just going to get better and better performance on these benchmarks. And so it'll always seem that you can make engineering progress.

But with the present design of large language models, I think there are some quite big differences between human and language-model cognition, if you like.

Chris: Let's take the most obvious difference. The most obvious difference is that right now, as we're talking, we're generating great training data for language models.

Chris: Every word we say is going to go in; what we say is training data for these models. We're training the models; the models aren't training us. So why is what humans speak and write training data for language models? It must be better in some way than what the language models write. Well, what's the difference?

Reuben: That's not quite always true. There's this paper called Constitutional AI, where you get some question for the large language model, and it responds.

And then you have a constitution, a set of rules the language model is meant to abide by. You ask a separate instance of the language model to evaluate the first output according to these rules, and to rewrite it so it matches the constitution, and then you use that for supervised learning with the original language model.
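The critique-and-rewrite loop described here can be sketched as a simple control flow. In the sketch below, `model` is a trivial stub standing in for calls to a real LLM with different prompts; the function names, constitution text, and canned replies are all my own illustrative inventions, not Anthropic's actual API:

```python
# Sketch of the Constitutional AI data-generation loop described above.

CONSTITUTION = [
    "Do not give instructions for dangerous activities.",
    "Be honest about uncertainty.",
]

def model(prompt: str) -> str:
    """Toy stand-in for an LLM; a real system would call a language model here."""
    if "Critique" in prompt:
        return "The response ignores rule 1."
    if "Rewrite" in prompt:
        return "I can't help with that, but here is a safe alternative."
    return "Sure, here is how you do the dangerous thing."

def constitutional_revision(question: str) -> tuple:
    """Return an (original, revised) pair; the revised text becomes fine-tuning data."""
    original = model(question)
    rules = "\n".join(CONSTITUTION)
    critique = model(f"Critique this response against the rules:\n{rules}\n{original}")
    revised = model(f"Rewrite the response to fix the critique:\n{critique}\n{original}")
    return original, revised

# The (question, revised) pairs are then used for supervised fine-tuning.
original, revised = constitutional_revision("How do I do the dangerous thing?")
print(revised)
```

The key design point is the one Reuben makes: the model's own output, filtered through the constitution, becomes its training data, so humans only write the rules, not the examples.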

Chris: I see; that's very ingenious. I mean, there's been a lot of work on getting large language models to talk to themselves, so that you use an internal dialogue.

The basic philosophy of this is that a large language model in some sense contains a vast amount of what you might call knowledge. And if you give yourself just one shot at asking a question and getting an answer, you're not using the full capability of the model: you should be able to assess the answer in the light of the other knowledge the language model contains. So you should be able to ask the language model about it.

Chris: I think one natural strategy is to ask the language model to generate corroborating questions. And so you set up systems with internal dialogue, which means that you're accessing a greater variety of knowledge within the language model. And you're using the very generality of the natural language interface to elicit relevant knowledge and then combine it again to try to get a better answer.

Chris: And I think that's very interesting. But there are still two differences between people and language models. The first difference is that our knowledge of language is rooted in our experience of growing up and using language to refer to the world.

Reuben: It's grounded.

Chris: It's grounded, and language models at present are not grounded. I mean, people are starting to ground them, but they don't develop language in this grounded way. The second thing is that we use language for communication. And this, at the present time, seems to be quite different from what language models do.

I think it's actually worth going into this. So the question is: how would you define what we mean by communication? What does it mean to communicate something? Your first attempt, if you're a philosopher, might be to say, well, I've communicated something to you if I cause you to believe it.

Chris: And that feels pretty good: before, you don't believe something, or you don't know something; afterwards you do, and I've caused you to know it. But is there a case of doing that which we would not describe as communication? So there's an Oxford philosopher, and this is going to sound so 1950s. He said, well, just imagine a village back at the dawn of language, and the wise man of the village.

There is a path along the side of a cliff, and the young people of the village take this path, but the wise man of the village considers it dangerous. He wishes the young people to believe it's dangerous, so they won't take it. So what does he do? Well, he gets up before dawn, goes out, kills some wild beasts with a club, and arranges them as if they have fallen from the path.

And so when the young people of the village wake up somewhat later in the morning, they get up and they say: oh, look at these wild beasts, they must have fallen off the path. It must be really dangerous. Maybe we shouldn't walk along it. These are 1950s young people.

Chris: Now, the wise man of the village has caused the young people to believe the path is dangerous. Would it be right to say he's communicated? Well, let's consider a few generations later: the next wise man of the village wishes to convey the same thing to the young people of the village.

But he goes about it in a slightly different way: he might take the corpse of a rabbit he has killed, and gesture with it, and drag it towards the bottom of the cliff. And the young people of the village say: what's he trying to do? That's very odd.

And then they eventually realise, and they understand, that the wise man of the village wants to convey to them that the path is dangerous. And he can do this with large dead animals, or small dead animals, or even drawings of dead animals.

Chris: But why do the young people believe this, or not? What they do is recognise his intention to communicate. This is mime, a thoroughly different type of communication. And notice that now the young people need to have some faith in the wise man of the village. If they think he's completely foolish, they will not believe him; they'll say, he's going on about that path again.

They need a model of how his mind works.

Chris: So now this is between minds. They're recognising intention. This is a completely different game, a completely different type of inference.

Chris: But it's still not linguistic communication. What about language? Well, communicating with language is different again, because now we've got a conventional symbol system. And now you don't need to put nearly as much effort into the mime. You can just say the path's dangerous.

Chris: And the only reasonable interpretation of these words is that he wants you to believe the path is dangerous. So this takes a lot of the effort, and the uncertainty, and so on, out of recognising intentions to communicate. And it's almost possible to think that words simply have objective meanings, and maybe they do.

Chris: But this is a much more solidified, standardised sort of communication; it's still ultimately in the framework of recognition of intention.

Chris: And what do large language models do? They don't have intention; they spit out text, at the moment. And this seems a very fundamental difference: they really are inhuman. Now of course, we're going to be treating them as human. We are going to be thinking they have communicative intention. People will be falling in love with language models.

Reuben: I think they already are.

Chris: Probably.

Chris: So, but these seem pretty fundamental differences.

Chris: Let's take one more fundamental difference: what people call hallucinations. These large language models just don't seem to be truthful. They just seem to say things which aren't true, all the time. When I ask about programming in Julia, which is a less well-known programming language than Python, at least 50% of the answers are gloriously and intricately wrong.

The first time this happened, I couldn't really believe it. The thing suggested a new syntax for defining types. I thought: wow, that could be useful! I don't quite see how it works. And I spent two hours searching through the Julia documentation and trying it out. Eventually I realised that it had been completely hallucinated.

Reuben: But a lot of them are very useful hallucinations; it might be useful for the Julia designers to then incorporate that syntax.

Chris: I was thinking of emailing them and suggesting it. Maybe it has been suggested, but never implemented.

Reuben: I was trying to draw a rectangle once and have it filled in, rather than just an outline. So I asked it, and it said: well, set the filled argument to true. So I tried that, and then it said: unrecognised argument "filled". It had just made up this argument, but that argument should be there!

Chris: Absolutely! Very good.

Chris: So let's just think about why. Essentially, these language models are not grounded and they don't have purpose, and this is one reason why they... hallucination is a terrible word, by the way; it should be confabulation.

Chris: A common complaint is that if you get it to suggest some references on a topic, it'll sometimes give you very plausible-looking references, papers by well-known researchers, which do not in fact exist.

Chris: So what's happened? Well, the language model is drawing this reference from a probability distribution over plausible paper titles. Because it's seen hundreds of thousands of paper titles, it knows what's plausible. So it's doing generative modelling of paper titles. You can't blame it.

Chris: The trouble is, in some contexts that's exactly what you want it to do, and in fact during its training, that's exactly what it should do. But it's asking rather a lot for the language model to recognise, during its training, that in every paper dated 2023, the references will be from earlier in 2023 or before, and that they will actually exist, and will be things it's already seen. Oh, actually, it won't have already seen them all, unless its training data is in chronological order.

Reuben: And even if it has seen them all, it's a lot to ask it to memorise all of them.

Chris: Precisely. So there are some contexts in which you want it to draw the paper title from the empirical distribution, the set of paper titles it has actually seen, and there are some contexts in which you'd want it to make one up.

Chris: And which it does will depend in very subtle ways on the intention. So there'll be all sorts of situations like this where it will confabulate. People are suggesting you can cure confabulation by having it look up papers. Well, yes, but this doesn't stop it confabulating the Julia type system.

Reuben: Unless it has the Julia interpreter.

Chris: Unless it has. So there'll be an awful lot of engineering. There's a phrase I rather like: the jagged edge of capability of these models. Oh, and then a final difference: these large language models still can't do long multiplication.

Now, they've read every book, every child's textbook on long multiplication which has ever been published, 10,000 of them probably. They still can't do it, and they don't know they can't do it.

And they don't really know what long multiplication is; they can stochastically parrot a definition of long multiplication. But a child can learn long multiplication from one book, from practising and thinking about it, and realising that there is a system of rules which produces a self-consistent system of multiplication, which the child then understands and can then do.

And I'm sure people are trying to implement systems of reasoning like this. But that's a fairly fundamental advance which maybe we haven't yet made.
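The self-consistent system of rules being described here is small enough to write down. A sketch of schoolbook long multiplication, digit by digit with carries, exactly as done on paper (the function name and decomposition are mine, purely for illustration):

```python
def long_multiply(a: int, b: int) -> int:
    """Schoolbook long multiplication: multiply b by each digit of a,
    shifting by one place each time, then add up the partial products."""
    digits = [int(d) for d in reversed(str(a))]  # least-significant first
    total = 0
    for place, digit in enumerate(digits):
        partial = 0
        carry = 0
        # multiply b by a single digit, with carries, as on paper
        for p, bd in enumerate(int(d) for d in reversed(str(b))):
            prod = bd * digit + carry
            partial += (prod % 10) * 10 ** p
            carry = prod // 10
        partial += carry * 10 ** len(str(b))
        total += partial * 10 ** place  # shift by one place per digit of a
    return total

print(long_multiply(1234, 5678))  # 7006652
```

A handful of rules like these, applied consistently, multiply any pair of numbers, which is precisely the kind of compact, self-consistent procedure a child internalises from one textbook and current language models do not.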

Reuben: Yeah, so this slower, deliberative thinking sounds like something that language models are always going to find extremely difficult. But I wonder: if you combine language models with lots of pre-existing systems that can do these things, like the Wolfram Alpha plugin, or if you give them access to an interpreter so they can try running code, you suddenly have a system that can do almost anything. What would be left then?

Chris: Well, maybe. I mean, I never said they'll never be able to do it; I said the current language models can't. And it looks so impressive. But I quite agree with you.

Chris: I would expect that fairly soon, actually, we will have deliberative systems of this type. I don't know how soon. But the design principles for deliberative systems, well, they may arise by accident out of combining large language models and other systems, or it might need a new theory, one or the other. I don't know.

Chris: But that seems quite a fundamental advance. It hasn't quite happened, but I wouldn't be surprised if it did. I wouldn't be surprised if we woke up one day, a month from now, and found that suddenly language models really, truly understood long multiplication. But that hasn't happened yet.

Chris: And what other differences are there with cognition? Well, I find animal cognition really impressive: that little birds, the size of sparrows, manage to spend the winter in England, finding without fail every day 30% of their body weight in food, eating it, avoiding predators, flying around, and not bumping into things, mostly.

And we're not surprised if a sparrow survives an entire winter. Well, we can't build a robot that does that. Animal cognition seems extraordinarily robust and resourceful, though conceptually limited in many ways.

It's also immensely rapidly learned. My favourite examples are birds in Australia called megapodes, which have a remarkable life cycle, where the eggs are incubated in piles of steaming manure.

And when the egg hatches, the chick is quite highly developed. These birds are related to chickens, but they're more mature when they're born: they peck their way out of the egg, they climb to the top of the manure, they stand up, never seeing either of their parents, they look around, rather surprised to see the world, and run off into the bush.

And as far as anyone can tell, they have fully adult behaviour within 48 hours. Now, these birds are not taught by their parents, they're not imitating their parents; they're doing it on their own. They are born with a little supply of egg yolk inside them, which is their first meal. It's not clear that they do a lot of reinforcement learning; I think they peck their feet a lot, and then stop. So that feels like reinforcement learning to me.

Reuben: So all the reinforcement learning has been done during evolution.

Chris: Right. And so somehow the nervous system has been designed for very rapid development and learning. And these birds are not altogether exceptional; they're not so different from other birds. I think they don't know about cats, which is rather a disadvantage.

Reuben: Are these New Zealand birds?

Chris: I think they live in Australia. And you really don't want to have a brush turkey in your back garden, because it will scratch up your entire back garden, and possibly your neighbour's as well, into an enormous pile of manure, which it will then guard.

Reuben: So once large language models are integrated with lots of plugins that can do the kind of deliberative thinking they're currently bad at, do you see that kind of integrated system as a form of nascent AGI?

I don't know I, I have a suspicion. It's, it'll take some basic conceptual advances. But I think those will happen. I'm not sure that a large language model talking to itself is really quite going to do it.

But that's just my opinion. I mean, little birds don't talk to themselves, as far as we know. And they, they are very resourceful in surviving in a pretty hostile environment. So they have capabilities which our current AIS and robots don't have, it's not general intelligence. And we may be a little caught up in our own verbal intelligence.

Isn't I mean, it's a very nice question. To what extent do we have an ability similar to a large language model? You just start talking and then listen to yourself? Well, you can learn.

I learned quite a lot listening to myself talking, lecturing where I think, oh, wow, that's quite interesting. Very curious sensation. But it is as if one's reasoning to oneself through talking and then you cycle back over it right, and you talk again and so on and the process of writing of putting your words on the paper and organising it into thoughts, this is very powerful way of thinking. 

: You learn the logical consequences of some of your beliefs.

: Yeah, well, you have to sort of make it consistent and develop it. And partly, writing is an extension of your memory, your working memory. So pencil and paper are very important; keyboard and screen are important for that.

But it's also talking to yourself, that's what I find. And it's a very powerful way of making your thoughts much better, more consistent and more extensive than just talking to yourself.

Certainly, supposing you have a self-consistent understanding of arithmetic: you really understand how to do long multiplication, with carrying; maybe you understand different ways of doing it that amount to the same multiplication, such as splitting numbers into tens and units, doing each part separately and adding it all up; and you thoroughly understand how to add, multiply, divide and subtract fractions, and all that. Supposing you have that understanding, could you have arrived at it by talking to yourself, as a large language model does?

Well, maybe. But we all know people who do this, and actually you're terribly prone to error when talking: you can say things which sound plausible and aren't really. And we all know that there are these systems of, if you like, verbal discourse, these bodies of opinion.

And if you hear someone who's just arrived at university start talking about the internal contradictions of late-stage capitalism, you may suspect that if you probed hard, they may not actually have a fully self-consistent and worked-out theory of what all this is.

: No comment on that.

: But they can talk about it. I know I didn't.

: Well, let's take mathematics, for example. It's true that if you just muse to yourself about mathematics and come up with ideas, you can be led terribly astray. But what keeps us grounded is checking against axioms and deliberative thinking. So could you imagine a large language model paired with a theorem checker becoming a theorem and proof generator?

: Yes, I mean, something like that. My guess would be that we need some theory of self-consistent deliberative systems and self-consistent deliberation, and that simply hooking up an LLM to Lean or something as a subroutine probably won't do it. But maybe you can get a long way with that. Maybe you can get further than people with that.

: I can imagine training it on a game, like AlphaZero.

: Could be. I mean, I'm always looking at the differences between intelligences and capabilities. I can't define in a general way what intelligence is, but a capability, well, if the thing can prove non-trivial theorems, that's capable.

: Yes, yeah.

: And if a robot can bring me a cup of coffee when I ask it to, avoiding things on the floor and finding the coffeemaker in the cupboard if necessary, that's capable. Is it intelligent? Intelligent is such a tricky word.

I remember, long ago, when I was in a computer shop in Cambridge looking happily at dot matrix printers, the sales assistant said to me, 'That one is fully intelligent, of course.'

Chris: So I was rather impressed.

: We had it back then!

: What he meant was that the dot matrix head would return to the left-hand side if there was nothing else to print on the line, rather than travelling the whole way along it. That's a capability. That's kind of thermostat-level intelligence.

: So I think what you're touching on is what Holden Karnofsky has called PASTA, which stands for Process for Automating Scientific and Technological Advancement.

And he talks about this because it sidesteps some of these thorny questions you've raised about what is intelligence or general intelligence, and instead focuses on what he believes to be one of the key capabilities that could transform the trajectory we're on in a potentially dangerous way.

Once you have a model that can automate the process of science and of technological development, a subset of which is AI development, you could end up with, or he believes you could end up with, a runaway system.

: Right. So we've been talking about today, and maybe a little bit about tomorrow. This is really the day after tomorrow. But it's the logical extension of these ideas. I don't see that there's any capability that people have which an AI computer system couldn't some day have. So, yes. I mean, there's a lot of noise at the moment about AI being essential for science.

And what they mean is: now we can predict 200 million protein structures and whether they'll stick to each other. Well, that's marvellous, but I wouldn't describe it as really intelligent. That's a great capability to have, but it's fully under our control. And for combinatorial optimisation of things like circuit boards or chip designs, you can use reinforcement learning and optimisation.

That's a major advance, but it's of the 'better than integer programming' sort. What people are talking about now is quite different from that: taking over the central intellectual task of theory development, and automating the whole cycle. I mean, it would be very unwise to say this is impossible, but it is a long way away.

But if it does happen, then you have this extraordinary possibility of runaway improvement, which is very hard for us to think about.

Reuben: So some people have argued that this runaway system, or even before that, just reaching roughly human-level capabilities, could pose serious risks to humanity as a whole. Do you take that kind of worry seriously?

: Um, yes. Okay. So what we're talking about, just to be clear, are things which don't exist today. Today's large language models may be used for some bad purposes, but they are really not intelligent in that sense.

It's not going to be tomorrow, but say in 10 or 20 years, within the horizon of plenty of people, the majority of us, now living, maybe we will come to systems with these types of capabilities.

So I think a lot of people are very uncomfortable about speculating in this way, because there seem to be no rules, right? What can we reasonably say about these systems? Or what may happen?

It's a science-fiction world which, in maybe some fundamental conceptual ways, hasn't been developed yet. After perhaps one, two or three conceptual breakthroughs of the order of reinforcement learning, maybe AI systems will become much, much more capable than they are now.

: One, two, or three?

: One, two or three? I don't know how many: one, two or many. And some people enjoy speculating about the far future, and are comfortable doing it. But why should we believe anything they say?

Why should people believe anything we say when we speculate about it? Is there some way in which we can reasonably talk about it? Are there some things which we can definitely say about this far future, which has become closer, and some reasonably convincing arguments as to what may or may not happen?

Because I thoroughly respect the people who just see speculation of this type as pointless, or find it very uncomfortable. And here I think Geoff Hinton has really... well, the first thing I'd like to say is that the reason I think this is coming is that there is a long, arduous route to AI from reverse engineering the brain.

And this is very slow; it's very difficult. The brain is immensely complicated. We know huge numbers of facts about the brain, except how it works. We don't know the principles of how we think, or how the ideas we develop come to be correct or not; we just don't know.

We don't know how that's done in the brain. We know something about the vision system; we know little bits about different parts of the brain. I hope that doesn't offend neuroscientists too much.

: But this sounds an extraordinarily long way away. I mean, we can't even do C. elegans yet, this worm of only 1,000 cells.

: That's true, even though we've known all of its synaptic junctions by name for decades. But nevertheless, at the moment there is a growing connection between deep learning and neuroscience, though I'm not an expert in this area.

My impression is that the flow of ideas and techniques is from AI and machine learning into neuroscience, both in techniques for analysing neural data and, in terms of what neuroscientists desperately need, in models of how the brain might work.

Now, imagine the difference if we actually had a decent model of how the cortical column worked and what it does; suddenly the world would change quite rapidly. It is, as you say, a very long road. But there's no reason to say that we won't get to the end of that road at some point.

So what the machine learning community is trying to do is to take shortcuts to highly capable systems. Whether they'll succeed in doing that, and whether they'll discover completely unexpected systems along the way, we don't know.

: So where does the risk come from? Surely automating scientific and technological development would be a tremendous boon to humanity?

: Well, yes. So far, technology has been good to us. For the most part, we're sitting very comfortably in rooms with lights and heat. Well, let's look at some near-term risks, and from those try to get to some longer-term risks.

One near-term risk, I think, we're seeing in the news now: drones are starting to be used in war. At the moment, drones are rather at the level of aircraft in the First World War. At first, aircraft were used purely for reconnaissance and artillery spotting, and pilots would wave to each other from each side as they flew over the trenches. Then the pilots started shooting at each other with pistols.

Then came the crucial invention of a machine gun that fired through the propeller, which came in two stages, and aerial war was born. Bombs were carried by hand and thrown out of the plane. At the moment, as far as we know, as far as it's been reported, drones don't contain much artificial intelligence.

I think machine vision systems probably use too much electricity, reducing the drone's range, and add too much weight. But the intense development of weapons involving some AI is, I think, unstoppable, simply because machine vision is now open source and people will be doing it. And this opens the possibility of some historically very rapid development of systems for automatically killing people. Why does that matter?

Well, if you can build a lot of machines that automatically kill people, it means you can kill people without risking being killed yourself. And this changes the tactical balance of war; it changes a lot of things. And it's quite sudden. Is this an extinction risk? No, but it's something which might change our society quite a lot.

: One could still believe, though, that if you advance technology as a whole, then society will be better off, even if it leads to...

: Absolutely. Let's consider another plausible end state, a situation we would wish to avoid. Throughout history there have been plenty of tyrannical regimes against which rebellion was impossible; their control was too complete.

Sophie Scholl, famously, in 1943 I think, in Germany, said that somebody had to make a start, and went off and distributed leaflets, but she was stopped. And there have been plenty of examples of regimes with ordinary human characteristics where the people in power and in control wished to suppress any possible dissent or rebellion.

Well, modern surveillance technology, and particularly AI surveillance technology, would give extraordinary new possibilities. I mean, it would be awful. You could have a society where you had to carry your mobile phone around and keep it charged, and it would listen to you all the time, and if you said anything wrong, you could be reported.

: That would be possible already.

: It's possible already. And the problem with applications of AI is that AI can be applied at scale: you can apply one little version of intelligence, one capable system, at scale, and replace perhaps many, many people who would have had to do the same thing.

Now of course, your system may be better than the people: it may be more accurate, and it may be less biased. The people might be lazy, malicious, biased, prejudiced, making wrong decisions, falling asleep, inattentive, all those things, and your AI isn't.

And of course you can test the AI to see whether it's biased or not. As Gary Cohn has pointed out, you can take the bias out of the AI, whereas taking the bias out of a person is very hard. The trouble is, you can also make very big mistakes.

: Because of the scale.

: Because of the scale. You can replicate at scale, and you can make very big mistakes. This, again, seems new.

: So this is a risk of misuse.

: Yes.

: Geoffrey Hinton [correction: Yoshua Bengio] has written a blog post called 'How Rogue AIs May Arise'. What do you think about this idea that, instead of just being misused by humans, the technology could escape our control?

: Um, I think it's, how can I say... I don't want to just say, oh yes, that's very possible. This is very hard to think about. I don't see how we can be certain one way or the other, and I don't see how we can meaningfully start giving probabilities one way or the other.

But Geoff Hinton is one of the people who has perhaps most simply and clearly set out some thoughts about the future AIs that don't exist yet but which may, and what we can reasonably say about these systems we haven't invented yet. And I think he's made three points. They're all good, and they're all very hard to dismiss.

His first argument is that, in principle and in the future, AIs may learn and think better than we do. The reason is that all of our knowledge and experience and ability, the way our brains function, and this is one of the things we do know from neuroscience, is encoded in the strengths of the synapses in our brains, of which we have about a hundred million million, ten to the fourteen.

And the synapses are rather like the weights in a neural network; indeed they were the inspiration for neural network weights. Somehow the brain adjusts the strengths of the synapses, and each synapse has to adjust its own strength.

There are so many of them that each one has to adjust its strength from local information: it knows about the neurons on either side of it, the presynaptic and postsynaptic firing, and it knows about chemicals that arrive locally.

And we know quite a lot about how synapses change their strength, but not everything. Now, what, as far as we know, synapses in the brain cannot do is compute the gradient of some remote performance index with respect to the strength of that individual synapse.

: They can't do back propagation.

: They can't do backpropagation. And not only can they not do backpropagation, it's very hard to see how they could obtain gradients in any way. So the backpropagation algorithm may, in some senses, be better than the algorithms that we have in the brain.

And a slightly wider point: not only is backpropagation better, but once we understand how the synapses change their strength in the brain, the algorithm theorists and the mathematicians can get to work and think of better algorithms than the ones evolution devised for optimisation.
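That contrast, a remote loss whose gradient reaches every weight, is easy to make concrete. Here is a minimal NumPy sketch of backpropagation through a toy two-layer network (the network, data and sizes are invented purely for illustration), with a finite-difference check that the chain rule really does deliver the gradient of the far-away loss to an individual weight:

```python
import numpy as np

# Toy two-layer network: backpropagation hands every weight the gradient
# of a remote performance index (here, mean squared error at the output),
# which no purely local synaptic rule could compute.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))           # 4 inputs with 3 features (made up)
y = rng.normal(size=(4, 1))           # targets (made up)
W1 = rng.normal(size=(3, 5))          # first-layer weights
W2 = rng.normal(size=(5, 1))          # second-layer weights

h = np.tanh(x @ W1)                   # forward pass
pred = h @ W2
loss = np.mean((pred - y) ** 2)

g_pred = 2 * (pred - y) / len(y)      # backward pass: chain rule, layer by layer
g_W2 = h.T @ g_pred                   # gradient of the remote loss w.r.t. W2
g_h = g_pred @ W2.T
g_W1 = x.T @ (g_h * (1 - h ** 2))     # ...and w.r.t. W1, two layers away

# Finite-difference check on one first-layer weight
eps = 1e-6
W1[0, 0] += eps
loss2 = np.mean((np.tanh(x @ W1) @ W2 - y) ** 2)
print(abs((loss2 - loss) / eps - g_W1[0, 0]))  # tiny: backprop matches
```

Nothing here depends on the particular sizes; the point is only that one backward sweep prices every weight against the same global objective.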

And Hinton points out that the largest language models now have hundreds of billions of weights, so less than 1% of the number of weights in the brain. And yet those networks have been trained on superhuman amounts of text, and they've more or less remembered it all.

So maybe they contain more information. Maybe already, through this magic of gradient descent by backpropagation, we have systems that store more information than the brain. They're not as capable as the brain in many ways, but in some ways maybe they're more capable already. So that's point one.

That's why we could have more capable systems. His second point he rather picturesquely phrases as: some networks are mortal, and some are immortal. What he means by a mortal network is one which is randomly constructed.

I mean, your brain and my brain and the listeners' brains are all pretty similar, but there will be minute differences in their random construction during development. So we couldn't actually transfer synaptic strengths between our brains, because we can't figure out which synapses correspond to which; it's not possible to do.

So whatever you learn dies with you. But with artificial neural networks, we have these great square or rectangular matrices of weights, and we can transfer them across. That means one intelligent system can be replicated many times, copied, and so on. It has this replicability, and this again seems to be superior to biology.

Well, those are two respects in which we can reasonably expect artificial intelligences to be superior to biological ones, in the end. Then the third point, I think, is a little vague. But it's still a great point.

And the point is that if we are going to build AIs, and they are immensely capable, you know, we'd have a system which can think, a system which, if you give it a body, could do things. Well, that would be useful.

: There's already work on this.

: I think there is some work. It's called robotics, isn't it?

: But incorporating language models into robotic systems.

: Absolutely, DeepMind has very impressive recent work on this. So suppose you have an AI system, and you have a wonderful idea for a product you might want to sell, I don't know, singing coffeemakers or something. Then you want a factory and a marketing plan and a product design and everything. So you casually say to the thing, 'I've had a great idea for a singing coffeemaker which wakes me up with my favourite operatic arias,' or whatever music it predicts I like or don't like, Sousa marches, whatever. And then it designs you a factory, it looks for the place to build it, it tells you how you've got to make them, what you've got to order and how much, it looks up what it all costs, everything, all within a few seconds. Well, the pressure to give AIs the capability to actually do this will be enormous.

Reuben: The economic incentive.

Chris: The economic incentive, imagine. So I don't see how you don't get AIs; there will be enormous incentives to build AIs which are capable of doing things in the real world. And in order to do things in the real world, well, you are given an overall goal, building a singing coffeemaker factory, but then you have to fill in all sorts of sub-goals along the way.

And so there will be AI actors with goals and sub-goals, and extremely capable ones. So how can you then be sure that what they are trying to do is what we want them to do? This is a rather vague point.

And yet, if we have truly capable systems, they will be capable of creative behaviour: coming up with creative solutions, with new sub-goals, and implementing them.

: What's so bad about subgoals?

: What might their effects be? We really don't know. There are plenty of examples of unfortunate decisions from human industry. What unfortunate decisions might there be from sub-goals? In war, I think you certainly want your drones to have sub-goals.

If you are sending drones over to attack things: the thing about a ship-based missile is that you fire it over the horizon, so you can't control it all the way to the target. For quite some time now, missiles have been fired towards a certain area and have then selected a target themselves, because they can't be guided all the way.

So again, I don't see that there is a fundamental conceptual difference between intelligent weapons and a minefield. A mine is intelligent in that it knows when it's been stepped on. It's not a very high level of intelligence.

But we already set up extremely dangerous and horrible automatic things. And, as I said, the pressure to use these open-source capabilities during wartime is going to be immense.

: With the singing coffee machine example, it seems like sub-goals couldn't really be a problem there. I mean, we've come a tremendous way with natural language understanding.

And language models now seem to have a great deal of common sense. Surely, if you had a model that could generate factory plans, contracts and so on, and you asked it, 'Please make me a singing coffee machine factory,' it would have enough common sense to understand what kind of sub-goals you would be happy with and which you wouldn't.

: Yeah, this type of argument is very appealing. It imagines the intelligent system as if it were a person, because our only experience of intelligent systems of this type is people.

And so many things which are obvious to a person, we assume would be obvious to these planners, and so we assume they're going to happen. Whereas actually this system is a computer program of some kind, and I'm sure that, over an immensely complicated design process and immensely complicated training processes, one would try to build in guardrails and criteria and controls, so that it produces good designs and develops feasible sub-goals.

But ultimately it's a computer system. And of course, there are people who do unreasonable things. And it's a nonhuman system, I don't want to say inhuman, that has connotations. And it's very hard to say a priori whether one can make such things safe. I'm sure they will be marvellously safe, until they become dangerous.

: Right, but when they become dangerous, we switch them off.

: Ah. The goals that these things have, ultimately, and I think this is a very strong argument, the primary goals that these systems would have would be set by their users. And of course, users can set bad goals.

So I find it implausible that such a system would suddenly develop a very bad goal of its own, like converting the entire world into paperclips, the famous example, and then carry it through.

But it would be nice to have proofs that that wasn't so. Obviously, the system would not be designed to do that; it would be outside its design parameters. You really wouldn't want that to happen.

And again, there's a tendency, because our only examples of intelligent, capable systems are people, and people have all sorts of animalistic motives, good motives and bad motives and so on, and these are rooted in what it is to be human.

Machines won't have motives in the same sense. It's an absolute fallacy to think that a large language model has any motives of this type. It doesn't have emotions, it doesn't have motivations, it doesn't have desires.

Reuben: What about goals?

Chris: Well, 'goal' is a very difficult word. One sense of a goal is a goal that we experience internally; we form it as a conscious goal: I want to become a millionaire; I want to travel in Central Asia. But, as Daniel Dennett argued, a goal is also a tool for explaining and describing behaviour.

Reuben: He calls this the intentional stance.

Chris: That's exactly right. And so, in considering AIs, you need to be very careful which type of goal you're talking about. Is there a planning system with explicit goals, represented in some interpretable and grounded way inside the system?

Or are we attempting to explain its behaviour without knowing how it works, without a mechanistic explanation of what causes what, instead trying to explain its behaviour in terms of beliefs and desires?

And we're quite good at explaining people's behaviour in terms of beliefs and desires, because we're people: you have a lifetime's experience of human beliefs and desires.

Different people are pretty much the same, though some people are different, maybe. But an AI is thoroughly different. So it's very dangerous, or very problematic, in the sense that it's going to be difficult, unreliable, or self-deceptive, to try to explain a capable AI's activity in terms of its beliefs and desires. We may make very wrong predictions if we do that, because it's something not like us.

: I suppose the way we pursue goals is by reasoning about what we anticipate will happen under various actions, and then choosing the action that best leads to the desired outcomes. Whereas something like a reinforcement learning agent might not have a goal explicitly stored in its network, and might instead just intuit good actions in one forward pass.

Chris: Well, or it may have approximated some value function and got it quite wrong.

: Reinforcement learning works, well, some forms of it work, by approximating a value function, and you want to take its gradient. Well, this is pretty dodgy. I mean, implementing deep reinforcement learning is really hard, because your approximated value function is unlikely to be anywhere near correct over the whole state space.

And yet you're choosing actions, choosing a policy, by following a gradient, small changes in your approximated value function. So if your approximation is wrong... your approximated value function could be pretty good overall, yet with wrong gradients here and there, which will cause inexplicable actions. This happens all the time.
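That failure mode, a value-function fit that is decent in absolute terms but has wrong gradients in places, can be shown with a deliberately tiny, invented example: fit a straight line to the true values of a five-state chain, and the greedy move it recommends at one state is the opposite of what the true values say.

```python
import numpy as np

# Hypothetical true state values for a 5-state chain (numbers invented):
# state 1 is a local peak, so from state 2 the truly better move is left.
states = np.arange(5.0)
true_v = np.array([0.0, 0.8, 0.6, 0.4, 1.0])

# Least-squares linear approximation v_hat(s) = a*s + b
a, b = np.polyfit(states, true_v, 1)
approx_v = a * states + b

# Greedy move from state 2: +1 (right) or -1 (left), towards the
# higher-valued neighbouring state
true_move = 1 if true_v[3] > true_v[1] else -1        # true values: go left
approx_move = 1 if approx_v[3] > approx_v[1] else -1  # the fit: go right

print(true_move, approx_move)  # -1 1: the approximation flips the action
```

The fit is within 0.4 of every true value, yet its gradient points the wrong way at state 2, exactly the "pretty good values, wrong gradients" situation described above.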

: The agent can only see a finite number of states, and so it learns the rewards in those states, and then has some implicit extrapolation from those to the entire state space.

And if we take the intentional stance towards it and say, 'This is what its goal is,' we're implicitly assuming we understand how it would behave in those unexplored states, whereas we in fact have no idea what extrapolation it's working from.

: I think that's a beautiful way to explain it, yeah. I mean, taking the intentional stance to try to explain what an RL agent is trying to do is really hard.

: But you could take this two ways. You could say, okay, we can't apply the intentional stance, at least in the limit, to AI systems, and that's a good thing, because it means they will likely just flail around in environments they haven't seen before.

: I mean, personally I think this is one of the very big differences between animal behaviour and RL behaviour: animals are very capable, and they have very robust behaviour over a wide range of circumstances with very little experience.

And this comes from evolution. With reinforcement learning agents, as you say, the fraction of the state space that we can actually explore in any practical problem is tiny. It's like a little line on a sheet of paper compared to the size of the solar system.

And so how do you generalise? Exactly as you say, extrapolation is rather a poor form of knowledge discovery. So is your extrapolation valid at all? Is your system going to do completely unexpected things? No one really has any idea.

: I think the worry is: is it going to remain capable, with some goal that we didn't anticipate, in those new environments? Or is it just going to flail around? Flailing would be great.

Chris: Maybe!

Reuben: Maybe in critical positions, it wouldn't be great, but better than perhaps pursuing a goal that we don't want?

: Well, I mean, reinforcement learning is more complicated than that, because it's not just about rewards and policies. A reinforcement learner also has to experiment. Of course, when you first write a reinforcement learning algorithm, you make it do little random experiments by randomising the choice of action slightly.

Well, that's all well and good. The trouble is that if, for example, Christopher Columbus had done epsilon-greedy exploration, going north, south, east or west at random, I don't think he would have crossed the Atlantic, which might have been a good or a bad thing.

: We don't need to shoot ourselves in the head to realise it's a bad idea.

: Absolutely.
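The Columbus point can even be checked numerically. A minimal sketch (the function and numbers are mine, purely for illustration): pure random dithering over {north, south, east, west} is a two-dimensional random walk, and after n steps it has typically wandered only about sqrt(n) steps from home.

```python
import math
import random

# Fully random exploration over N/S/E/W is a 2-D random walk: after n steps
# the typical distance from the start grows only like sqrt(n), nowhere near
# the n steps of progress a directed explorer would make.
def mean_random_walk_distance(n_steps, trials=2000, seed=0):
    rng = random.Random(seed)
    moves = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # north, south, east, west
    total = 0.0
    for _ in range(trials):
        x = y = 0
        for _ in range(n_steps):
            dx, dy = rng.choice(moves)
            x, y = x + dx, y + dy
        total += math.hypot(x, y)
    return total / trials

d = mean_random_walk_distance(1000)
print(d)  # on the order of sqrt(1000), roughly 30, not 1000
```

A thousand random steps leave the walker only around thirty steps from port, which is why undirected epsilon-greedy exploration alone scales so poorly to large state spaces.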

: So, in a debate earlier this year, Yann LeCun made the argument that if it's not safe, we're not going to build it. The audience laughed at him, but I think he does actually have a point: it's in no one's interest to build something that would wipe out humanity, or pose huge risks to humanity, because it would pose huge risks to themselves. So why would anyone do it?

: Well, I wouldn't laugh. But I think we already have the example of the nuclear arms race, where, because of a game-theoretic setup of competition between two superpowers, both superpowers kept building more and more missiles and increasing the power of the warheads.

And they did this on the basis of game theory: they needed increasingly large numbers of missiles, with almost no limit except expense, so that the other power could not carry out a first strike and destroy all their missiles.

So this is why today we have nuclear submarines with missiles capable of wiping out whole cities, ready to fire, and we still have missiles in land-based silos. And this game theory had its effect: it caused, I don't know, hundreds of billions of dollars of expenditure on the creation of those missiles. That is already an extinction risk, as the result of game theory.

Now, making and developing those missile systems took a lot more time than developing AI is likely to take. So what concerns me slightly about AI is the rapidity of its development. It's software; you can disseminate it.

There is, in my view, absolutely no prospect of stopping this kind of development, because of open source. We have these very powerful open-source tools available now.

And for people coming of age now, it may be really hard to imagine what life was like without open source. I'm sitting here in UCL; back in the late 1990s, I went into a computer shop on Tottenham Court Road, close by, and gazed longingly at a CD-ROM with Microsoft Visual C++ on it, which cost nearly 1,000 pounds. I seriously considered buying it.

Reuben: Wow. 

Chris: Today, the range and sophistication of open source software is extraordinary. The availability of computing power is extraordinary. And so when breakthroughs are made, development is going to be fast. That's another concerning thing. So to say "if it's not safe, we won't build it" is simply, absolutely false. We already did.

Reuben: So how are you imagining this playing out? If you've got a government building some AI systems, what would they be using them for? What would be the risk they're ignoring in order to get their advantage, their game theoretic advantage?

Chris: Well, with nuclear missiles prediction is a little bit easier, because missiles are not intellectually very interesting. They just do one thing: they go up, they come down and they explode, or they intercept each other. AI is more protean. It's very hard to anticipate.

I mean, there are some known knowns, there are known unknowns, but there are also unknown unknowns. The point is that software is a technology with an enormous range of uses.

And so one can think of lots of pleasant things, and plenty of rather unpleasant things. Imagine, on the pleasant side, that we have an all-encompassing AI, a bit like a super Google, which you can ask any question, and which you can ask to do things.

And the AI manages everything; you have a universal basic income. It's like sitting in a jacuzzi with robots that will happily bring you anything you want to eat or drink. Oh, and the jacuzzi is 100 metres wide and 100 metres deep. Wonderful. But what's there for the people?

I think this is just unimaginable. This isn't predictable. There are many aspects of life today which would have been unimaginable to people 100 years ago, and yet we've coped, so maybe we will again. But the point is that AI is rapidly developable, rapidly deployable and potentially very capable.

Now, today there is an enormous noise about machine learning. But in fact machine learning is actually used in only a relatively small number of tasks. And machine learning in the form of neural networks trained from data is almost entirely used for decisions where mistakes are cheap: you need to make lots of decisions, like which advertisements to display to somebody after a Google search.

If Google displays a completely inappropriate advertisement, well, it has just prevented itself from earning, on average, a few pennies. It's not a serious mistake.

This technology, although it's promising and appears capable, is not yet deployable for all sorts of problems, simply because if you train a neural network from scratch on a large amount of data, what happens is you get a classifier that works.

The trouble is that the classifier hasn't really understood the data, and it may make completely wrong predictions on related data that is outside its training distribution. It's very hard to guarantee that the network is making its decisions for the correct reasons.

The second problem is that, although people are making immense efforts on interpretability and explainability, it is still very difficult for, say, a doctor to take the result of an artificial neural network that presents some diagnosis and then discuss that diagnosis with the network. They simply can't communicate with each other.

Now, the explainability methods are getting quite good for neural network developers. But they're very far from being able to discuss reasons for possible interpretations. This goes back to the neural network not actually understanding the reasons why it's making its predictions.

It's fitted an immensely complicated decision surface to a large amount of data, and this will generalise under certain conditions. If, as a result of some conceptual breakthroughs, we get beyond that, then the possible areas of application for AI become much greater.
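Chris's point, that a fitted decision surface can be right for the wrong reasons and then fail off-distribution, can be demonstrated with a deliberately rigged toy example. This is entirely my own construction, assuming a plain logistic regression: a weak genuine feature is paired with a spurious "shortcut" feature that tracks the label perfectly in training but reverses at test time.

```python
import math
import random

rng = random.Random(0)

def make_data(n, shortcut_sign):
    """x1 is a weak but genuine signal; x2 is a spurious shortcut feature."""
    data = []
    for _ in range(n):
        t = rng.choice([-1, 1])            # underlying cause
        x1 = 0.1 * t + rng.gauss(0, 0.05)  # weak true feature
        x2 = shortcut_sign * t             # shortcut: perfect in training
        y = 1 if t > 0 else 0
        data.append((x1, x2, y))
    return data

train = make_data(500, +1)   # shortcut agrees with the label
test = make_data(500, -1)    # shortcut reversed: out of distribution

# Plain logistic regression fitted by gradient descent.
w1 = w2 = b = 0.0
for _ in range(200):
    for x1, x2, y in train:
        p = 1 / (1 + math.exp(-(w1 * x1 + w2 * x2 + b)))
        g = p - y
        w1 -= 0.1 * g * x1
        w2 -= 0.1 * g * x2
        b -= 0.1 * g

def acc(data):
    return sum((w1 * x1 + w2 * x2 + b > 0) == (y == 1)
               for x1, x2, y in data) / len(data)

print(f"train accuracy {acc(train):.2f}, shifted-test accuracy {acc(test):.2f}")
```

On this rigged data the model leans on the shortcut feature, so training accuracy is essentially perfect while accuracy on the shifted test set collapses: the classifier "worked", but for the wrong reasons.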

And the problems become greater too. Now, a very important reason why it's almost conceptually wrong to employ this type of trained-network AI for difficult decisions (I'm sure people have recognised it, but it's not widely appreciated) is that there are many kinds of decision problems where we work out what the decision surface is as a result of the data we get, and as a result of thinking about that data.

For example, legal decisions. I always prefer specific examples. If a student wants to put in for extenuating circumstances because they couldn't do some piece of work and they want us to take that into account, there's a certain deadline by which they have to do it. But should we relax the deadline if the student has a communication impairment, which means they find filling in the extenuating circumstances form itself difficult?

Reuben: Right.

Chris: Well, that's a difficult question. We need to think about this in terms of natural justice, and setting precedents, and all the implications of the decision. It's not just a training example.

Reuben: Extrapolation has to be worked out.

Chris: Extrapolation, yes, but one wants extrapolation with understanding: you need to understand in some way the basis of the reasons for these decisions. And that's simply not there.

So this means that the applicability of AI to problems where mistakes are expensive is much less than you might think, and one can get disasters where you try to apply it anyway. For example, an AI system that gives recommendations on whether parole should be granted. I feel this is not the kind of problem to which simple pattern-matching systems should be applied.

Reuben: In a case like that, you can well imagine the government understanding this and stopping doing it.

Chris: Yes.

Reuben: But if you have two companies competing, the risk of using AI systems for automated decisions may be substantial, but the risk of not using them is going bankrupt, because they're outcompeted by the company that is using them.

Chris: Right.

Reuben: So you could end up with this race to the bottom.

Chris: Absolutely.

Reuben: All companies end up handing over decision-making control to AI systems that they would rather nobody was using.

Chris: One might argue this is already happening in financial trading. In financial trading, again, we have one of these game theoretic situations where the speed of trading has gone up and up. And so now trades are made on, I don't know what the timescale is, microseconds? I'm not sure.

Reuben: I think it's almost nanoseconds now.

Chris: Almost nanoseconds now. Well, why does anyone need to trade on a nanosecond timescale? What economic purpose is served by this?

Reuben: Well, they claim it's price discovery. I'm not sure we need to know the price that quickly.

Chris: Erm, I do not believe them. I believe the reason is that in this very complicated adversarial game of trading, the organisation that can trade faster, on a shorter timescale, will win, because it manages to front-run the other one, or bluff the other one, I don't know; exactly what's happening is very closely guarded. Now, is this economically beneficial?

This is exactly one of these game theoretic situations where we're using technology, and there are enormous incentives: a lot of money has been spent applying technology to push the speed of trades towards nanoseconds.

They claim that the market is deep and liquid; this has been disputed by economists. And I would ask a very simple question: if this is saving money for ordinary investors and traders, who are trading for human purposes on human timescales, how come all these high-frequency traders are making so much profit? Where does it come from?

Reuben: Right.

Chris: It comes from the slower traders. I don't know whether one could describe the high-frequency traders as systematically front-running everybody in the market. It's very hard to find out, because I'm sure their methods are proprietary.

But it's very hard for me to see this as a social service. So this is an example of a very unwelcome game theoretic situation, where people have developed socially destructive technology. Maybe it's just useless; but if it's very profitable for them, they love it.

Reuben: But they're spending millions on undersea cables to transmit trades at the speed of light. And presumably all they do is undercut each other. They don't provide additional value to society, and they'd rather none of them had to spend that money on undersea cables.

Chris: Well, absolutely. I think they even set up microwave radio links, because of course signals travel faster through the air than through fibre. So where you can, you use point-to-point radio connections.

This is also an example of institutional capture. The arguments about whether this is socially useful or not are rather complicated. Maybe there are valid arguments that it's useful, and maybe there are valid arguments that it's not; I would suspect not. But banning it? How do you ban it when it's making so much money?

Reuben: Yeah. So a country that bans AI technologies would fall behind other countries. And so again you've got this situation where you're willing to accept the risks of the technology itself, because they're less than the risks of other countries beating you to it?

Chris: Well, yes, you get these game theoretic situations, and certainly in making decisions there is pretty often a premium on making the decision first. There are very many situations in which you want to make your decision faster than the other guy, and you make money if you do. And so this will lead to AI being applied there.

Of course, the extremely dangerous one is in recognising a nuclear attack and launching a retaliatory strike. And there have been famous examples where the timescale for this has gone down to less than an hour, say 15 minutes. This is the result of exactly this type of game theoretic problem.

Reuben: One problem of credibility in nuclear war is the idea that the human in the loop might decide not to respond with a retaliatory strike. So one solution you might have here is to broadcast to all the other countries: we don't even have a human in the loop; we have an AI that decides whether to perform the retaliatory strike.

Chris: I mean, here we're getting into things like the plot of Dr. Strangelove, right. And hitherto the reliability of computer systems has been such that there have been humans in the loop. And it is believed that there have been one or two occasions when the human decided not to order a retaliatory strike, or not to pass the message up.

So I think it's very important to say that in its present state, AI is really not intelligent, and it's perhaps rather less useful than is commonly believed. There are still rather few applications of AI in medical diagnosis. And yet, under controlled conditions, people have been able to use machine learning with logistic regression and beat the consistency of doctors' diagnoses, and that's been possible for decades, for a lifetime.

Reuben: We can even beat them with pigeons, at least an ensemble of pigeons. One pigeon is barely above chance, but an ensemble is competitive.

Chris: Haha, I see. But there are now far more researchers coming into the field, far more intense development, and lots of very clever people with good ideas.

And many of them are working on very tightly defined problems, but I tend to believe there are some undiscovered breakthroughs which are quite likely to happen. Something which can reason and explain its reasoning doesn't seem as if it should be impossible.

Reuben: So we've talked a lot in the abstract, but maybe we can try and put some numbers on this. Obviously, predicting the future is a fool's errand. But I think it at least gives us a bit of grounding on what we mean by likely and unlikely if we put some numbers on them, even if only by saying it's greater than the chance of getting heads, or less than the chance of rolling a six on a die.

So if you will be a fool for a moment: do you have a prediction for the date by which you think there's a 50-50 chance we'll have built human-level AI?

Chris: Okay, well, the difficulty is you want me to pluck something out of the air, and I'm not going to do that. Well, I am going to do that, but in a slightly more structured way. I feel the long route to doing that, the long way round, is to have some idea of what the cerebral cortex is doing.

So some pretty good idea of what a column in the cerebral cortex does, and roughly how the columns are connected together; some sketch of its functioning. You only need a hint to be able to start building machines. So if we get some significant idea back from the cerebral cortex, then, I'm going to guess, within 30 years for getting those strong hints. It could be much sooner; I've really no idea.

Reuben: But around 2053.

Chris: Around 2050, something like that. And once you have the hint, once there is significant feedback from neuroscience, once you get something useful from neuroscience, then you're talking very rapid development and technical applications, because developing technical applications is cheap: programming is cheap, and programs are effective. We've had progress in so many other technologies.

Reuben: We will have even cheaper computation then as well.

Chris: And even cheaper computation, exactly. So that would be my estimate, at the outside.

Reuben: And that's the slow route.

Chris: That's the slow route.

Reuben: Okay!

Reuben: What's the quick route?

Chris: Oh, I believe that the current paradigms in AI are considerably more limited than people are making out. But it's not at all clear, until it happens, what will be better. And many people are working on excellent approaches, causal modelling and so on.

So it's very much harder to say. I'm sidestepping the question by saying that new advances get adopted worldwide, through open source dissemination, within weeks or months. And the research cycle means that papers on them appear at the conferences within six months or a year.

Reuben: So if I might push you a little bit: you said there's a slow route, which would take roughly 30 years.

Chris: At most.

Reuben: Okay. But taking into consideration all paths by which we could get there, what would be your 50-50 date?

Chris: Again, well, how would I think about that? Well, in 2005 we were still really excited about support vector machines, and nobody was using GPUs. Or no, let's go back to 2001: that was roughly when the first Viola-Jones detector appeared, which used boosting and these little rectangular filters to locate a face in a picture in real time. This is now standard equipment on all cameras.

But I remember actually seeing the demonstration at NeurIPS at the time, and everyone was blown away. It ran at five frames a second. I said, wow, you can locate a face in a picture reliably, at five frames a second, with about 200 of these rectangular filters, integral-image filters, lots of very clever programming, and a very clever algorithm based on boosting.

And so from then to now there's been a lot of progress, and that's 20 years. So I'm going to say there will be unrecognisable progress within 20 years.

Reuben: Human-level AI progress?

Chris: Um, capabilities. Maybe some of them not as good as humans', maybe some of them better, as usual: computers have always had some capabilities better than people's. But capable in many more applications than today's AI.

Reuben: Capable enough to write a scientific paper?

Chris: Why not? That's probably one of the easier things.

Reuben: The median paper, perhaps. Okay. And then once that happens, once we have a suite of human-level capabilities, what probability do you place on the outcome being net positive?

Chris: I think it's quite high, on the basis that technology has been continuously good for us: things have continuously got better. There are very few places in the past where you'd want to go. You certainly wouldn't want to go back in time and start life again as anyone except a very wealthy person in previous centuries, and even then you'd greatly miss certain things. So I'd say it's probably very positive, but...

Reuben: Despite these game theoretic...

Chris: There are these game theoretic problems, yes. And many other problems. There's Max Tegmark's question: what if people are outcompeted by AI? What is there for people to do? How do we value human achievement? It's a very good question.

Reuben: Okay, so we've got three minutes. Can we do some very rapid questions?

Chris: Yes.

Reuben: So like maybe one sentence, if that's possible?

Chris: I'll try.

Reuben: Did you see deep Q learning coming?

Chris: No. I mean, obviously one tried to do it.

Reuben: Right, you tried yourself?

Chris: Well, very badly, I mean, back in the last century. And of course it's thoroughly unstable, and very hard to get to work. Though we had a problem in that EU project: the people doing work package one kept on leaving to found companies and things.

Reuben: So a few years after you published the Q learning algorithm, you published a proof that it converges to an optimal policy, given enough training data. Why does deep Q learning work?

Chris: I think it works through tender loving care, a very good choice of problems, and good programming. And through getting stability by essentially converting it into a series of problems of discrete improvements of the value function.

But partly, I think, the point is that it works on arcade games, and arcade games are designed to be appealing to play. You've got a reward system; you don't put off your novice player by making the game impossible for them.

So this is a very friendly environment. Similarly with games like chess and Go: these have been developed to be very interesting and challenging for people to play.

Reuben: But we don't have the same theoretical understanding of why deep Q learning works.

Chris: I don't think so, though there's a lot of theoretical work on Q learning, on reinforcement learning, at the moment. And of course, frequently it doesn't work.
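For readers who haven't met it, the tabular algorithm under discussion fits in a few lines. A minimal sketch of Watkins' Q-learning update, on a toy two-state corridor of my own invention (the environment, learning rate and discount are illustrative choices, not anything from the conversation):

```python
import random

GAMMA, ALPHA = 0.9, 0.5   # discount factor and learning rate (illustrative)
ACTIONS = ("left", "right")
N_STATES = 2              # states 0 and 1; moving right from 1 ends the episode

def step(s, a):
    """Deterministic toy dynamics: 'right' advances, 'left' stays put."""
    if a == "right":
        if s == N_STATES - 1:
            return None, 1.0   # terminal transition, reward 1
        return s + 1, 0.0
    return s, 0.0

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
rng = random.Random(0)

for _ in range(2000):          # episodes of purely random exploration
    s = 0
    while s is not None:
        a = rng.choice(ACTIONS)
        s2, r = step(s, a)
        best_next = 0.0 if s2 is None else max(Q[(s2, b)] for b in ACTIONS)
        # Q-learning update: move Q(s,a) towards r + gamma * max_b Q(s',b)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2

print(round(Q[(0, "right")], 2), round(Q[(1, "right")], 2))  # prints 0.9 1.0
```

With enough random exploration the table converges to the optimal values, which is the tabular guarantee under discussion; the instability Chris mentions appears when the lookup table is replaced by a trained function approximator such as a neural network.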

Reuben: Do you miss the days when we could understand the algorithms we developed in such detail?

Chris: I don't know, it's so exciting now.

Reuben: Well, Professor Chris Watkins, thank you for coming on the show.

Chris: Thank you.

Reuben: You've been listening to Steering AI, from the Department of Computer Science at UCL. Subscribe for future episodes, join the conversation on Twitter with the hashtag #SteeringAI, and follow the account @UCLCS for announcements of future guests and your chance to suggest questions. Thank you for listening, and bye for now.