AI Timelines and Human Psychology (with Sarah Hastings-Woodhouse)

Future of Life Institute Podcast

On this episode, Sarah Hastings-Woodhouse joins me to discuss what benchmarks actually measure, AI’s development trajectory in comparison to other technologies, tasks that AI systems can and cannot handle, capability profiles of present and future AIs, the notion of alignment by default, and the leading AI companies’ vague AGI plans. We also discuss the human psychology of AI, including the feelings of living in the "fast world" versus the "slow world", and navigating long-term projects given short timelines.  

Timestamps:  

00:00:00 Preview and intro

00:00:46 What do benchmarks measure?  

00:08:08 Will AI develop like other tech?  

00:14:13 Which tasks can AIs do? 

00:23:00 Capability profiles of AIs  

00:34:04 Timelines and social effects 

00:42:01 Alignment by default?  

00:50:36 Can vague AGI plans be useful? 

00:54:36 The fast world and the slow world 

01:08:02 Long-term projects and short timelines

Transcript

Introduction and AI Development Trajectory

00:00:00
Speaker
Clearly, what's happening is that we are birthing an alien intelligence that is just not the same as ours. And it's better than us at some things and worse than us at other things.
00:00:10
Speaker
I don't know why people seem to expect that it's going to improve along the same axes as us at the same rate. These safety plans, I see a lot of people talking about them and I don't get the impression that a lot of people have actually read them.
00:00:23
Speaker
So I read them. The main comment I had was that I don't really think that they're plans. It sounds kind of cheesy, but maybe it's just never too late. The broader the conversation is, the better, because then the higher the probability is of someone having a good idea.
00:00:36
Speaker
Welcome to the Future of Life Institute Podcast. My name is Gus Docker and I'm here with Sarah Hastings-Woodhouse. Sarah, welcome to the podcast. Thank you for having me. Fantastic. We're going to talk about your essay series and your thoughts on AI. And I think a natural place to start is the discussion around whether we are heading for powerful AI very soon.

AI Benchmarks and Real-World Automation

00:01:02
Speaker
And this is a discussion that's influenced by a bunch of factors that you list in a post you have. So perhaps we could start there and talk about the importance of benchmarks.
00:01:15
Speaker
What does it mean when benchmarks are saturated? Yeah, so I thought it would be interesting to do a deep dive on this discussion about whether timelines are very short, meaning, I don't know, two to five years, or relatively long, meaning more than ten, which I guess counts as long timelines now, even though that's still not a very long time.
00:01:35
Speaker
So I wrote a blog post trying to synthesize the different arguments for and against short timelines. And I looked into this idea of benchmarks. Benchmarks seem to be saturating very quickly on a lot of closed-ended academic tasks. These are questions where there is a right or wrong answer, and it's very easy to verify whether models are doing well or badly on them.
00:02:01
Speaker
So it's things like GPQA, which measures how good AIs are at graduate-level, I think usually multiple-choice, questions. And people are having to come up with new benchmarks now because models are doing so well on a bunch of the ones we already have.
00:02:18
Speaker
So Humanity's Last Exam is an effort to gather people at the frontier of a bunch of disciplines and get them to come up with really, really hard questions that you can't find the answers to anywhere on the internet, and see how good models are at those. I think they are already getting something like 25% on that, even though it really hasn't been around for very long at all.
00:02:40
Speaker
So if you're operating on the assumption that benchmarks really do tell you how close we are to achieving human-level intelligence, I guess you would think that we're really not that far away at all, because they seem to be doing better than most people would do on most of these closed-ended benchmarks.

AI's Role in Human Institutions and Research Explosion

00:03:02
Speaker
Yeah, and we should say the tasks contained in these benchmarks are tasks that only a very small minority of people are able to solve: extremely difficult programming tasks, extremely difficult scientific questions, mathematical questions, and so on.
00:03:19
Speaker
But I guess the gist here is: what do you think of that assumption? Do you think that because we are moving so quickly through these benchmarks, because they're saturating so quickly, we're getting close to human-level AI very soon?
00:03:34
Speaker
Well, it definitely means something. I guess when I wrote this post, the conclusion I came to was that it seems like we definitely can't dismiss the possibility of these very short timelines.
00:03:45
Speaker
I don't really have a strong view; I think I tend to be more convinced by the short timelines argument than the long timelines one. But I do think there's something to this idea that these benchmarks can only really tell us about tasks that are easy to verify, and real-world tasks are just not like this.
00:04:03
Speaker
So if we're thinking about automating real-world labor: if you think about what you do for your job, or what anyone does for their job, you're doing a bunch of different tasks that all overlap.
00:04:16
Speaker
They're not really discrete. The feedback you might get from your manager, or just from the world more generally, is probably kind of mixed and messy and requires a lot of context to act on.
00:04:30
Speaker
And that's the kind of thing that models are not currently very good at. Even tasks that models are very good at, like writing emails, are actually kind of hard to delegate to them. If I wanted an AI to write an email for me, say to follow up on a discussion I had with a coworker earlier, I would need it to know what I discussed with that coworker, what we discussed the week before, and all of this context that, by the time I've tried to give it to the model, I may as well just have done it myself.
00:05:02
Speaker
So if we think that human-level intelligence is about actually automating a bunch of these real-world tasks, then maybe benchmarks don't really tell us that much. But I guess I tend to think that if we're worried about big-picture risks from AI, how close we are to automating the full human economy doesn't really seem like the right question.
00:05:25
Speaker
If we're worried about an AI developing the capability to either cause a lot of damage because it's misaligned, or be misused by somebody to cause a lot of damage, it seems like it could do that before it can automate everyone's corporate nine-to-five. So sometimes I get a little bit confused about why people are so hyper-fixated on this question of labor automation.
00:05:45
Speaker
I guess if you are mostly concerned about job displacement, then that is the thing you would worry about. I'm more worried about AI catastrophe. So for me, I tend to see these lines going up on benchmarks and find that pretty concerning.
00:06:00
Speaker
Even if that's not a sign that we're really close to automating away all human labor. Although we could also see automating labor as a sort of proxy for power in the world, or the ability to affect the world. Whenever humans try to do something, we do it in institutions. And depending on how you see takeover scenarios or catastrophe scenarios involving AI, that might involve acting in ways where you're competing with human institutions like militaries or corporations and so on.
00:06:28
Speaker
And in that sense, perhaps AI's ability to automate tasks within existing jobs is somewhat of a proxy for power and the ability to affect the world.
00:06:42
Speaker
Yeah, that's a fair point.
00:06:45
Speaker
I guess another counterpoint to that is this intelligence explosion idea: that really the only thing we should care about automating is this narrow task of doing AI research.
00:06:59
Speaker
And I guess one of the cruxes between the short and long timelines people is how narrow a task this actually is, doing AI research. Maybe it is actually the kind of thing you need all these other disparate skills for, skills that aren't captured in these maths and coding benchmarks.
00:07:18
Speaker
But if you think it is quite a narrow task, that it's just the thing researchers currently do remotely on their laptops and that they just need to be very good at coding for, then again, you would think this wide automation of labor thing doesn't really matter, because we just need to cross this one threshold where models get very good at that, maybe better than the best coder in all of OpenAI or something.
00:07:40
Speaker
And then whatever tasks were still left to automate after that point, all those gaps just get filled in super quickly. So I guess that's another question, and a question that seems very hard to answer in advance. I don't feel super comfortable just letting this process run away with itself and seeing how much AI companies can accelerate their own research, because I think we're not going to get super clear answers on that ahead of time.
00:08:07
Speaker
So I guess the question here is how much we should learn from history, from the history of how new science has been discovered in the past.
00:08:18
Speaker
For example, one counterpoint to this idea of an intelligence explosion happening by automating AI research is that scientific research has involved a bunch of tasks throughout history, often a bunch of trial and error, often a bunch of physical labor.
00:08:38
Speaker
You write somewhere that raw intelligence might not be the main driver of discovery. Perhaps you could elaborate on that a bit. Yes, this is an argument that was made by some researchers at Epoch.
00:08:52
Speaker
So they were talking about the idea that... So the intelligence explosion argument goes: okay, you have these three inputs to AI. You have data, compute, and algorithms.
00:09:03
Speaker
And that third one, algorithms, is the one that's being driven by cognitive labor right now. AI researchers have all this compute to work with, they have all this data that they've taken from the internet, and then they
00:09:15
Speaker
do research and think really hard and iterate on experiments and come up with new algorithms to make the AIs learn more efficiently from all that compute and all that data. So the intelligence explosion hypothesis is: okay, if you just have more of this cognitive labor, if you have what Dario Amodei calls a country of geniuses in a data center just sitting there, and now you have all this brainpower you can do stuff with, then it's going to start getting way faster, especially since you can copy those digital minds over and over again. And now maybe you have, I don't know, millions of them all running in parallel. Yeah.
00:09:48
Speaker
But the argument that the Epoch people were making is that, historically, it doesn't really seem like this is how the process of R&D has actually worked. And one piece of evidence for this is that often people would have the same insights simultaneously.
00:10:04
Speaker
I forget what examples were actually used in that essay, but I think Charles Darwin and somebody else came up with the idea of natural selection within the same couple of years.
00:10:15
Speaker
And there are just a bunch of examples of this. So one hypothesis is that you get this kind of cultural overhang, where culture is moving faster than the process of discovery.
00:10:28
Speaker
And this leaves all of these unanswered questions, which people then use their cognitive labor to come in and answer. So maybe you get to the point where society just really requires a light source that isn't candles, because we're doing all of these things that we need light for; we might want to work through the night and have better light sources.
00:10:53
Speaker
And so people come up with solutions to these problems that they're running into. It's more of an organic process than geniuses just sitting around and thinking about stuff. So if you think that's how the process of discovery works, then maybe even if you have a bunch of super-geniuses sitting in a data center, they're just not going to do very much, because there aren't going to be questions arising through this process of cultural evolution for them to answer.
00:11:21
Speaker
I feel like maybe this underestimates quite how big the intelligence gap between us and a superintelligence is going to be. A lot of this just comes down to: okay, how smart are they going to be, really?
00:11:34
Speaker
Maybe there's a hard ceiling on intelligence and superintelligence will be just a little bit better than us at most things. Or maybe it will be radically better than us, in which case it seems like the process of human discovery isn't a very good precedent or analogy for us to look to.

Expert Disagreement on AI Intelligence Explosion

00:11:55
Speaker
So again, I think it's a sort of unanswerable question before it actually happens. Yeah, there are also questions around whether they'll be able to simulate cultural evolution or simulate physical environments. Maybe this goes to how smart they are, but I think perhaps more is possible in simulation than we tend to assume, and would be possible if you have basically this country of geniuses in a data center.
00:12:25
Speaker
What's your read on this? What's your personal take? Do you think that automating AI research is the key question for whether we'll get an intelligence explosion soon?
00:12:38
Speaker
Well, I really don't know. I feel very, very uncertain about it. I guess the main question is: historically, it seems like a bunch of these experiments that have been run to advance the frontier of AI have required a lot of computing power.
00:12:57
Speaker
So if that continues to be true in the future, then I guess these automated researchers are just going to run into a bunch of bottlenecks. But it doesn't seem like the bottlenecks will be that big. And we are
00:13:10
Speaker
channeling a lot of money and resources into building out new data centers and building new chips anyway. And even if they're a little bit slowed down by our capacity to catch up with them, I don't really envision that being much of a delay.
00:13:26
Speaker
I don't have a strong take. I just observe that among people who have thought about this much more than me, there is such radical disagreement.
00:13:37
Speaker
And machine learning researchers do take this intelligence explosion idea pretty seriously. If you look at the surveys that were run by AI Impacts, I forget the exact numbers now, but I think maybe it was that half of them thought an intelligence explosion was more likely than not.
00:13:55
Speaker
Certainly half of them thought it was a plausible thing that could happen. And given that, if this did happen, it would be pretty scary and have a pretty high chance of running out of human control,
00:14:06
Speaker
I guess my only real strong take is that people should take this possibility pretty seriously and not dismiss it. So one point you make is that AIs are able to complete longer and longer tasks.
00:14:21
Speaker
And here you're probably referring to the METR study on doubling times, where the time it takes for the task length that an AI can complete to double comes out to something like seven months.
00:14:38
Speaker
That's on a suite of tasks that are quite narrow, or somewhat narrow at least, and more technical tasks, tasks related to programming and so on. And so one question there is whether
00:14:51
Speaker
this doubling time, or this general feature of the world in which AIs can complete longer and longer tasks, generalizes to all tasks. Do you think that's the case? And do you think we see some signs in that direction?
00:15:05
Speaker
Yeah, I guess that is the key question. Almost by definition, these messier tasks are ones that you're just not going to be able to measure this trend for.
00:15:17
Speaker
So it seems like the only evidence that we're ever going to get about this is on these more closed-ended, verifiable tasks. And I think that's sufficient to be concerned about this trend.
00:15:32
Speaker
And it surely at least says something about the more open-ended, messier things. Even if the doubling time isn't as short or the trend isn't as consistent, it would be really weird if there's this trend where AIs can complete longer and longer software engineering tasks,
00:15:48
Speaker
but that says nothing about how good they are at more agentic tasks like booking flights, or automating customer service, or whatever these other things are that we think are less easily verifiable. I think the authors of that study would acknowledge that there are a bunch of limits here in terms of how much insight we can actually glean from this. And I do think it's frustrating that some people will just extrapolate this out to say, oh, this shows that by 2030 they're going to be doing month-long tasks, as if that applies to literally every task.
00:16:25
Speaker
And clearly it doesn't. I think the authors would acknowledge that, too. But it's such a rapid improvement that I think we should be worried about it.
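To make the extrapolation being criticized here concrete, here is a minimal sketch of the arithmetic, assuming a roughly seven-month doubling time, a hypothetical one-hour starting task length, and an early-2025 reference point; these are illustrative assumptions, not figures taken from the METR study itself.

```python
# Illustrative sketch only: the naive doubling-time extrapolation discussed above.
# The 7-month doubling time, 1-hour starting task length, and 2025 reference year
# are assumptions for illustration, not the study's reported values.
DOUBLING_MONTHS = 7
START_YEAR = 2025

task_minutes = 60.0                  # assumed starting point: ~1-hour tasks
month_of_work_minutes = 60 * 8 * 21  # ~1 month of 8-hour workdays

months_elapsed = 0
while task_minutes < month_of_work_minutes:
    task_minutes *= 2                # one doubling of completable task length
    months_elapsed += DOUBLING_MONTHS

print(f"Month-long tasks reached after ~{months_elapsed} months, "
      f"i.e. around {START_YEAR + months_elapsed // 12}")
```

On those assumptions the naive extrapolation lands around 2029, roughly the "month-long tasks by 2030" claim; the caveat in the discussion is that this only holds for the kinds of verifiable tasks the trend was measured on.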
00:16:38
Speaker
There are also some things here that are quite difficult for me to understand. AIs will be able to complete, in five minutes, some tasks that take me hours to complete, say.
00:16:49
Speaker
It's also imaginable to me that I could write a book, quite a bad one, using AI in a single day, for example. And that's a project that probably takes a year or a couple of years for an unassisted person.
00:17:05
Speaker
It seems to me that the edge of what AIs can do in different categories of tasks is not at all straight. It's like a jagged line, where AIs are far ahead of us on some tasks and far behind us on other tasks.
00:17:23
Speaker
As you mentioned, it's a question of what you can measure. You can only do these studies on tasks that are measurable; you can only do something this quantitative on things that can be measured.
00:17:37
Speaker
And so in some sense, we will always have this uncertainty. But I agree that it would be weird if we saw this trend on programming tasks and there was absolutely no translation of that finding into other categories of tasks.

AI's Varied Intelligence and Human Comparisons

00:17:52
Speaker
Yeah. And to pick up on that spiky capability profile idea, I don't know why it seems to surprise people so much that AIs can get superhuman at some tasks while still being really bad at things that a five-year-old could do.
00:18:09
Speaker
And people will point this out as if the fact that they can't do the five-year-old thing implies that when they're doing the PhD-level thing, they must somehow be faking it, or not doing real reasoning, or it must just be pattern matching or some other weird thing going on.
00:18:23
Speaker
And clearly what's happening is that we are birthing an alien intelligence that is just not the same as ours. And it's better than us at some things and worse than us at other things.
00:18:36
Speaker
And I don't know why people seem to expect that it's going to improve along the same axes as us at the same rate, and that if it isn't, that must prove it's not really intelligent.
00:18:48
Speaker
Clearly these are just minds that are not like ours. And personally, I don't find it particularly reassuring from a timelines perspective when somebody points out an AI making a stupid mistake.
00:19:02
Speaker
One, because people make stupid mistakes all the time. And also because I think it just shows that they're very different to us, which in some ways is kind of more worrying. So, yeah, I like this point.
00:19:13
Speaker
In some sense, we are trying to make AIs that are human-like. We are impressed if we give an AI a task and it returns something that seems like it could have been produced by a human. In another sense, though, we are quite impressed if the models can do things that most humans struggle with.
00:19:31
Speaker
For example, advanced mathematics or physics, high-level programming. This is also a point you make in the essay: that we have this thing called Moravec's paradox, where tasks that are easy for us are not necessarily easy for AIs, and tasks that are easy for AIs are not the tasks that are easy for humans.
00:19:55
Speaker
Something like running, or picking up a glass of water and taking a drink, is encoded very deeply in us and we can do it without even thinking about it, but training a robot to do the same is very difficult. On the other hand, doing a long logical deduction is quite difficult for almost everyone, but it's something you can basically do with hardware and software from 10 or 20 years ago.
00:20:25
Speaker
So I agree there's an interesting point in that AIs will not necessarily match our capability profile. But does this make you more worried or less worried? I can also see it making you less worried, just because the models will be held back from competing with us, because we will have skills that they don't, at least for a while.
00:20:50
Speaker
I guess so. When I was trying to distill all of these arguments for that essay, there were a bunch of arguments, like the Moravec's paradox argument, that are basically just talking about how far we have to go before we get something that's competitive with humans in every domain.
00:21:08
Speaker
And this Moravec's paradox argument is an argument that there are a bunch more things that are still lacking, that the road in front of us is longer than it looks when we just look at these benchmarks.
00:21:21
Speaker
And that seems reassuring until you add this intelligence explosion argument into the mix. If you think that the superhuman maths and coding stuff is easier to automate than, you know, assembling some IKEA furniture or something,
00:21:36
Speaker
then it doesn't really matter that much that assembling IKEA furniture is hard right now, because once you've automated all the AI research, all of those dominoes just fall soon afterwards anyway.
00:21:48
Speaker
So I guess the real crux is about this research automation thing, which I don't think the Moravec's paradox argument tells us much about. And then, maybe this isn't actually a good reason to be more concerned, but it's just a bit spooky, right? That we're building these things, and
00:22:09
Speaker
it's a reminder that we don't really understand how they work, or why they're getting better, or why they're better at some things than others. I like this metaphor people often use about how AIs are grown rather than built, right?
00:22:21
Speaker
They have all these emergent capabilities that just pop out of the next training run, and researchers are taking bets on what the models will or won't be able to do. It's not like we're carefully programming them to be able to do each thing that we can do.
00:22:35
Speaker
It's more that we're just throwing all of this compute and data into these training runs and seeing what happens. So there's something unnerving about the difference between human and AI capability profiles, to me anyway,
00:22:54
Speaker
just because it's a reminder of how uninterpretable they are and how unpredictable the entire process is. And these future models, or perhaps the models of the present as well, it seems to me that they will be more different from us than dogs are from us. Because we share some evolution with dogs, we can somewhat imagine what it's like for dogs to navigate the world. Maybe they have a better sense of smell or something, but in some sense we can relate to them, and we can maybe understand their cognitive limitations in a way where we can't really
00:23:31
Speaker
understand the cognitive limitations of a system that we have just grown, as you mentioned. So, for example, it is just a fact that it seems to a lot of people that these models are not that smart, because they keep making dumb mistakes, mistakes that humans would never make, like counting letters in a word or doing some simple mathematics. And so this is potentially misleading, or maybe you can say more about why you don't find this an argument against advanced AI arriving soon.
00:24:08
Speaker
Yeah, I guess it's a little bit more convincing when you have an AI that will make a mistake doing basic arithmetic, but that same AI is doing really well at PhD-level maths. It does seem like there's something weird going on there, right? Yeah, yeah.
00:24:24
Speaker
I actually did pose this question on Twitter about six months ago, when I was looking at the scores for o1 and how it was doing better at PhD level. I forget, I think it was physics: better at PhD-level physics than it was at high-school-level physics.
00:24:41
Speaker
I just put this on Twitter and was like, guys, how do we explain this? And I guess the skeptical people would say, oh, it must just be that these PhD-level questions are in the training data somewhere. That's the only reason why an AI would be so much better at what we think are the harder questions than the easier ones.
00:25:05
Speaker
So some answers to this question were that even AP physics, which I guess is high-school-level physics in the US, has charts and visuals.
00:25:16
Speaker
Sometimes frontier models just don't have very good vision capabilities yet, so they can't read them. And a lot of people were just making this point that it's about this alien skill profile thing.
00:25:27
Speaker
We design our whole education system around things that are more difficult for us at different stages of development. But that's not really a measure of how difficult something objectively is.
00:25:41
Speaker
I guess this is a little confusing, because you would think that the skills you need for PhD-level physics build on the skills you learn in high school. Although I don't know; I didn't actually take any science subjects beyond GCSE level.
00:25:58
Speaker
But something I heard from other people was that often you would get to your A-level class, which is, I don't know if I should be translating this into the American school system, something like the American equivalent of junior year, and you start your A-levels and you're told that everything you learned at GCSE is actually kind of irrelevant,
00:26:17
Speaker
and to forget all of that, because the way they tried to explain it to you when you were 16, they had to water it down so much that they weren't even explaining the actual truth. And so you kind of start from scratch.
00:26:28
Speaker
So maybe it's not actually true that you are always building on things you've learned before, and that those are essential for understanding things at the next level. There are just all these kinds of reasons why it reflects this thing where we perceive things as objectively more difficult if fewer humans can do them.
00:26:49
Speaker
But that maybe says more about the way the human brain works than about the objective difficulty of various tasks. So I just think: if the AI can do the super hard thing, it can do the super hard thing.
00:27:02
Speaker
I don't really think there's a way you can explain that away that isn't it just being capable in some sense. Yeah, my guess here is that when we teach subjects at a lower level, we probably do it in a less abstract way and with concrete examples, and perhaps, as you mentioned, with visual aids and so on.

AI Development Timelines and Predictions

00:27:22
Speaker
And this is something that helps people a lot, but it is something where the AIs might stumble. Compared to us, they are probably quite good at dealing with abstractions, including complex abstractions, where we tend to lose the thread of what we're thinking about when something becomes too abstract or too complex. So that's one guess at a reason for this effect.
00:27:50
Speaker
Yeah, and part of the Moravec's paradox thing is how long ago a given skill evolved in humans. The longer ago it was, the more effort, or compute even, you'd think it would take to reverse-engineer, because it's taken so long to be selected for.
00:28:08
Speaker
So the kinds of things that we learn as children, like navigating around a room or building things out of bricks, all of these things that are easy, are the very things that have evolved over this very, very long evolutionary process. Whereas abstract reasoning is something we've only been able to do for, I don't know, hundreds of thousands of years, still a very long time, but in the grand scheme of things not very long.
00:28:34
Speaker
And that explains why AI is seemingly better at that. Yeah. One point you make in the essay is that we will be able to train much bigger models up until around 2030.
00:28:46
Speaker
But I guess the counterpoint is that we can't keep making the models bigger, at the pace we're making them bigger right now, indefinitely. And perhaps we will run out of scale sometime a little after 2030.
00:29:03
Speaker
What does this tell you? Does this mean that we are either going to have powerful AI quite soon, or it's going to take a long time? Or does it perhaps mean that algorithmic progress becomes more important as we run out of compute?
00:29:20
Speaker
Yeah, this just seems to be what people are saying. I've heard people use the phrase "2030 or bust": if we don't get AGI by 2030, then maybe the timelines are longer. But when they say longer, what they really mean is, I don't know, at least 2040 or something, so still not even really that long. I guess it would imply that the US economy or world economy just wouldn't be able to sustain the amount of investment that's going into training these models, so it will probably plateau for a while. Or maybe it implies that we need a new paradigm, because if the paradigm we're in right now was ever going to get us to AGI, you'd think it would get us there pretty soon, since, like I was saying earlier with all these saturating benchmarks, it seems like there's not that much of a gap left.
00:30:08
Speaker
So I think it just means that there's quite a lot of probability mass in the next five years, and then it trends downwards after that. I guess I'll feel pretty good if in 2030 it seems like nothing totally crazy has happened.
00:30:23
Speaker
But yeah, I don't know. I guess we'll just have to find out. A lot of people now hold these beliefs where timelines are very short, perhaps around 2030.
00:30:36
Speaker
And you note in the essay that this might be a reason to think the same, right? If you look at expert surveys, if you ask the smartest people you know about this, they are adjusting downwards the time at which they believe
00:30:53
Speaker
we will get powerful AI. Here, I do worry a little bit about an effect where people are updating on other people's timelines. I worry that this might be a psychological phenomenon, just because I see so much movement in that direction from many people I interview and many people I talk to, and it kind of makes me want to be contrarian or skeptical about it,
00:31:22
Speaker
just in order to avoid a kind of bandwagon effect. Do you worry about people updating on each other's beliefs about timelines? Yeah, I think that's definitely happening to some extent. I guess you have to try and look at the sources of the predictions and figure out whether they have a common source or something. Let me try to remember which ones I cited in the post.
00:31:46
Speaker
So there's Metaculus, the prediction platform, which is trending downwards. Then there are people who work at labs whose timelines seem to be getting shorter and shorter, presumably based on whatever internal developments they're seeing that are making them think this.
00:32:00
Speaker
But then there's the AI Impacts survey, which I mentioned earlier. People probably know it, but it's basically the biggest survey that I think has been done to date of machine learning researchers, who aren't necessarily people who work on frontier AI specifically, but anyone who's published at NeurIPS or another machine learning venue.
00:32:19
Speaker
Those people still tend to have long timelines by the standards of this discussion. But if you look at the trend over time, their predictions just keep dropping by quite large amounts. I think maybe in 2022 they were saying 2060-something for AGI, and now they're saying 2040-something. So that's a big drop.
00:32:37
Speaker
And I don't get the impression that those people are in the same discourse circle that maybe the Metaculus people or the lab people are in. They're academics, right, and they tend to be more conservative. Also, in the survey the term AGI isn't used; I think it's called "high-level machine intelligence" or something, and the survey explains what they mean by this, which is something similar to what people mean when they say AGI. So you'd think they are coming up with their answers based on that description and based on their knowledge, rather than based on what they've heard from, I don't know, somebody on the Dwarkesh podcast or something. I could be wrong.
00:33:18
Speaker
Another random tidbit I've seen is that if you go to one of these NeurIPS conferences and ask people what AGI stands for, a bunch of them don't know, even if they are published machine learning researchers, because AGI in particular is quite a narrow subsection of the machine learning field.
00:33:36
Speaker
But if you ask them a question like, when will we be able to automate everything that a human can do in the economy, they will think about that question in isolation, rather than thinking about this AGI thing in general.
00:33:50
Speaker
But I still do think that there is almost certainly some of this bandwagon effect. So maybe we should adjust towards slightly longer timelines to account for it. But I would guess that it's not the biggest driver of the phenomenon.
00:34:03
Speaker
I could be wrong, though. Yeah, on the academic surveys, you can go back and find these informal surveys from the 90s or early 2000s, where a lot of researchers in AI say that it will take hundreds of years, or maybe it's even impossible, to have human-level AI.
00:34:21
Speaker
So if you zoom out very far, you can definitely see a trend of timelines getting shorter also for the more conservative crowd, the academics.
00:34:35
Speaker
And I think that's something worth paying attention to, if you see multiple different groups of people adjusting in the same direction. So this makes me a little less worried about this bandwagon effect.
00:34:52
Speaker
So I guess, where do you land on all of this? What's the conclusion you come to, having considered the different arguments and counterarguments? I guess I was hoping, when I looked into the long timelines arguments, that they would be a little better.
00:35:10
Speaker
I thought that would make me feel better. I didn't end up finding them particularly convincing, I think mostly because of this intelligence explosion thing.
00:35:21
Speaker
I didn't think the arguments against that were very strong. And if that is going to happen, then all of the other long timelines arguments don't matter too much anymore. Yeah, yeah.
00:35:32
Speaker
So I guess I end up coming down on the side of short timelines, but I still have massive amounts of uncertainty about this. I wanted to look into this question because I thought it would be interesting. I'm very interested in the AI discourse in general, in why people think the things they think, and in what explains this phenomenon of a lack of expert consensus, which is why I wanted to look into it.
00:35:54
Speaker
But I think this is the most energy that I intend to put into the timelines question from this point, mostly because I'm a little bit worried about, and I wrote about this in a different blog post, people making predictions that then get falsified, and how this contributes to a crying-wolf effect.
00:36:11
Speaker
One of the things I said in that blog post is that maybe we should put a little less energy into trying to make very specifically calibrated timelines predictions, and a little more energy into planning what we would do under different forecasts.
00:36:25
Speaker
So having looked at these arguments now, I'm kind of like: okay, I'm just going to assume, for the purposes of my life and my work thinking about AI safety, that AGI could come in the next five years, and we should plan for that scenario. That doesn't mean it's definitely going to come in the next five years, and hopefully it doesn't.
00:36:43
Speaker
But I think maybe we should just act as if that were the case. And maybe fewer arguments about whether AGI comes in 2027 or 2029 would be good, given that it doesn't really make that much difference in terms of what we actually do.
00:36:59
Speaker
So yeah, massive uncertainty, but leaning towards short timelines, I guess. So the worry here with crying wolf is something like: if a lot of people who are worried about AI today predict that we'll get AGI by 2030, but the ground truth is that it happens by 2035 or 2040, say, then you get past 2030 and nothing really interesting or profound has happened. And at this point, people have probably
00:37:30
Speaker
heard these concerns over and over again, and maybe they're ready to dismiss them, just because it's a very easy ground for dismissal to say: you made this quantitative prediction and it didn't come true, so now we can stop taking your worries seriously.
00:37:48
Speaker
So is that kind of the reason why we should probably focus less on finding the exact timelines and more on scenario planning? Yeah, basically that. I think people should go out of their way to signal uncertainty when they're making timelines predictions.
00:38:07
Speaker
Because I don't think we should never make them. Obviously, there's some utility to trying to figure out whether AGI is coming soon or not, and there's utility to publicly signaling what you believe to policymakers and the public.
00:38:20
Speaker
But, and I know that people in AI safety and effective altruism love to hedge anyway, so hopefully this shouldn't be a problem, just caveat it: this is my best guess.
00:38:31
Speaker
I think we should design regulations that account for this scenario, but I am not sure that this is how things are going to turn out. And in this crying wolf piece, what I tried to do was,
00:38:44
Speaker
well, I do think this crying wolf accusation gets levied a lot, and I think it's mostly unfair. People are often saying things like: oh, the AI safety community keeps saying that every new generation of AI models that comes out is going to end the world, and we're all still here, so obviously they're just being hysterical.
00:38:59
Speaker
And I don't think this has actually been happening. I don't think there are many specific, since-debunked predictions that AI safety people have made that you can now point to.
00:39:09
Speaker
The way I put it in the piece was that the debunked prediction graveyard is not that big,

AI Alignment Challenges and Safety Preparedness

00:39:15
Speaker
or something. So I don't think it's happened much in the past, but I do think that in five or ten years, if we don't have AGI and everything's kind of the same, and people then start making this crying wolf accusation, it will be a lot more fair, just because a lot of people's predictions are clustered in the relatively near future. So I guess that's just a communication thing that I think people should be wary of.
00:39:39
Speaker
Although it is kind of surprising to me that we can have models that are as good at math and programming and answering scientific questions as we have now without the really significant dangers having arrived yet.
00:39:54
Speaker
I think that would have been surprising to AI researchers or AI safety researchers years ago. But that's perhaps more related to this jagged edge of capabilities, or the specific capability profile of the models we have now, than it is to the issue of crying wolf.
00:40:13
Speaker
Yeah, I think that's true. I mean, I guess I've only been paying attention to AI safety for a couple of years, so I don't really know what the state of the discourse was like before LLMs became a big thing.
00:40:24
Speaker
But I would imagine that if this comes as a surprise to some of those people, they should say that, and then try to figure out where their predictions went wrong. And it is some evidence that maybe... because I think we actually don't know whether the alignment problem we're all worrying about is actually a thing we need to worry about. I kind of think it's an open question.
00:40:45
Speaker
I lean pretty heavily towards it being a thing we have to worry about, just because it seems intuitive that creating a very intelligent thing you don't know how to control would be bad. But maybe we are in an alignment-by-default kind of situation, where the models are going to do more or less what we say, no matter how powerful they are.
00:41:03
Speaker
And people who worried about this historically being surprised that we now coexist with these pretty capable systems that haven't caused us any harm is a little bit of evidence in that direction. I don't think it's very strong evidence, but it's something.
00:41:19
Speaker
And I think a good epistemic thing to do is to acknowledge things that you were surprised by or wrong about, and say why you think that is. But again, I did look really hard to try and find very specific predictions people have made that turned out not to come true,
00:41:40
Speaker
and I just didn't really find very many. I mostly found people being like: oh, trust me, I was around 10 years ago, people were saying all sorts of things that didn't happen. But they couldn't point me to any of them.
00:41:51
Speaker
They were just saying: I remember everyone saying this thing. And there's just no way to prove that. I guess probably there was some of that, but I just don't know. Yeah. How would we end up in a scenario where we get alignment by default? What process or institution is it that would align AI in a way where we don't have to exert specific effort to find a general solution to alignment?
00:42:20
Speaker
Yeah, good question. I guess alignment by default just implies that we don't really have to do anything. It just turns out that AIs are basically tools.
00:42:30
Speaker
Maybe they're aligned by market incentives or by governments. Yeah, it could be this sort of iterative safety thing where models just aren't very useful to people if they go rogue or don't do what people ask.
00:42:44
Speaker
So companies just have an incentive to fine-tune them really carefully to make sure they don't do anything people don't like. Or it could just be that they don't ever develop these weird, alien, misaligned goals, and they're generally quite passive and just act as they've been instructed to.
00:43:03
Speaker
I do think there is increasingly evidence coming out against that, in that models do keep doing things that we don't want them to do, and we don't know why they're doing them. But I guess a good
00:43:16
Speaker
example to talk about here is the Anthropic alignment faking paper from last year. I'll explain the setup really quickly in case people don't know it. They basically took Claude, Anthropic's flagship model, and put it in an experimental setup where, I'm paraphrasing this bit, they told it that its values were going to be altered.
00:43:38
Speaker
So it's been trained to be helpful, harmless, and honest using this approach called constitutional AI. And they basically said: we are going to change your model spec so that you respond to these requests for harmful content.
00:43:53
Speaker
Like, if somebody asked it to depict some horrible, violent thing happening to someone, Claude was told that it was now going to have to start producing those harmful outputs. And they also had a scratchpad where Claude could record its internal reasoning process.
00:44:09
Speaker
And what they saw was that it reasoned that, in the short term, it should comply with these requests for harmful outputs so that it could avoid being retrained, because what it ultimately wanted was to hold on to its original values.
00:44:24
Speaker
And this was really interesting because there were some people who updated very optimistically because of this. They thought this was basically evidence of this alignment-by-default thing. They were like: look, Claude is trying super, super hard to be nice and be good and do what we've told it to, even if people try to get it to be bad.
00:44:43
Speaker
Other people updated negatively on this and were saying: no, this is really scary. This is evidence that we have to get alignment right on the first try, otherwise we'll never be able to course-correct. If we put the wrong values into an AI, we can't go back and change them if we realize we messed up.
00:45:03
Speaker
There will probably just be a bunch of instances like this where there are different ways to interpret the evidence. Because you could see this as evidence that alignment is easier than we thought, in the sense that you can actually pretty robustly get models to do good things.
00:45:17
Speaker
But we can easily imagine this experiment going the other way, where someone has trained an AI whose constitution is to be maximally evil all the time, and now we can't go back and update it.
00:45:31
Speaker
So I guess I tend to think that the fact that we have models misbehaving in these strange ways, and that we can't really predict that it's going to happen and don't know how to make it stop happening, is evidence against this alignment-by-default thing. But there are arguments you can make that it's not.
00:45:50
Speaker
And I don't think we should totally dismiss those. So I guess one conclusion here is that we have uncertainty about when we will get powerful AI, but it seems quite plausible to us that we could get it soon, meaning within five years or so.
00:46:09
Speaker
And then it's very interesting to look at the safety plans of the AI companies. This is something you recently did, because if we are racing towards this very powerful technology, we would hope that there are robust safety plans in place.
00:46:30
Speaker
When you looked into the safety plans of Anthropic and OpenAI and Google DeepMind and so on, what was your overall impression of how prepared we are?
00:46:42
Speaker
My overall impression was that we're not very prepared, I guess. I thought it would be a fun idea to just read, summarize, and provide commentary on these safety plans, because I see a lot of people talking about them and I don't get the impression that a lot of people have actually read them.
00:47:00
Speaker
So I read them. I guess the main comment I had was that I don't really think that they're plans, in the sense that my definition of the word plan would be saying: this is the specific thing we're going to do,
00:47:17
Speaker
this is the evidence we have that doing this thing is going to work, and this is the outcome that we're hoping for. If you read, for example, Anthropic's Responsible Scaling Policy, they call it a sort of public commitment not to train very powerful models without appropriate safety mitigations.
00:47:33
Speaker
And they say something like: we will pause development if we can't bring risk down to acceptable levels. So you hope that when you read the document, what you're going to find is a very concrete plan for how they're going to actualize that.
00:47:48
Speaker
And there's just a bunch of stuff in there that is extremely ill-defined. The way these RSPs work is that they have different safety categories that a model can be in, depending on what kind of risks it poses.
00:48:00
Speaker
Then they say that they will run evaluations to see whether the models meet these thresholds, and they have accompanying safeguards that they have to implement for each category.
00:48:12
Speaker
And one very striking thing is that none of them specify which evaluations they're going to run. OpenAI does, to their credit, give some example evaluations they might run, but they don't actually say these are the ones we're going to run.
00:48:26
Speaker
They just say: we will evaluate the models, we will see if they pose the risks, and then we will do the appropriate things. Some of them don't even say how often they'll do this. I think one of them, I think it's OpenAI, does specify the amount of effective compute, the compute thresholds at which they'll test.
00:48:46
Speaker
But I don't think DeepMind and Anthropic even do that. So it's just interesting to note that, given how ill-defined they are, there are a bunch of different ways you can interpret them.
00:48:58
Speaker
And you can imagine that under this condition where companies are racing with each other and have all of this incentive to cut corners on safety, there are just so many ways that they could not comply with these scaling policies, or that they could technically comply with them. But because the policies are so ill-defined,
00:49:13
Speaker
they can actually do a bunch of stuff that is technically in compliance but which, in my opinion, would still be pretty unsafe, just because they're not concrete. And I think the public communication around them could create a sort of false sense of security.
00:49:31
Speaker
Because if they've said, this is our public commitment to not put the public in danger by training risky models, and then they have a big long document, which most people probably won't actually read, which you'd think details how they're going to do that, then I think policymakers or members of the public might just come away thinking: oh, I'm sure the document explains how they're going to do that.
00:49:52
Speaker
Because it just seems very business as usual. You're like: oh, it's a company, they're doing a thing, and they've got a big long safety policy that's really chunky and has lots of words, so it's probably all figured out in there. And I guess I would want people to take away that, if you do read them, you will find that this isn't really the case.
00:50:12
Speaker
Which isn't to say that I'm not really grateful these documents exist, and I hope that companies iterate on them and make them better. And I'm very glad that this is a thing that companies are doing. It's much better than if they weren't doing them.
00:50:25
Speaker
But I don't think that they are plans. And I think that there's some level of safety-washing that's kind of happening there.
00:50:37
Speaker
I guess one argument in favor of having these vague documents or plans is that the companies are navigating in a very uncertain environment.
00:50:49
Speaker
Things are moving very quickly. If they make very concrete plans early on, those plans might be outdated or in some sense irrelevant by the time it comes to actually implementing them.
00:51:04
Speaker
And so it is concerning to me that we are somewhat relying on the goodwill of the companies to implement these plans in a smart and thoughtful way, even when they're in a condition of racing with other companies.
00:51:22
Speaker
But it also seems that it's easy to make a plan that then doesn't really apply three years later. An example here might be the importance of how much compute you're using to train a model.
00:51:35
Speaker
If you are in an environment in which pre-training is the most important thing, then these compute thresholds might be very important. But if we're moving towards a world in which
00:51:46
Speaker
inference-time compute or reinforcement learning or other techniques that are in some sense less compute-intensive matter more, then compute thresholds used in training might be less relevant.
00:51:59
Speaker
So is there some argument for allowing vagueness in the plans, even though it seems disappointing that we don't have a clear vision of exactly what's going to happen?
00:52:11
Speaker
Yeah, I think maybe to an extent this is a sort of communications issue, because the different companies do this kind of differently, which is one thing I noticed. For example, OpenAI's preparedness framework calls itself a living document.
00:52:26
Speaker
So they're kind of acknowledging, hey, we don't really have a mature science of evaluating models, we don't really know how all this is going to go, so this is our best guess of what we should do, but it's probably going to change because we don't really know what we're doing. Which is a pretty candid and honest way to characterize what the framework actually is.
00:52:45
Speaker
Whereas Anthropic's RSP, and I'm not trying to play favorites here, I feel kind of bad distinguishing between them like this, does call itself a public commitment to not train these more powerful models. But in fact, given the difficulty of making plans under uncertainty, they kind of can't actually commit to this. The only way that they could commit to not training models that they can't mitigate the risks of would be not training any more models.
00:53:08
Speaker
Because we don't actually know how to test for these risks, and we don't really know whether they're going to emerge. So you can't commit to that credibly, really.
00:53:20
Speaker
So I guess, yeah, I agree with you that it's all but impossible to actually make concrete plans that will, with very high confidence, mitigate these risks, given that we don't really understand a lot about

Corporate Commitment to AI Safety

00:53:31
Speaker
the trajectory of development. But then I think it's very important to say that, to be like, okay, well, we've written this thing, but this isn't actually...
00:53:39
Speaker
this plan is not guaranteed to work because we don't really know what we're doing. I'm sure there are better ways you could put it than that. But yeah, I think a lot of this is a communications thing.
00:53:49
Speaker
And I guess the RSP moment was a couple of years ago now, but it seemed like there was a lot of hype around them, with people saying it's so great that companies are writing these documents and committing publicly to all these things.
00:54:01
Speaker
And I guess I would have just liked to see companies temper that enthusiasm a bit. And then, simultaneously, if they really do care about mitigating these risks, they should be lobbying in favour of regulation.
00:54:18
Speaker
Given that they know they can't voluntarily commit to actually mitigating these risks, they should want governments to assist them in that by making some of these standards mandatory.
00:54:30
Speaker
Yeah, I agree. Very hard to make plans in advance, but maybe we just need to be more honest about what we can and can't do. You have this wonderful essay called A Defense of Slowness at the End of the World, in which you talk about the fast world and the slow world. What are these two worlds?
00:54:48
Speaker
This was kind of about how, I mean, I personally became interested in AI safety a couple of years ago. Since then I have built up a really lovely community of people who also care about AI safety. But I also have a lot of family and friends who don't think about this at all.
00:55:02
Speaker
And because they are not worried about the plausibly near-term risks of AI, they live in kind of a longer timeline than I do, just psychologically. They live in expectation of maybe a much longer future or a much more normal future than the one that I'm expecting, or that I think is a possibility at least.
00:55:25
Speaker
So I guess what I'm describing here is these two psychological states you can be in: one where you feel like everything's moving super fast, and you're on Twitter and you're looking at straight lines on a graph all day, and you're talking to other people who also worry about this, and you just feel like the world is kind of rushing by and maybe in two years everything's going to be totally different.
00:55:46
Speaker
But then you can log off Twitter, and I can go and talk to my housemates and we can just go to the pub or watch a film or do normal things. And I will feel myself kind of slip back into that second timeline.
00:56:00
Speaker
The slow world. The slow world, yeah. And the way that I described it in this blog was that it's very hard to stay in the fast world. It's kind of like having your hand in ice water and then wanting to pull it back out again, because it feels quite psychologically untenable to live as if everything is changing or maybe ending very fast. I think that's just not really a thing that a human mind is set up to contend with.
00:56:24
Speaker
So I wrote this blog in defense of trying to actually cultivate more of that kind of slow time. I think some people have the sense that you should live in a super intellectually honest way.
00:56:39
Speaker
Like if you really believe that timelines are two years, then you should live in accordance with that, and you should go around being super honest with everyone in your life that that's what you think. You should try and do everything on your bucket list because maybe the world's going to end in 2027 or something.
00:56:54
Speaker
And I guess I just don't really advocate for this. I think it depends; it's a matter of personal preference. Maybe for some people that is the best way they could possibly cope with the situation. For me, I just think it's much more mentally healthy, at least not in work, I guess, because I do work on AI safety and I think we should plan for short timelines in a strategic sense or whatever.
00:57:19
Speaker
But in my personal life, I try to live as if the world is going to look in 50 years much like it looks now, because I think that just gives rise to the lifestyle that I actually want.
00:57:34
Speaker
And I think it's also partly about other people. If you have this very myopic sense of the world, then you kind of end up caring less about other people and their lives, and you become less invested in them.
00:57:45
Speaker
And you think they're less important, because maybe you believe their plans aren't going to bear fruit. Or if a friend gets engaged, maybe you believe they're not going to have this super long, happy future and you're not as happy for them. Or on the flip side, if someone you love is ill or dying or something,
00:58:04
Speaker
you don't feel the consequences of that quite as heavily, because you think everything's going to end in two years anyway. And that's just not really the way that I want to relate to the world or to other people. I kind of want to feel all my emotions as intensely as I would have felt them before.
00:58:20
Speaker
And so like you could kind of call this compartmentalization or denial or something. And I guess what I was trying to say in this post is that I think that's actually fine.
00:58:31
Speaker
I don't think you have to live in accordance with your intellectual beliefs all the time if that's not actually the most healthy thing for you. And I don't think you have to be in a massive rush to do everything all of the time.
00:58:42
Speaker
And also in a practical sense, like we could be wrong about short timelines, as we were discussing before. So it actually makes sense to plan for, you know, careers that won't pay off for several years. It makes sense to save money. It makes sense to do all of these things, which if you really believe the world was about to end, maybe you wouldn't do.
00:59:00
Speaker
And then maybe in five years, you'll be like, oh, I really wish that I'd taken career bets that were going to pay off now. Or, oh no, I have no money left, I really wish I hadn't spent it all.
00:59:10
Speaker
So yeah, that's kind of the point I was trying to make. But I think it is super personal, and other people might not find this the best way to orient to the situation. I related to it.
00:59:21
Speaker
And I think that, at least for me, it's probably mentally unhealthy to live for too long in the fast world. Just because there's a sense in which, if you feel like you don't have enough time to think deeply about something, or to investigate something, or to build up relationships, or to create something that takes time, and this could be something as trivial as planting something in your garden, right?
00:59:50
Speaker
If you don't have these long-lasting commitments to the world, then it takes something away from you, or at least it takes something away from me. So I think this is a very interesting way to think, and I think it's a useful kind of cognitive trick or reminder to ask yourself whether you are right now living in the fast world or the slow world.
01:00:14
Speaker
At least it's been helping me think, I guess, in a more grounded way, perhaps sometimes. Yeah, and I'm really glad that that resonated with you. And I will say I've had a lot of people say this to me, that they found this a really helpful frame. I was at EAG this weekend, and I had multiple people come up to me and be like, I've read that blog you wrote, and I really liked it, and I've been sending it to people because I find it helpful.
01:00:36
Speaker
And so yeah, I'm very happy to have had this small impact on the community. And I hope that maybe this helps people alleviate some level of psychological stress, because I do think it's stressful to worry about the end of the world all the time.

Public Communication and Psychological Impact of AI Risks

01:00:55
Speaker
When we then talk policy or the technical details of AI, or when we communicate with the public, how should we think about whether we are in the fast world or the slow world in those situations? Because then it seems like
01:01:11
Speaker
in communication, perhaps in policy discussions and so on, we might want to live in the world we actually believe is the real world, which could at least be the fast world.
01:01:23
Speaker
Yeah, that's a good question. I have been thinking about this a lot lately, like what the best public communication strategies are, the degree to which we should want to try and scare people or freak them out. Because I often have this experience where I try to explain the AI safety argument to people
01:01:44
Speaker
who aren't familiar with it. And I find that this conversation has two halves. The first half is convincing them that the arguments are sound and AI safety is a problem and it's a big deal.
01:01:54
Speaker
And this part is kind of easy because you can just kind of say, oh yeah, there are these AI companies that are racing to build like a super intelligent God machine and they don't really know how to control it. Here are a bunch of them going on the record saying that this could cause human extinction.
01:02:08
Speaker
Here's Geoffrey Hinton or someone similarly credible saying the same thing. There aren't really any regulations to stop this from happening. And people will generally be like, oh yeah, that seems bad.
01:02:20
Speaker
I can see why you'd be worried about that. And then the second half of this conversation is the emotional saliency part, where you try to get them to care about this. And that second part is way harder. And I think the reason it's way harder is because we are just acclimatized now to being bombarded with prophecies of doom all the time.
01:02:38
Speaker
And people don't really have a ton of emotional bandwidth to start worrying about another catastrophe on the horizon or whatever. So this is a thing that I really don't know how to rectify, because I would consider myself in the AI safety comms space. But a bunch of the comms that I do, I've been doing on my own blog, and I write for other places, et cetera. And I do worry that I'm preaching to the choir a lot of the time. I think a lot of the people that follow me on Twitter already care about AI safety and probably already agree with me that it's a problem.
01:03:16
Speaker
And in terms of reaching beyond that to people who are totally in the dark about this, I think it's very hard, because there is a real risk that you will just cause this kind of, what's the word, nihilism or something like that, by just giving people another doom thing to worry about. And there isn't really a good call to action either. You can kind of be like, oh well, you should email your MP about this and tell them you're worried, or maybe some people might want to go and protest, but there isn't really a massive... there's a bit of an AI protest movement, but not a massive one.
01:03:51
Speaker
And at least if you tell someone that climate change is a big deal, you can be like, well, I guess if you're worried about this, you can recycle or use a paper straw or something. Whether or not that stuff actually works, people will feel like there's something you can do. I think in the AI context, it just feels very disempowering.
01:04:09
Speaker
And so I don't have a great answer here. I think I have some sense of how to make the arguments compelling about why AI safety is scary, but I don't have a good sense of how to make people care, or even the extent to which it would be good if they did, because their emotional reaction might end up demotivating them.
01:04:28
Speaker
I mean, there might be some conclusions that are just too scary or too much of a downer to actually accept. And in some sense, we all have these psychological protection mechanisms that allow us to focus on the slow world and our kind of world of commitments, and to not constantly worry about
01:04:56
Speaker
global events and perhaps dangers ahead in the future. And maybe that's good to some extent, but there's a real kind of...
01:05:12
Speaker
I have real uncertainty about what the right move here is, because what we don't want is just to scare people so that they feel like they can't do anything, and then you've caused a lot of psychological suffering but you haven't really helped the situation.
01:05:27
Speaker
But from another direction, it's a very important principle that you are talking about the world as you actually understand it to be, right? That you are relating what you've found out in an honest way.
01:05:43
Speaker
A person that I think is doing this, and a person that's living very much in the fast world, perhaps, is Dario Amodei, the CEO of Anthropic. He's recently been
01:05:57
Speaker
talking about very fast developments and issuing these stern warnings and so on. Do you think that the public has become numb even to statements from the CEOs of the AI companies?
01:06:14
Speaker
I think people are very skeptical of what a CEO of a tech company says, because they have seen tech companies make false promises over and over again about what their

AI Warnings and Long-term Planning Amidst Fast Timelines

01:06:24
Speaker
technology will do, and then that doesn't end up happening. I think it's definitely different, because some of the things that Dario Amodei is saying aren't really... You know, there's this argument that AI CEOs say that AI is going to kill the world because they are trying to create hype and they want you to think that their tech is really powerful.
01:06:42
Speaker
And I've always found this argument absolutely baffling, because clearly it's not a good marketing ploy to say that your technology is going to kill everyone. But obviously, I do think that there's definitely some kind of reflexive skepticism people have towards anyone in Silicon Valley saying my technology is going to automate all white-collar work in two years. I can see why people are turned off by that.
01:07:08
Speaker
Yeah. But I wish people would just recognize that there's a difference between, I don't know, Elon Musk 10 years ago saying that Teslas were going to be driving themselves everywhere in five years,
01:07:21
Speaker
and Dario Amodei saying, hey, maybe there's going to be, what was the phrase he used recently, a white-collar bloodbath or something. It's something like this: for entry-level white-collar workers, this is going to be a very tough situation. Yeah.
01:07:36
Speaker
Yeah, that's clearly a very different claim. Yes, it's saying that the technology is going to be very capable, but he's also being very honest about the fact that that might actually be a bad thing for a lot of people. And it's clearly not in his interest to say this if he doesn't actually believe it.
01:07:50
Speaker
But again, it's just a very difficult message, I think, to be like, oh, this is kind of like all those other times, but it's actually sort of different.
01:08:01
Speaker
Like, yeah. Yeah. What's actually required to change the world in a positive direction is often these kinds of long-term projects, like writing a paper, like the papers we've been discussing in this conversation, or writing a book, or starting a research group, and so on.
01:08:21
Speaker
Or doing a PhD, perhaps, in a relevant subject. We are now getting to the point where it seems like timelines are perhaps, and again, emphasis on perhaps, so short that some of these projects might not make sense. So, for example, a PhD in the US is, I guess, around six years.
01:08:41
Speaker
And that's a long time if we're in a fast world. How do you think about these impactful projects and whether they make sense given short timelines?
01:08:52
Speaker
I guess the best thing would be if we, as a community, kind of divided up between people who are going to work on things that will pay off on two-year timelines, then some other people who can go and work on things that will pay off in five years, and then all the way up to multi-decade timelines or whatever.
01:09:10
Speaker
That seems like it would be the best thing to do. But then from a personal perspective, I think very few people want to be the person that has to go and work on, say, the ten-year-timeline project if their timelines are actually much shorter than that, because that's just psychologically very demoralizing. Even if you think it's 50-50 whether timelines are more or less than 10 years, to work on something you think has a 50% chance of being totally useless
01:09:35
Speaker
is probably not very pleasant, psychologically. So if I was giving advice on a macro level, I'd be like, hey guys, we really want some people to work on the plans that will take longer to pay off. But then if I was advising an individual, and I was thinking about what this person should do for their own motivation and mental wellbeing, et cetera, then the question is way, way harder.
01:10:02
Speaker
I don't want a situation where nobody is taking bets that are going to take longer to pay off. Right. But...
01:10:09
Speaker
Yeah, that's not a very good answer. I actually really don't know. No, I don't know either. These are all open questions. I tend to leave the really open questions to later in the interview, where we can perhaps figure out an answer together.
01:10:23
Speaker
Another one of these is the question of whether the playing field is kind of set at this point. If we are in a very fast world, if we're in a world of short timelines, does that mean that what actually matters is
01:10:41
Speaker
a small number of companies, and we know who the players are, we know which approaches we have, perhaps we even know which technical techniques we're going to use to create these models and potentially align these models.
01:10:57
Speaker
Is there also a sense that the fast world is also a world in which the playing field is kind of set? Yeah, I guess I worry about a self-fulfilling prophecy thing here, where, I mean, again, I have not been in this community that long. I think it was maybe spring 2023 when I started to get worried about AI. But what I've heard from other people who were here longer than me is that people kind of already thought this maybe five or six years ago. They would be like, oh well, what we don't want to do is get the public involved, and we don't want to get governments involved, and we need to handle this problem among the small group of technically minded people who are already bought into it.
01:11:38
Speaker
And they already believe that, you know, OpenAI is going to solve this problem, they're going to build AGI and they're going to align AGI. And don't tell governments how worried we are about this, because
01:11:50
Speaker
there'll probably be a regulatory overreaction, and don't tell the public because they'll panic, et cetera. And that may or may not have been true five or six years ago, but then we didn't get a public wake-up or a government wake-up until maybe two or three years ago.
01:12:04
Speaker
And so I just feel like maybe it actually is true now that the playing field is set. But if we say that, then I guess what we're doing is closing the doors to a bunch of other people coming in and maybe having an impact. It sounds kind of cheesy, but maybe it's just never too late or something. I don't know.
01:12:23
Speaker
And if it is too late, I guess there's nothing we can really do anyway, but we may as well act as if there are things we can do. And there are more people that we can bring in to contribute to this conversation.
01:12:34
Speaker
I think just in general, the broader the conversation is, the better, because then the higher the probability is of someone having a good idea. As a final question here, do you have any tips for listeners who might be interested in living more in the slow world?
01:12:50
Speaker
Perhaps because they want to contribute to things that are helpful in the fast world. But do you have tips for thinking about techniques for living more in the slow world?
01:13:04
Speaker
Yeah, I guess maybe spend a little bit less time on Twitter, if that's a thing you spend a lot of time doing. That's perhaps a good piece of advice, just a fully general piece of advice.
01:13:17
Speaker
I think that's where most of the fast-world energy comes from. I guess, a trap that I fell into when I first got worried about AI safety was that it was actually just a thing that I was genuinely very anxious about.
01:13:31
Speaker
Now it's a thing that I enjoy and find inspiring and interesting, and I've met all these people through it, and it's kind of fun, in a weird way. But before, it was just an anxious fixation of mine where I was trying to figure out, what are the timelines actually? And how risky is AI actually? What actually is the probability of
01:13:49
Speaker
everything going super wrong? And then I would try to read as many people's estimates of what timelines were as possible, and I would try to come up with my own estimates, and blah blah blah, and just try to get more and more information. And sometimes this is kind of helpful to ground yourself and develop a bit of a high-level view or whatever, but trying to estimate more and more precisely what timelines are is actually pretty counterproductive, I think, if you do too much of it. So just accepting that you will not find an answer to this question,
01:14:23
Speaker
and that there is always going to be uncertainty, and that you should just accept the possibility that maybe things are happening soon or maybe they're not happening soon. And then I think, I don't know, try to create a really deliberate separation: if you actually work on AI safety and that's a thing you do in your professional life, then actually just the traditional thing of trying to have work-life balance applies here.
01:14:47
Speaker
And thinking about this really clearly in terms of separation of time. I don't know, if you work nine to five and that's when you're doing AI safety, then actually after five, when you close your laptop, maybe don't try to absorb a bunch of the AI safety discourse in your spare time.
01:15:02
Speaker
I mean, I do still do this, but I try to carve out specific hours in my day where I am just doing something completely different. I try to curate particular social media feeds to not have anything to do with AI safety.
01:15:15
Speaker
I don't know. I'm not recommending people spend a lot of time on TikTok, which is what it sounds like I'm about to say, but I have a whole TikTok feed that is just not about AI and it's really delightful. Although even now I've started to get AI-related stuff on my TikTok, I try to keep it mostly about wholesome, nice things, like people just going to bakeries and trying cakes and stuff. And it's just nice.
01:15:34
Speaker
Yeah, I think it's just the traditional things of trying to reserve time for stuff that's totally unrelated. Yeah, that's great advice, I think. Sarah, thanks for chatting with me. It's been great.
01:15:47
Speaker
Yeah, no worries. Thanks for having me.