
Tom Barnes on How to Build a Resilient World

Future of Life Institute Podcast

Tom Barnes joins the podcast to discuss how much the world spends on AI capabilities versus AI safety, how governments can prepare for advanced AI, and how to build a more resilient world.   

Tom's report on advanced AI: https://www.founderspledge.com/research/research-and-recommendations-advanced-artificial-intelligence   

Timestamps: 

00:00 Spending on safety vs capabilities 

09:06 Racing dynamics - is the classic story true?  

28:15 How are governments preparing for advanced AI?  

49:06 US-China dialogues on AI 

57:44 Coordination failures  

1:04:26 Global resilience  

1:13:09 Patient philanthropy  

The John von Neumann biography we reference: https://www.penguinrandomhouse.com/books/706577/the-man-from-the-future-by-ananyo-bhattacharya/

Transcript

Introduction and Guest Welcome

00:00:00
Speaker
Welcome to the Future of Life Institute podcast. My name is Gus Docker, and I'm here with Tom Barnes. Tom is an applied researcher at Founders Pledge. He's also an expert advisor to the UK government on AI policy. And he's the author of the new report, Navigating Advanced AI. Tom, welcome to the podcast. Thank you so much for having me, Gus. It's a real pleasure to be here. Fantastic. Okay.

Funding and Investment Imbalance in AI Safety

00:00:24
Speaker
One conclusion you reach in your new report is that for every $1 invested in making AI systems safe, $250 is invested in making AI systems more capable. So how did you arrive at that ratio?
00:00:41
Speaker
Yeah, it's quite a concerning ratio. I think this came from looking at trends in philanthropy and in private investment in AI through 2023. So in 2023, we saw about $100 million spent by philanthropists, and in comparison, something like $25 billion spent on building generative AI systems. Of course, there are caveats to that number. Some would argue it could be slightly lower; I think, if anything, it's a conservative estimate. And now in 2024, and looking to 2025 and beyond, people are talking about hundreds of billions of dollars, maybe even trillions of dollars, spent building these systems. Meanwhile, in the philanthropic space and in the general safety space, things have really stagnated, if not declined. Think of OpenAI, which was committing to spend 20% of its compute on superalignment, which now no longer seems to be the case. So yeah, I think at best there is $1 spent on safety to $250 on capabilities.
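To make the arithmetic behind that ratio concrete, here is a minimal back-of-the-envelope sketch in Python. The dollar figures are the rough 2023 numbers Tom cites in the conversation, used here as illustrative assumptions rather than precise data.

```python
# Rough sketch of the safety-vs-capabilities spending ratio discussed above.
# Figures are the approximate 2023 numbers cited in the conversation, not precise data.

safety_spending = 100e6        # ~$100 million spent by philanthropists on AI safety in 2023
capabilities_spending = 25e9   # ~$25 billion spent building generative AI systems in 2023

ratio = capabilities_spending / safety_spending
print(f"For every $1 on safety, roughly ${ratio:.0f} goes to capabilities")  # ~250
```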
00:01:34
Speaker
And so is this ratio arrived at only by looking at philanthropic investments into AI safety research versus investments into AI companies? Or could you also look at how companies are spending their money? You mentioned OpenAI's commitment to spend 20% of their compute on superalignment, and that this commitment is uncertain now. But would you also guess that the ratio is 1 to 250?
00:02:03
Speaker
if you include money spent by the companies building AGI? Yeah, it's a good question. Unfortunately, it's not super clear how much they are spending, and even the division between safety and capabilities is controversial. I mean, if you look at the staffing of, say, DeepMind and Anthropic, they have a substantial number of staff on safety. I wouldn't know exactly, but I would guess on the order of 10 to 20% of staff. But at OpenAI it seems much smaller more recently; many of them have left. And many other companies at the frontier simply do not have a team focused on frontier risks from AI. So yeah, I would guess it's on a similar order of magnitude, perhaps 100 to 1 in terms of the companies' own investments
00:02:45
Speaker
at the frontier. But of course there's so much more being invested in many different parts of AI, and then on the safety side of things, outside philanthropy, it's really quite small. Governments have started to make some commitments: the UK government committed £100 million back in 2023 for the AI taskforce, now the AI Safety Institute, and a couple of other commitments have been made. But still, we're talking about much, much smaller sums than on the capabilities side.

Challenges in AI Safety and Capability Overlap

00:03:10
Speaker
Do you think the 1 to 250 ratio also holds for talent, so employees working on capabilities versus employees working on safety? Yeah, again, I'm not totally sure, but I would guess it's on that order of magnitude. If you look at the number of people working on these sorts of issues in companies, I would guess we see the same on talent. So we're talking at least two orders of magnitude, maybe even three, depending on your view of what counts as relevant talent. And if we include non-frontier AI research as well, it's certainly even more in that direction.
00:03:46
Speaker
Yeah, one complication here that you actually mention is that it's sometimes difficult to disentangle safety work from work on capabilities. Perhaps a good example here is that reinforcement learning from human feedback makes models more useful and helpful to people. This also helps in commercializing these models and increasing revenue, and it makes them more capable in a certain sense. So even though reinforcement learning from human feedback was originally conceived as a sort of alignment technique, it turned out to also increase capabilities, one could argue.
00:04:23
Speaker
Do you think this is a general trend across safety techniques? Could it perhaps turn out that the work that's been done on interpretability turns out to increase capabilities too?
00:04:37
Speaker
Yeah, I think it is a messy topic. Interpretability, for example, is something that is very helpful for knowing what your AI system is doing if you want to make it safe, because you want to know if maybe it's deceptively aligned, maybe it's scheming in some sense. But you also want to know what it's doing because you have commercial applications for it: you want to know if it's really understanding particular concepts in whatever sector you're deploying it in.
00:05:02
Speaker
So it's inherently dual use, in both a capabilities and a safety sense. And I think the important point to take from that is not that this is necessarily bad or net negative, although it could be, but more that, on the margin, it's just not the most useful thing to do. There are some really important topics that are much more neglected, that don't enhance capabilities but are important for safety. So, fundamentally:
00:05:24
Speaker
AGI safety and concerns about power-seeking AI, concerns about foundational, theoretical work on whether AI systems will converge on certain goals. Those are questions which don't really have a commercial application in the same way that something like RLHF does, but they're fundamentally an essential part of being able to align superhuman systems. So in the report we really focus on what the companies are doing and are likely to do, and what they are not doing.
00:05:52
Speaker
And what's an example of something that companies are not doing? Yes, so even with interpretability, we do see Anthropic doing this to some extent, DeepMind to some extent, but it's pretty limited. And then there are the really niche, more theoretical parts. If you think about it, agent foundations was previously a particular focus of the safety community, which seems to have fallen by the wayside. And more broadly, just thinking about how goal-directed systems might consider seeking power:
00:06:21
Speaker
I think some research does take place on this. For example, at Redwood Research, they do work on control, but this is still not happening in the companies. So those are the sorts of areas where I'm particularly concerned that there's a missing space. You also mention in the report that work on inner misalignment might be lacking at AGI corporations. So what is inner misalignment? How does it differ from outer misalignment? Why is it not funded?
00:06:50
Speaker
Yeah, so I think this taxonomy comes from MIRI, and some people might disagree about this typology, but broadly, inner misalignment is understood to be a situation where we try to align the AI's goals, but when it generalizes, it might do so poorly. So it might misunderstand the goal we gave it. And then, as you alluded to, there's outer alignment, which is more the question of value alignment: is it doing the thing that is truly in our best interest? At least, this is my interpretation of the split.
00:07:17
Speaker
So the inner side of things is more of a technical question which we don't really see right now, in the same way that outer alignment is more visible and more visceral as a concern. And so we might expect commercial incentives to "solve" this problem, quote unquote, as much as one can, more so than inner issues like goal misgeneralization. And I think deceptive alignment issues are more likely to fall in that bucket.
00:07:40
Speaker
So if we're going to split the alignment space into those two halves, I would really think the inner side of things is much more neglected, much more concerning, and perhaps harder to work on, but also where the real threat comes from.
00:07:54
Speaker
Although if we imagine that some corporation trains a system and, during testing, it performs as expected, but then during deployment it performs in ways that are unexpected by the company, that's perhaps an example of inner misalignment. Isn't it in the company's interest to prevent something like that before deployment? Isn't it the case that the companies have commercial interests to also work on inner misalignment?
00:08:21
Speaker
There certainly is a commercial case for doing so. And we would hope they'd spend resources on both outer and inner misalignment, simply because if a system initially performs well in testing and training but then has failures once deployed, that's bad for them too. But I think, just empirically, we see that companies are spending most of their time on scalable oversight methods that are primarily outer-focused, and thinking less about these more theoretical concerns post-deployment. And then, of course, with truly superhuman systems, maybe a failure in deployment is too late to repair. So I think that's a real area where it might be too late. So maybe the difference in incentives between outer and inner is not so substantive, but I nonetheless think that inner is more likely to be neglected.
00:09:06
Speaker
Yeah. So the classic story of why companies are not investing in safety as much as they perhaps should is that they are in a race with their competition, and their incentive is to spend most of their capital and most of their talent trying to get ahead,
00:09:24
Speaker
and therefore they're incentivized to cut corners on safety. So when you did this in-depth report, did you find evidence of this kind of classic story? Are there perhaps points we're missing when we lay out the story like I just did? Yeah, I think the strongest evidence, to be honest, is just the words of the people running these companies. Many of them say they believe that AI could be catastrophic or existential. I'm thinking about how Sam Altman said that this could be lights out for all of us.
00:09:55
Speaker
Dario Amodei has said, I don't know exactly his p(doom), but it was certainly around ten percent, maybe twenty-five percent, on that order. But despite that, these companies are racing ahead to build these systems. I think the only explanation for why one would rationally do this, if they really believe these numbers, is concern that maybe someone else will do it instead. So even in a sort of safety-motivated frame, they are racing ahead to try to prevent the other one from causing this harm. And I think this is a classic prisoner's dilemma situation, where each one is thinking, well, if I don't do it, it will be worse. So yeah, I think you can infer a lot just from hearing what these companies and their CEOs are saying, if we take them at their word, of course.
00:10:35
Speaker
Yeah. And then we have perhaps some tools from economics and game theory for trying to solve a coordination problem or prisoner's dilemma like

Government and Regulation in AI Safety

00:10:46
Speaker
this. But it doesn't seem like it's being solved, or at least it's not solved yet. Why isn't that happening?
00:10:53
Speaker
Yeah, if we look at classic cases of this previously, in other parts of economics and in political theory, of prisoner's dilemmas, often the solution is government intervention, government regulation. This often solves many of these coordination problems. So on the domestic level, you require companies to meet some kind of minimum standards so that they don't impose externalities. This is quite a common thing across many, many different sectors. So on the domestic level, between these different companies, there is a question that only a government can really answer. Of course, companies can informally make commitments to each other, meet, and try to coordinate, but most often we see that ultimately it's government that has to step in. And then the second side of this coin is, of course, the international level. So again, we have the US and China, who are also in this kind of competitive mindset, who are already thinking that they have to beat
00:11:43
Speaker
the other, that they have to build this technology first, even though they both also really think this is a threat. So we recently saw in China a few scientists coming out and saying they're concerned, and reports of the CCP internally really being concerned, with the counter often being, but we have to beat the US, which is exactly the parallel of what we see in the US. So again, the same situation here, a prisoner's dilemma. And again, the solution is often international governance forums, dialogues, summits, treaties. I think that's the only way we're going to avoid the same kind of racing dynamics.
00:12:14
Speaker
So perhaps this would be my summary of the situation: the governments, say in the US and China and the UK and so on, perceive that it's perhaps not in their interest to regulate their companies strongly, because that would hamper them in the international race between countries.
00:12:34
Speaker
And internationally, we don't really have an organization or an institution that can enforce treaties or agreements between countries in a very strong and substantial way. So that is perhaps the somewhat tragic situation we find ourselves in. Do you agree with that characterization?
00:12:53
Speaker
Yeah, I think on the international stage it is much more concerning. There is no kind of overarching governance body in the same way we have on the domestic level. But I think there are still some clusters of hope. We have had the AI Safety Summit in the UK, then Korea, and one coming up in France. And although the focus on frontier safety has been watered down a little, it is being discussed. And I'd love to move towards a world of the UN or some other kind of international governance fora, where these companies would come together, these countries would come together, and see this as a collective project. I think this is somewhat idealistic at the moment, but there are still steps that can be made right now which can really address this. And in particular, focusing on that US-China relationship, there are really positive dialogues between AI safety researchers in the US and China who are doing this outside of government, in track two dialogues. So I think that's a real source of hope.
00:13:45
Speaker
One problem with safety work at companies is that, over time, it perhaps tends to transform into or be replaced by work on PR for the companies: making sure that the companies are perceived well by the public, as opposed to tackling fundamental issues of alignment and control over these systems. Do you have an explanation for that? Why might that be happening?
00:14:12
Speaker
Yeah, I think we certainly do see it happening. I think Yudkowsky says something like, we should call it "AI notkilleveryoneism", because that is the only way we can make clear what is meant here. But yeah, I think there's certainly safety-washing happening in the companies and in government. Like I mentioned previously, the AI Safety Summit in the UK has now become an AI Action Summit in France. So even subtly, you can see this focus on safety and frontier safety being slowly eroded away.
00:14:40
Speaker
I don't think there is a really good way to combat this other than to call it out when this backsliding is happening and really question people on what they mean when they say safety.
00:14:51
Speaker
So the focus of your report is to find out what philanthropists can do with their money to support work on AI alignment and AI safety in general. To what extent do you think it's possible to work outside of the major AGI corporations on these safety issues? I'm thinking of small research organizations: how effectively can such organizations contribute?
00:15:14
Speaker
The story I sometimes hear is that it's difficult to contribute outside of the corporations because you need access to the most advanced models, you need enormous compute budgets, and this discrepancy between resources at the corporations and resources at, say, research organizations is only getting larger over time.
00:15:38
Speaker
Yeah, I think it is a concern, and I think one needs to look at where the levers for mitigating these risks are. The first is definitely these companies themselves, building up alignment techniques and thinking about safety cases. But the second, increasingly, is governments. So we have the AI Safety Institute in the UK, now also in the US, and in several other countries. So I think if one wants to work on frontier safety, they don't necessarily have to work in these companies, although the companies do have these large budgets.
00:16:05
Speaker
And then I would go back to the point about the commercial incentives for the work done in these companies. They'll be working on RLHF, scalable alignment, and similar techniques, but they maybe won't be thinking about more theoretical work, work that only matters for superintelligent systems. And so I think working in small groups can be really quite effective here. And certainly it would be a mistake to just focus on the companies; that would also minimize all the work in AI policy and AI governance that's essential here too. So I would see the companies as only one part of this problem, and I would increasingly focus on government.
00:16:41
Speaker
Yeah. And maybe maybe we should take a step back and think about why are we even talking about AI safety and alignment? This is because we perhaps foresee some risks from AI. And my listeners will have heard the the case for why advanced AI could be could become dangerous before, but I think it's and it's important to to hear it from different angles. So yeah, what do what do you think are the key risks from advanced AI and how would you how would you rank them in in terms of importance?
00:17:11
Speaker
In the report we just published, we go through a few, and your listeners will probably be familiar with many of these. But number one is this concern about misaligned, power-seeking AI systems. So what do I mean by that? I mean systems which are more advanced than humans in effectively every domain, whose goals are not aligned with
00:17:31
Speaker
the goals of their creators, so they have different intentions and different propensities than their creators. And in pursuit of those different goals, they converge on particular instrumental goals: seeking power by collecting money, by acquiring resources, by acquiring more capabilities themselves, and maybe having some kind of feedback loop.
00:17:52
Speaker
So this instrumental convergence case, which has been made so many times before, means you have systems that are both more powerful than you and have intentions misaligned with yours, which, if they execute on that, would essentially leave humans in the dust. I want to caveat that this does not necessarily mean just extinction. It could mean that humans are left extinct by AIs, but it also could mean that humans are left alone, just essentially unable to affect the long-term future.
00:18:19
Speaker
And even if humans are fairly happy and healthy on Earth, they would lose meaningful control over the long-term future. So it's not just about extinction, although that is the central case. So that's the risk that's most classically

AI Risks and Security Concerns

00:18:32
Speaker
known. But I think more recently we've seen a lot of discussion about misuse of AI systems. So, for example, AI systems which are capable of assisting people in building weapons, biological weapons, chemical weapons, or AI systems that can be used for cyber attacks. And the concern, if you're especially worried about extreme risks here, is that this can lead to catastrophe, particularly in the case of pandemics, because those can spread globally. And then I think there's this other massive, broad tent of what I would call democratic and distributional risks. On the one hand, we could have a scenario where everyone has access to AI systems, fully open source, which could cause catastrophe. That's one extreme we want to avoid. The other extreme is that only one person or a small number of people have access to these systems, so they're super powerful; they can accumulate vast amounts of wealth and power. And that is inherently concerning for many people's moral systems. You can see how that could lead to totalitarianism, to all sorts of harms.
00:19:30
Speaker
And then I'll say a couple more. I guess epistemic insecurity is another one that's often talked about. So today we talk about misinformation and disinformation, and if you take that to an extreme, you can imagine scenarios where people really can't tell truth from fiction, and that could be a massive risk factor for potential global catastrophes, pandemics,
00:19:48
Speaker
et cetera. And then finally, one that isn't talked about enough in my view is digital minds: what really happens if we build AI systems that have some kind of moral welfare, some moral value, that we forget to consider when we're focused on all these more human-centered risks. So that's what we talk about in the report. I'm happy to go into particular ones. In my view, the most concerning right now is probably the power-seeking AI, the misalignment case, but yeah, there are quite a few there.
00:20:15
Speaker
Do you think these risks will fall on a kind of intuitive timeline? And by intuitive timeline, I mean that perhaps first we might see risks from misuse, and only after we see risks from misuse might we see what we could call rogue AI or power-seeking AI. Does that make sense, or will things be happening so fast that these risks overlap in time?
00:20:39
Speaker
Yeah, I think the pace of this technology is unclear, which unfortunately makes this hard to say. I would bucket it into three parts: risks from AI systems below human level, risks at human level, and risks above human level. Already at below human level, we see AI used for low-level misinformation,
00:20:58
Speaker
maybe in election interference. It's already used for creating imagery that's sexually exploitative, those kinds of concerns. We already see that today. And if you're particularly concerned about catastrophic risks, you could imagine some of these just being scaled up in extremity in ways that could be directly concerning. Then around human level, we might see those misuse risks: misuse for cyber attacks, misuse for biological weapons. And then above human level, that's where you get to the superintelligence risks. Now, this could all happen over the course of decades; it also could happen over years, even months, between these three stages. So that's why a key focus of ours is pursuing interventions that work across all these risks, because we really can't say with confidence how likely they are or when they'll occur. So it's best to pursue policies that can mitigate across all of them.
00:21:50
Speaker
I think that's key. I also think cyber attacks using AI are perhaps already happening, or at least on the horizon. A large language model, for example, might be helpful in trying different ways to hack into a system. And we can imagine cyber attacks on critical infrastructure, and suddenly you have quite large-scale risks that seem plausible to me and not far-fetched, and it doesn't require any extreme extrapolation of AI capabilities. So I think in some sense the risks are already here, but of course there's a large difference between talking about using large language models for cyber attacks and then
00:22:37
Speaker
talking about a group of agents, or one agent, that is misaligned with human values and is more intelligent than we are.
00:22:48
Speaker
Yeah, I think with many of these risks there's a difference of magnitude rather than a difference in type. Take the cyber case: right now it might be humans using LLMs, but maybe in the future it's agentic AI systems that are multimodal and exploit many different pathways. So they're different types of AI, but fundamentally the risk is the same in kind, and hence why the solutions can often be the same as well.
00:23:13
Speaker
But can the solutions then work for something as comparatively simple as low-level misinformation today and, on the other end of the spectrum, say, misaligned superintelligence? I'm thinking in terms of what type of solution could work across the whole spectrum, from the low-level AI-generated misinformation we see today all the way to, say, misaligned superintelligence. What type of solutions could cover that whole spectrum?
00:23:45
Speaker
Yeah, I think the closest things that come to mind as truly threat-agnostic interventions are often the most boring interventions of all. There are really basic things we could be doing. One example is setting up whistleblower protections for people working in these companies, whether that's someone concerned about misaligned systems being built or just someone concerned about, say, deepfakes.
00:24:06
Speaker
Similarly, you could set up basic protocols and processes. For example, let's say a company has built a concerning AI system, whether it's concerning for misinformation or because it's misaligned. Fundamentally, that information maybe starts with the AI company, but it has to escalate somehow. Somehow they have to inform, maybe, the government. The government has to organize internally who is responsible for responding to that. How does it get to the president? Does the president need to be woken up in the middle of the night? Is it that kind of risk? These are really boring, governmenty process questions, and they really matter no matter what the risk is, and they're really missing right now. And then finally, I'll just mention evaluations, which are a common focus right now; again, you want to know how capable your AI system is no matter what the harm is. So those are just some that come to mind that I think are pretty low-hanging fruit and pretty important no matter what risk you're focusing on.
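As a purely hypothetical illustration of the kind of "boring" escalation process Tom describes, here is a minimal sketch in Python. The severity levels, thresholds, and notification tiers are invented for illustration and are not drawn from any real government protocol.

```python
# Hypothetical sketch of an escalation protocol for a concerning evaluation result.
# Severity levels and notification tiers are invented for illustration only.

def escalation_path(severity: str) -> list[str]:
    """Map an evaluation severity level to who gets notified, in order."""
    paths = {
        "low": ["AI company internal review"],
        "moderate": ["AI company internal review", "AI Safety Institute"],
        "severe": ["AI company internal review", "AI Safety Institute",
                   "relevant minister / department"],
        "critical": ["AI company internal review", "AI Safety Institute",
                     "relevant minister / department",
                     "head of government (wake at 3am if needed)"],
    }
    return paths[severity]

print(escalation_path("critical"))
```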
00:25:00
Speaker
Perhaps one other such solution that comes to mind is securing AGI corporations, both from outside threats and from the AI systems themselves escaping, you could say. When I talk about securing AGI corporations, I'm thinking about physical security at the headquarters and at the data centers where the models are being trained, but also cybersecurity, trying to prevent anyone from stealing the model weights. And perhaps the other side of that coin is to isolate the systems running at these corporations so that they can't, for example, get onto the internet, escape in other ways, or replicate themselves.
00:25:50
Speaker
That is slightly less low-hanging, and we don't have an exact idea of how we could do that. But something like cybersecurity, and of course physical security, is something I think we know how to do. Would you agree that this is in the same bucket of somewhat low-hanging fruit?
00:26:08
Speaker
Yeah, absolutely. I think it is certainly low-hanging fruit, and it's certainly one that applies to a whole range of risks, like you say. Whether the risk is that an attacker wants to get into an AI system, or the risk is that an AI system wants to get out into the world, it's the same case: you need to have good information security, good cybersecurity.
00:26:24
Speaker
And I think there are a bunch of lessons to learn here from other sectors. Governments, companies, and academia often have these kinds of concerns. We have BSL-4 labs for doing biological research, although there is a question there about how secure those are. Nonetheless, there exist at least some protocols from the past for these kinds of concerning areas. So one thing I would say is that we don't really need to reinvent the wheel with much of this stuff; we don't need to come up with entirely novel infrastructure. On cybersecurity there's a whole field to draw on, and on physical infrastructure as well.
00:27:00
Speaker
Whilst this threat is somewhat novel, the solutions don't have to be super difficult. And I think that is also important on the politics and policy side of things, when people ask, is this threat really real? Is there anything we can do about this? Why are we talking about AI when there are other risks? I think the answer is often, well, these are the same kinds of solutions, so we can often pursue them without being in tension with existing focuses.
00:27:23
Speaker
Do you think the type or level of security we see with biological labs and, say, nuclear weapons is good enough for advanced AI? I'm thinking of how we've seen a bunch of near misses with nuclear weapons, and we've seen a bunch of failures to secure biological labs. We have imperfect security with both biological labs and nuclear weapons. Can we afford such imperfect security with advanced AI?
00:27:49
Speaker
Yeah, although I don't know so much about the bio and nuclear cases, my impression is they're not perfect. They're certainly not gold standards; there are these near-miss cases, these sorts of accidents, and separately from AI, I think people should really be thinking about how to close those loopholes. But nonetheless, it's bad, but it's not nothing, and we can still learn from those domains, think about how security would ideally be applied in those places, and think about how it would ideally apply in the AI case as well.
00:28:16
Speaker
Okay, great. So we talked earlier about how the solution we might use to prevent an arms race, or a race between companies and countries, is some form of government or international regulation. In your report, you conclude that governments are not currently prepared for AI risks. Why is that? And perhaps you could give some specifics to give us a sense of how prepared they are and in what sense they're preparing.
00:28:45
Speaker
Some of this is informed by my time in the UK government over the last few months, but I should caveat that I can't go into great detail on all the ways that we're not ready, because I think that would be bad information to disclose. But in broad strokes, I think you could just look at the resourcing focused on that question. The AI Safety Institute in the UK, for example, has grown quite a lot, to around 30 people, which is fantastic. All these people are evaluating these systems, but I was on a team of three people who were actually thinking, what is the plan if we found a concerning capability? When I joined, there was one of us, and then I was the second person working on that. So I think that gives you a sense of scale: there are about three people in the UK government thinking about what we do if we find concerning issues with AI systems.
00:29:29
Speaker
I hope that's concerning enough on its own, but I guess I could talk at some level of detail about a lack of processes, a lack of clear protocols, a lack of testing, a lack of ways to anticipate these risks. So I think there's a whole bunch of gaps,
00:29:45
Speaker
which we are working, hopefully, to plug. And this is in the UK, and the UK is seen as ahead on this issue compared to the US, so I think the US and other countries are probably behind that. And as someone who joined part-time and ended up working on this stuff, it really concerned me that that was the case. I really hope there will be established institutions for this. The slight caveat, or the positive side, is that there are existing crisis protocols and emergency planning in place for everything from flooding to pandemics already, and there is some level at which AI risks can dock into that infrastructure. But nonetheless, the number of people thinking about this is very, very small.
00:30:26
Speaker
So that's the number of people thinking directly about AI risks, using the terminology and the conceptual frameworks that we also use. But I'm wondering, there is a whole national security establishment, in the UK, the US, and basically all countries, where some questions of AI risk might fall into the kinds of risks that they handle. Do you have any sense of whether the national security establishment is aware, and whether they're taking these kinds of risks seriously? Yeah, so I think there's certainly broader awareness that these are risks, and there are positive signs on that front. The one negative, of course, is that just because people are talking about it and thinking about it does not mean it actually is going to be solved. The analogy I would give is the COVID response back in February and March 2020: pandemics were talked about loads in the national security communities in many, many parts of government, in the UK, the US, and elsewhere.
00:31:30
Speaker
And yet we still had a pandemic, and I think those institutions failed dramatically. So it's one thing to get them to notice this is a problem, but it's another massive challenge to actually get them to solve the problem. So I would not take comfort just from some people thinking about it, especially given the complexity of AI systems, the novelty of the threat, and the lack of government expertise on these problems. That's where my real worry comes from.
00:31:55
Speaker
There was a moment right after, or somewhat after, the launch of ChatGPT when there was a lot of attention from world leaders and governments on AI progress and also AI risks. Do you think governments are limited in their attention span and, with political election cycles, suffer from short-term thinking? Are governments keeping their focus on these issues?
00:32:22
Speaker
Yeah, I should caveat that this is personal opinion only, but no, I think there really is short-sightedness here. I think you're right. Following ChatGPT, we had the FLI letter, of course, and subsequently much talk about regulation and safety, and in 2023 we saw the establishment of the AI Safety Institute, which is very exciting. But priorities change, there are other risks in the world, and governments have to divert their attention. And now I think we really do see that starting to happen. AI is still a very hot topic in general, but most of the conversation is about how we can use AI to spur innovation, to get economic growth, to make public services work in the UK. And I'm sure in the US, depending on the outcome of the election, we'll see how that debate goes, but it's not clear that either side is specifically focused on frontier risks from AI.
00:33:12
Speaker
Having said that, I think there are positive signs in terms of the level of agreement on this issue. Even if it's not at the top of everyone's agenda, there's widespread agreement that this is a concern. The Conservative Party was the one in power when I joined, and then the Labour Party subsequently came in. Both have continued to support the AI Safety Institute; both have talked about regulatory options for AI. The other scenario is that we get some kind of warning shot, some event where these AI issues materialize. And whilst that would be tragic, there's hopefully an opportunity there to really put this back on the agenda, to say, no, these risks are serious, they're real, and then subsequently get the interventions needed. I guess if we look at it from the politician's side, from the perspective of a world leader, you're faced with an enormous array of potential issues. There are experts warning about all kinds of risks.
00:34:04
Speaker
And maybe there's some reasonableness in being somewhat short-term in your thinking, because the situation is changing so fast that you have to respond to what's in front of you. There are several ongoing wars in the world right now. So when I speak of the government being somewhat distracted or engaging in short-term thinking,
00:34:27
Speaker
it's not that I can't see that that might be reasonable, it's just that, of course, we are interested in holding governments' attention on the risks from AI. Yeah, I think that's right. And politicians have to rush off, sometimes hour to hour, day to day, dealing with the scandal and the news cycle, and I would not expect them to be on top of every risk over the next ten-plus years. Having said that, I think the level of severity of these risks really needs to be
00:34:52
Speaker
brought home. The other concerning side of this is not just the politicians, but often the civil service in the UK, the bureaucracy behind them, which also suffers from the same institutional problems: short-termism, lacking the right expertise. Many people in the UK government will move between different sectors; one year they may spend some time on AI, but then they'll move on, so there's a lack of depth of specialism there.
00:35:16
Speaker
And I think one thing I had hoped when I joined the government is that there's some secret kind of backdoor they show you, and it's like, this is where the real experts are working on national security risks. And frankly, there's not; there really isn't. So whilst there are issues with politicians not talking about this, there are also wider institutional failures here. Do you think governments in general have trouble attracting technical talent? And do you see this changing with the new AI safety institutes?
00:35:47
Speaker
Yeah, I mean, I think it's a massive challenge. And I think the AI safety institutes are incredibly rare and impressive in that they have been able to attract such brilliant talent; that is extremely unusual for government. Most of the time, the people with that expertise would be paid far, far less than in industry. And they're also still underpaid compared to what they could be earning.
00:36:05
Speaker
It's not quite such an extreme disparity. And also, standard bureaucratic rules and processes around hiring can really prevent some of the best people getting in. And you really see the knock-on effects of that, where policies and regulations are discussed without technical grounding. So I think it's a real concern, but credit has to go to the AI Safety Institutes in the UK and the US and other places for trying to push back against that tendency.
00:36:31
Speaker
When you're advising the UK government on how to prepare, what is your main message? What do you ask them to focus on? I know you've written about resilience; what does that mean, and how do you bring that up?
00:36:44
Speaker
Yeah, absolutely. So with the preparedness and resilience work in particular, I'll go back to this: it doesn't really matter which particular risks you're concerned about, nor how likely they are, or even when they're likely to arise. The key message is that there are basic, simple policies, actions, and protocols we can set up now which are pretty costless. They don't harm innovation, they don't decrease the chance of growth or capabilities, but they really do help. They are basic things. Like I mentioned: when should the Prime Minister or the President be called at 3am? There should be an answer to that question. There should be a plan. What are we going to say to a company if they build a dangerous system? Are we going to ask them to delay the release? Are we going to ask them to delete the model, shut down the model? These are basic questions. And whilst there are technical questions too about how you concretely do those things, there should also be a basic process answer about who reports to whom and how that information flows. So again, that's the pitch: this is quite simple and costless, and it's also standard. Pandemic preparedness is very similar, and we should have the same kind of mindset with AI preparedness. One thing we haven't mentioned is wargaming exercises. How could doing such exercises help?
00:37:55
Speaker
Yeah, I think these can help in several ways. One is just making visceral to senior officials and politicians what we're talking about, so they can really sit down and feel that weight of responsibility. You know, if it's 2026 and they are responsible for dealing with, let's say, a scenario where OpenAI has built an AI system.
00:38:13
Speaker
It's tested as dangerous on cyber or bio risks, and we know this, and they're planning to release it in two weeks. What do you, the politician, do? What are your next steps? Really sitting them down and thinking about that makes them feel the risk, among other things. And then there are direct learnings from that. You can look at what we would ideally do in that kind of scenario, what we would currently be doing, and then you notice the gap, and there is a gap, and then you can hopefully close that gap through various means.
00:38:43
Speaker
How do you conduct this kind of wargaming exercise, practically speaking? What is it? Are people looking at a screen? Is it a table with papers? How does it look? Yeah, it's very exciting, I think, at least from my perspective. Generally, how these are conducted is quite basic: a PowerPoint slide, a list of scenarios, a list of actions to walk through, and then a challenge, say, what if you try to call the company but they don't pick up?
00:39:10
Speaker
What if the company is based in China rather than the US? You can vary the scenario in various ways, and you want a plan that both works for the scenario and is robust to many different scenarios. So you don't want to index only on, say, a US company; you don't want to focus on only one risk. And that's really what I think resilience means: being prepared no matter what the risk, no matter where it's based. And by stress-testing different plans, you can hopefully build that.
00:39:36
Speaker
It is interesting how we need to practice these kinds of low-level practical and logistical details of how we might respond to an AI crisis. But when you look at historical cases, during the World Wars and during the Cold War, you might want to respond to a crisis as a government, but then the person responsible for implementing that response is not reachable.
00:40:06
Speaker
And maybe that's less of a problem in our world, where we have so many ways of reaching people. But I'm sure it's a good idea to think through the practical and logistical details of how you would actually respond.
00:40:20
Speaker
Yeah, absolutely. And hopefully it doesn't really matter who the person is; there should be a clear process agnostic of the person. And as the saying goes, plans are useless, but planning is essential. That's the slogan I always give to people. We might be wrong about the exact scenario, we might be wrong about the timeline, but if there exists a plan, if there's something in a drawer somewhere in the back end of a government office that says, this is what we do, its existing alone is hopefully going to be more helpful than it not existing.
00:40:48
Speaker
Yeah, I mean, we're talking about what we could call low-level failures, but could it be that a crisis arises and government officials do not remember the plan that's been made, or do not actually act out that plan? Do you think the plans we're making now would actually be useful when a scenario arises?
00:41:09
Speaker
Yeah, my honest opinion is there's a good chance they're not, but it was not a mistake to have done this, just for some probability that they are. And it really requires refreshing institutional memory that this exists, that this is a problem. I think one of the lowest-dignity ways the world ends is something like: we had a plan, we knew this risk was coming, we maybe even had a solution that would have kept the risk way down, but through some dumb error, someone forgot to email the plan, someone forgot to read the plan. These are the sorts of basic errors I foresee happening if we don't have the basic muscle of knowing what we'll be doing in a crisis. Is there some process for briefing high-level government officials on these risks? Say, when a new minister or prime minister comes into office, I'm assuming they're briefed on various issues.
00:41:58
Speaker
Is there a way to present them a package with information on AI risk? Yeah, I'm not too close to the specifics. I understand that when a Prime Minister in the UK is elected, one of their first responsibilities is to decide what they would do if there were a nuclear weapons strike. Within the first hours of coming into the building, that is what they have to decide, after 24 hours of not sleeping or something. So it's an interesting question. I don't know if AI features, and how it features, in those early days for a new Prime Minister or new ministers. But my expectation is that there is some kind of briefing about, here are the risks of AI, here are the responsibilities for them. But I'm unsure about the details on that.

International Cooperation and Dialogues

00:42:41
Speaker
So you work with, and advise, the UK government. How do you think efforts on AI safety in the UK differ from efforts in the US?
00:42:50
Speaker
Yeah, I think I mentioned this earlier, but the UK is in some sense a thought leader in this space. It was the first country in the world to set up an AI safety institute and the first in the world to host an AI safety summit. So it has really taken that international lens and tried to bring these countries together, because the truth is that the US and China are by far the most dominant competitively in terms of capabilities, but the UK has a leading role there as a sort of middle power. So that's how it's really differentiated itself.
00:43:19
Speaker
I think you can compare that to, say, the EU, where the AI Act was passed, covering frontier, foundational AI models, but also many applications of AI, covering a whole bunch of areas in one very large piece of legislation. Whereas the UK has not, to my knowledge, passed any kind of legislation specifically for AI, although there are relevant parts of other bills that might apply.
00:43:41
Speaker
And then in the US, there is the executive order that Joe Biden signed back in 2023, and also an AI Safety Institute set up as part of that. So I think the UK has definitely led, and there is certainly a kind of bipartisan consensus on the issue, but in the US we'll wait to see what happens in November with the election.
00:43:58
Speaker
Of course, with Kamala Harris, we should probably expect something similar to Biden's executive order to stay in place. With a Trump presidency, it's very unclear what happens with AI safety. There are people who are trying to raise this issue, but it's still pretty unknown right now.
00:44:14
Speaker
Being slightly pessimistic about safety work in the UK, you could say that since all of the AGI corporations are concentrated in the US, concentrated in California, that's where the real action is going on. So something like the bill SB 1047, which is in process right now, is something that could actually change things, and work in the UK will be less impactful. Do you think that's an accurate picture of the situation, or do you think that UK safety work is impactful? Yeah, I think it's a fair question how useful the UK can be in this debate. Certainly all the companies at the frontier are in the US, with the exception perhaps of DeepMind, of course, which is based in London. So I think the US is ultimately of prime importance. But even when it comes to informing what the US does, one can look to what other countries are doing, including the UK. And so I think the safety and security institute that has been set up in the US was
00:45:09
Speaker
inspired by the UK. I do think SB 1047 is going to be particularly interesting, both if it passes and also if it informs any future federal-level regulation. So it's interesting to see how different countries and different states are doing this differently. These are all ahead of the federal government, and hopefully they'll see how the UK does, how California does, how the EU is doing things, and try to take the best from all those jurisdictions. So to that extent, differentiation and diversity can be useful, just to try out different regimes. Yeah, and I think there's a real effect of governments taking inspiration from each other, of going to international meetings, and you want your country to be as well prepared and as much at the forefront as, say, the UK. And so I do think there's an effect there of
00:46:01
Speaker
the UK perhaps inspiring other countries into action? Yeah, I think there is some kind of unusual, and maybe this is a cynical view, perspective that it doesn't matter what we're first in as long as it's something. That's probably true for many countries: they can call themselves a world leader, and the UK happens to be a world leader in safety, just given the particular circumstances. So my cynical take is that countries are pretty happy to be a leader in something, and it just happened to fall this way. But hopefully that can also inspire others.
00:46:31
Speaker
So you mentioned that you worked both under a Conservative administration and under a Labour administration. What are the differences in terms of right wing and left wing here? It doesn't seem obvious to me that there would be a politically partisan difference, but I'm guessing that such a difference actually does arise.
00:46:51
Speaker
Yeah, I think certainly at the high level there's a lot of agreement between the Conservatives and Labour that this is an issue. I think the way to compare and contrast the approaches is that the Conservatives set up the AI Safety Institute and did the summit, but their view was very much that regulation is too early: we need to test these models, we need to be a leader in the space, but without necessarily going straight to hard regulation. Whereas the Labour Party committed to regulating the most powerful AI models. What that means in practice is an open question, but they have certainly committed to doing regulation in a way that the Conservatives did not. So maybe this fits with some ideological preferences around the role of the state versus the free market, those kinds of questions. But
00:47:33
Speaker
I think despite that, there is a remarkable level of consensus. As I say, often the line is that we need to balance the risks and the opportunities, and that you can only get the opportunities if you deal with the risks. I think that's a line both parties are pretty happy to sign on to.
00:47:46
Speaker
That's somewhat reassuring to hear. In a sense, you could predict that, say, a right-wing government would be less inclined to regulate a technology than a left-wing government would be. But I think if we are talking about very advanced AI, say superintelligence, this is outside the bounds of normal politics. And I think there should be some ground for agreement between both sides here. It doesn't seem obvious to me that it would be a particularly conservative position that we should not control superintelligence, say.
00:48:21
Speaker
Yeah, I completely agree. It is, as I say, one of those few areas where we have remarkable agreement. In the US as well, I think the Republicans have said they're concerned about AI safety and have talked about things like deepfakes, though the way those messages are communicated looks slightly different. Maybe Republicans complain about Silicon Valley elites and liberal elites, and maybe Democrats are more concerned about issues like bias and discrimination, but nonetheless,
00:48:46
Speaker
on the frontier level there's a surprising level of agreement. And the same goes for the US and China. Recent reports suggest that Xi Jinping himself is personally quite concerned about the existential side, and essentially the same debate is happening there as in the US about balancing those risks against trying to get ahead of each other. The other thing is that we thankfully saw the US and China meeting for the first time in Track 1 dialogues, I think on military AI issues, which is a real success.
00:49:14
Speaker
Yeah, perhaps explain for the audience what the difference is between Track 1 and Track 2 dialogues. Yeah, absolutely. Track 1 dialogues are more formal briefings directly between heads of state or parts of government, here between the US and China. Then we have Track 2 dialogues, which are more informal, led not by the state but by think tanks and organizations around governments, but not inside government. They can offer a more direct line of communication: if Track 1 dialogues break down, Track 2 still exists, and this informal kind of cooperation can go on.
00:49:47
Speaker
So on the issue of military AI, Founders Pledge has, as we've talked about previously, funded organizations working on this. And thankfully we've now seen Track 1 dialogues, with the US and China meeting directly to discuss this, and also Track 2 dialogues happening between Chinese AI scientists and US AI scientists. So there is a glimmer of hope that this can ultimately become a Track 1 issue. But even if not, the fact that there is dialogue at all is very promising at a time when the US and China are not talking about many other issues.
00:50:17
Speaker
And what is the value of having Track 1 dialogues going on? What is the symbolism there, and what does it signal to the administrations of both countries?
00:50:30
Speaker
I think there can be enormous power in just having the meetings take place at all. Diplomatically, it means there is a shared agreement that this is an issue, and it can really change the debate. Domestically in the US right now, a lot of people say we have to build these AI systems because otherwise China will. But if you have a Track 1 dialogue where both countries say these systems are concerning, they are dangerous if built at the frontier, and if we exceed certain capabilities we'll agree to certain conditions, maybe stopping, maybe pausing, whatever the requirements are; if such a thing happens, then the whole domestic debate in the US around "we have to build this because otherwise China will" just gets shut down, because there's essentially a federal-level agreement that that won't be the case.
00:51:13
Speaker
So that's the real power of Track 1. And even with Track 2 dialogues, you can informally get some of the kinds of commitments and agreements that really change the debate and also slow those racing dynamics. Yeah, although I think even if you have Track 1 dialogues, there might still be the issue of trust, right? We have official communication, perhaps friendly and cooperative communication, but how do you then verify that these good intentions are actually being implemented and not being broken? I guess that's probably an unsolved problem, but do you see Track 1 dialogues helping in that direction?
00:51:50
Speaker
Yeah, certainly. If you can at least make the agreement at all, that's the first step. Then there's the question of how you verify such an agreement. There are academic research papers on verification methods and how you would actually be able to check. For example, if the agreement was that you can't build certain AI systems, how do you actually make sure? Are they training them? Where are the compute clusters, and how do you know what they're being used for? There are open questions about how you do that, and alongside the agreement itself, you also need the research to figure out concretely what to do. But I often think you can solve the question of how to do it after you've agreed on what you want to do. If the US and China are both motivated to solve this problem, then they can invest resources in it and think about verification methods that will work. But that usually requires the first step: an agreement that this matters at all.
00:52:41
Speaker
We might be somewhat lucky in that giant AI training runs are quite resource intensive and take up a lot of space on the ground; maybe they are visible from satellite imagery. So it might be easier to verify whether such training runs are taking place than it is to verify nuclear weapons stocks, for example.
00:53:03
Speaker
Yeah, for as long as we have these giant training clusters, it certainly might be an option. One question is, as the cost of compute decreases, will they still be the same size? And if training is distributed across many different centers, it might be harder. But even then, maybe open-source intelligence can see whether there are spikes in energy usage in particular parts of a country; are those signals that suggest training might be happening? Again, that gives some kind of indication for verification. But ultimately, I think what we need is more formal commitments and agreements, where maybe inspections or something similar could be required. And you have to balance that with the security side of things, because of course we don't want people using their opportunity to inspect to gain access to the AI models themselves.
00:53:49
Speaker
Another option here might be compute governance at the hardware level. You could call US export restrictions on advanced AI chips an early or embryonic version of this, but we can imagine a scenario in which chips carry restrictions on which training runs can be done using them.
00:54:14
Speaker
Absolutely. There's a really excellent report that came out a few months ago from the Centre for the Governance of AI on this question, about the analogies between compute governance and the governance of uranium and nuclear weapons. They really do offer a route for governance. It's still more of a theoretical idea right now; it's not something the US is doing beyond, say, the chip export controls. But in the future it could be, especially if there's some kind of agreement there, and that could be quite powerful.
00:54:41
Speaker
As I mentioned, your report focuses on how philanthropists can use their money to make sure that developing advanced AI goes well. One option you mention here is building out government capacity. This means equipping policymakers to build the regulations required to meet this kind of grand challenge.
00:55:02
Speaker
I do wonder whether we have historical examples of this actually working, examples of some philanthropist sponsoring a buildup of state capacity, essentially.
00:55:15
Speaker
Yeah, absolutely. I think the clearest case would be back in the 1990s, when the Soviet Union had collapsed and there was a real question about what would happen to its nuclear weapons. Philanthropists were the first to notice and pay attention to this problem, because if those nuclear weapons were not decommissioned properly, with government oversight, they might fall into the hands of rogue states or terrorists. There was a real concern there, and it was philanthropists who first paid attention. They then worked closely with government and with Congress in the US, often writing large parts of the bill that was eventually passed as an act. So it's a really strong case: because of that philanthropic intervention, we managed to avoid this problem of loose nukes. Who knows how many lives that could have saved compared to if it had become an issue. So that's a clear case in the nuclear space. In terms of
00:56:06
Speaker
AI, I think it's yet to be seen. The AI Safety Institute is one example where philanthropists initially talked about AI safety as an issue, got it on the agenda of politicians, and subsequently governments invested in the issue and set up these institutes. The theory of change is harder to see there, it's not as strong a link, but philanthropists can raise the alarm, and governments can answer by setting up such institutes.
00:56:32
Speaker
Perhaps in that vein, you also talk about facilitating cooperation, both between leading AI corporations and between countries. Why would facilitating coordination be important, and why is it a good option for philanthropic effort?
00:56:49
Speaker
Yeah, absolutely. This goes back to the question about racing dynamics and how we can prevent an arms race on AI, both at the domestic and the international level. In terms of the role for philanthropy there, I think the Track 2 dialogues are especially promising. They're a place where philanthropists really are the only ones who can step in, because governments cannot fund Track 2 in the same way; it would cloud the perception of the Track 2 process in many respects,
00:57:13
Speaker
and private companies aren't going to. So it really is a space that only philanthropy can fill on the international side. And then domestically, there's setting up forums, conferences, and meetings between AI companies to get them to come to agreements, voluntary commitments, and those kinds of things. There's the Frontier Model Forum, for example, that's been set up. These sorts of institutions can be supported with philanthropy; again, we shouldn't expect private companies to do it, and we've seen reasons why governments won't, so it's really the only remaining option. And you mention in the report that most AI risk is driven by coordination failures.

Coordination Failures and Race Dynamics

00:57:51
Speaker
Why is that the case? And you also write that these coordination failures are probably going to become worse over time, so maybe explain that too.
00:58:02
Speaker
Absolutely. The first thing to note is how frequently coordination issues show up across the various risks we talk about. In the power-seeking AI case, a key issue is that a company wants to deploy a model that may be unsafe because they're concerned about another party, so we have a coordination issue there.
00:58:18
Speaker
In the case of biological weapons, we often face the issue of many people being able to access such tools, so there is widespread unilateral access. In military AI issues, and in issues of democratic and distributional risks, again the key issue is not a single actor but multiple actors interacting with each other. Also, in the case of AI in particular, as the cost of compute comes down and the cost of building these models decreases, we get more and more actors. Maybe one or two companies will build these systems safely, but we don't know if the tenth or the twentieth or the hundredth will, and at some point we get to a coordination issue. And finally, the number of people who actually want these catastrophes to occur is extremely low; thankfully, there are very, very few people who want these things to happen.
00:59:08
Speaker
So what that tells you is not that there is some unilateral actor acting evil or reckless; more often, multiple people, acting in good faith, end up causing many of these issues. So again and again, I think coordination is one of the key sources of risk.
00:59:24
Speaker
And the reasoning there would be: others might act irresponsibly, so we have to race to get to advanced AI first so that we can prevent them from acting irresponsibly. And then everyone is racing, even if everyone is well-intentioned and actually takes these risks seriously.
00:59:44
Speaker
Exactly, yeah. And I think it's a real tragedy that this happens, and we've already seen it. Ultimately it's flawed logic; there are decision-theoretic reasons to suspect this is a bad course of action, because they're reasoning in a poor way about how others will act, and there are alternatives they could be pursuing in terms of other decision theories. Also, if we look at the Manhattan Project, for example, it's a similar case: the weapon supposedly had to be built because maybe Germany was doing it, and then that was no longer the case; it didn't have to be built for Japan; and then Russia, the Soviet Union, was the threat. There was always some kind of imagined enemy. I think the vast majority of physicists there acted with good intent, but ultimately they took part in this weapon being created. There were a few rare examples of people who said they wanted to step away, because even then they thought the logic was
01:00:33
Speaker
flawed. And so when we see people taking a similar path, for example leaving OpenAI and saying, even though I think these risks are real, I cannot be part of a company that pursues this, I think that's a real shining example of what to do. That's interesting to me, because it seems like if you believe that OpenAI is ahead, say, and they're developing advanced AI,
01:00:56
Speaker
maybe even if you disagree with the direction the company is going, you should try to stay at the company, say if you're working on safety, and try to do the best you can. That's at least an argument I have heard, but you have some reason for disagreeing with that.
01:01:12
Speaker
Yeah, it is a debate and it's often discussed. I do come down on the view that, in practice, this is just really hard to do well. If one goes into these companies thinking they're going to change them from the inside, I think that's really difficult and maybe ineffective. You're working against strong incentives, which is not a position you want to be in.
01:01:33
Speaker
Exactly, exactly. If your job and your livelihood are on the line, it's very, very hard to speak up. And maybe there's a consequentialist type of reasoning that says you can change things, but I also think some non-consequentialist values about integrity, about having some red lines you would not cross, are just really, really important. From a basic moral perspective, even if you think the logic holds up on paper, you have to have a line somewhere.
01:02:00
Speaker
I do worry, though, that what you then end up with is only the people who are least concerned about safety working at the most advanced corporations. Yeah, that is certainly a risk, and one has to try and weigh up whether that's worse than leaving. But firstly there's the integrity point. And secondly, if you can really make it publicly known that this is a risk, then even people who are less concerned might start to be nervous, and there might be knock-on effects that are hard to see in the moment. Often there will be a spectrum where the people who are most concerned leave first, then the next group is slightly less concerned, and so on. And with OpenAI, there have been a lot of people who have resigned who are not all motivated very strongly by safety, but who are sympathetic and are now more nervous. So I think there's a lesson there.
01:02:48
Speaker
Let's try to step back for a moment and ask: what does success look like here? We've talked about a bunch of challenges that we face. What does it mean, for example, to say, okay, we have solved these problems to our satisfaction and we should now try to actually develop and deploy advanced AI?
01:03:10
Speaker
Yeah, it's a good point, and I'm glad for the question. If you spend too long on the risk side of things, you need reminding of what this is all for. Ultimately, the question is: if we succeed, what happens? Poverty is a massive issue, inequality, destruction of the environment, climate change; all of these could plausibly be solved with AI systems if they're sufficiently smart. But I think that's just the lower bound. Of course, if you think about the long-term future, if you're interested in
01:03:38
Speaker
seeing humanity really thriving as a civilization across the galaxy, the potential is just astronomical. I'm fundamentally of the view that that would be a really positive thing, and that's why it's so tragic to see our attempts to get there just a little bit faster coming at the cost of these risks. There is no need to accelerate, because we will get to the stars eventually; that's my view. We really will colonize the galaxies and have flourishing human societies, and that is a real thing to look forward to. But we can't do that if we go extinct, and that is a really basic point. I am really frustrated by people on the accelerationist side who think they are the optimistic people in the room. They're missing the point that we all want the same thing; we can wait a few more years. The galaxies are going to be around for trillions of years, so we can wait a little bit longer.
01:04:27
Speaker
One other approach that you mention in your report is trying to boost global resilience, where this means preparing for worst-case scenarios.

Resilience and Defensive Strategies

01:04:38
Speaker
You have some examples of this. One is contingency planning. Maybe we touched on this when we talked about war games and planning in government in general, but which institutions should do contingency planning, and how would you implement such planning?
01:04:56
Speaker
Yeah, absolutely. I think governments are the key there: federal and central governments, the UK government, and others should be at the core. Companies should also have clear protocols and practices for escalating that information. So contingency planning and crisis planning are one core component of this. Other parts of the resilience piece are having mechanisms for shutting down these systems, making it really easy both to literally pull the plug and, more broadly, having ways to rein these systems in. And the final part is thinking about ways AI systems might try to take over and closing off those loopholes. Take the mindset of an AI system trying to disempower humanity: what are some of its possible pathways? It might try to engineer a pandemic. It might try to launch a massive cyber attack.
01:05:43
Speaker
There are many other pathways, but what we can do is try to reduce that chance now, which closes off those loopholes and makes it harder for an AI system to take over. By building up those defenses, you're making us more resilient. Some people might say, well, eventually the system is going to be smarter than you anyway, so it will get past those barriers. But at the very least, we can buy some time by closing off these kinds of open threat models.
01:06:08
Speaker
You mentioned shutdown mechanisms. There, I guess the worry is that the cure might be worse than the disease. If you build a way to shut down AI development globally, what would be required to build such a system could itself be extremely dangerous to create; here I'm thinking of a kind of authoritarian control over technology. How do we balance those risks?
01:06:34
Speaker
Yeah, I think it requires one to be very clear about exactly when you would and would not use these measures, on what conditions, and also when you would no longer apply such rules. One real concern in the past, with things like terrorism, for example, is that we say we're going to bring in some kind of safety measure to prevent
01:06:54
Speaker
future attacks, and then such authoritarian policies never get removed. So being really clear on the conditions in which you'd apply such a brake, a pause, or a shutdown is really important. The other thing is that these measures don't necessarily have to be highly costly. They can apply only to companies building systems that are over a certain threshold of capability or compute and that show clearly dangerous, existential risks. Only then, and that is a really small minority of companies, do you pull the plug. And you can be very clear that this doesn't prevent them from developing models in the future; they just have to pass a certain bar and make sure they're safe. So the number of cases where you use this is really quite small, and hopefully that can hold off the threat of
01:07:39
Speaker
ending up with government surveillance. The real challenge, of course, is that as the cost of building these systems comes down, more companies come into this zone of regulation, and then you have to be more active. My hope is that we can build safe systems that avoid that, but we will see.
01:07:56
Speaker
You also write about psychological defense. This is actually an interesting framing, in my view. As I see it, as AI becomes more advanced, we will become more and more influenced by it. We see an early version of that with social media now, which is already driving some people insane, I would say, and we can imagine the AI-powered version of that. What would such psychological defense consist of?
01:08:22
Speaker
Yeah, absolutely. We see with Anthropic, for example, that they do evaluations of their AI models on persuasion and manipulation, to see if they're capable of it. One solution, and this is not well thought through, is that you could also test people to see how easily they are persuaded by AI models. Maybe some are easier to manipulate and some are harder, and you could develop mechanisms for really testing people's susceptibility to being manipulated by these systems. And you want the people who are least manipulable to be in charge of keeping the systems safe. The reason for that is, again, you might say, well, eventually the systems will be smarter than humans and could manipulate any human being. But before they can manipulate every human being, they have to get past each person, and if you can at least hold out there, you're basically making it really expensive for
01:09:10
Speaker
these AI systems to take over. It's not just a case of getting into a wet lab, building some pandemic pathogen, and releasing it. The system would also have to manipulate people, and those people are going to have really strong resistance to that manipulation. It would also have to navigate computer systems that are really quite secure,
01:09:27
Speaker
and get through all these other kinds of defenses. Each of them may fail, but you want a defense-in-depth strategy, where you're putting all these different kinds of defenses against a system, psychological, personal, cyber, physical, and you're hoping that something will succeed. Of course, who knows, we may be really bad at stopping cyber attacks from AI systems, but maybe we can still be good at psychological defenses, so the more layers we can build, the better. And these psychological defenses would be useful in a situation, say, where an AI system is trying to convince someone to plug a server into the internet, or plug this USB drive into that computer, or go make this synthetic virus for me. So you would do some form of screening to make sure that only very stable and non-persuadable individuals are in positions where they have to contain these systems.
01:10:19
Speaker
Exactly, yeah. That's the sort of idea I have in mind. As I say, it's not something we've thought deeply about, and there hasn't been much research in this area yet, but it probably folds into standard information security work as well: having people who won't give away information, making sure everything is secure and contained;
01:10:37
Speaker
it should be something that we focus on. And again, as I say, it could all be nonsense, it could all be silly, but it seems worth throwing whatever sticks at this kind of problem: as many different layers of security as possible, each of which helps prevent threats getting through. I do think longer term it might be difficult, because there will be commercial incentives to develop systems that are very good at persuasion. I think one of the first commercial applications of large language models is in sales and customer service, and there being able to persuade people is a real asset. So I do think these systems will be pretty good at it. You can imagine then
01:11:13
Speaker
the video-call version with a generated person that's even more persuasive. There are buttons you can push for humans where our psychological defenses break down.
01:11:25
Speaker
Yeah, certainly, I wouldn't think of this as a fail-safe at all. Probably at some level, superhuman systems could manipulate basically any human being. But it gives that extra layer. And if you evaluate how persuasive AI systems can be to humans, if you run tests over time and see with future models whether they're getting increasingly persuasive, whether it's 20% or 50% or 80% of people who end up believing things AI systems tell them that are false, that's a basic evaluation you can run to at least inform this. Yeah, I wonder if we could use AI to defend against AI in this particular scenario. So I'm imagining a model that is trying to preserve your sanity, trying to make sure you're making decisions that are aligned with your previously stated intentions and motives.
01:12:18
Speaker
You can imagine such a system pinging you, talking to you: should you really plug in this USB? Should you really send these instructions for creating a synthetic virus to this biological lab? And so on. Do you think we could use AI to defend against AI in this psychological sense?
01:12:36
Speaker
I think it's certainly a possibility. I don't know how much thinking is going into that, but if there are AI systems we know are able to persuade people, we may also learn how to prevent them from doing so. One challenge might be that you'd have weaker systems trying to defend against stronger systems, so you'd have to figure out whether there's some way around that. But if there's a system specialized just in psychological defenses, perhaps
01:12:59
Speaker
it could hold up against a more generally capable, stronger intelligence. So there are a lot of interesting questions there, and it's an area where I haven't really seen much thinking, so it would be really interesting to see what solutions could come.

Philanthropy and Long-term AI Challenges

01:13:10
Speaker
You manage what is called the Patient Philanthropy Fund. What is this fund, and what's the idea behind it?
01:13:17
Speaker
Yeah, absolutely. We've talked a lot about AI and the risks it poses right now, or maybe in the next three to five years. The Patient Philanthropy Fund takes quite a different perspective. It's an approach to philanthropy that's come about in the last few years, which basically asks how to have the most impact: it might not necessarily be best to spend all our resources, all our time and money, on these threats today, but rather to save for the future, to invest for the future.
01:13:43
Speaker
In the AI case, if AI systems only become capable of posing these risks in 50 or 100 years, then it would be a real mistake to have spent all our money now and be left with nothing for those futures. So the core idea, essentially, is that by saving money we can earn compound interest on it. If you put $1,000 in today at a 7% annual return, that's roughly twice as much in 10 years and about 10 times as much in 35 years. There's the example of Benjamin Franklin back in the 1790s, who put $1,000, I think it was, into trusts for Boston and Philadelphia, and by the 1990s, when that money was finally spent, it was in the millions of dollars. So there's a real historic example here where you can actually have quite a large impact.
01:14:31
Speaker
And so that's where the idea came from. At Founders Pledge, we set up this fund in 2021 based on that premise, and it's been growing; we're at approximately two and a half million dollars now. Already, roughly $500,000 of that is extra money that we would not have had available if we'd spent it two or three years ago. So merely by being patient, merely by saving that money, we now have extra capital to deploy on AI risks, biological risks, and nuclear risks that we would not otherwise have had. It's very early days, and hopefully this fund will run for centuries, but I think we're starting to see the benefits.
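To make the compounding claim concrete, here is a minimal sketch in Python; it assumes a flat 7% nominal annual return and ignores fees, inflation, and year-to-year variance:

```python
# Minimal sketch of the compounding argument behind patient philanthropy.
# Assumes a flat 7% annual return; real-world returns would vary year to year.

def future_value(principal: float, annual_return: float, years: int) -> float:
    """Value of `principal` after `years` of compounding at `annual_return`."""
    return principal * (1 + annual_return) ** years

for years in (10, 35, 100):
    print(f"After {years:3d} years: ${future_value(1_000, 0.07, years):>10,.0f}")

# Roughly: ~2x after 10 years, ~10x after 35 years, and on the order of
# $870,000 after a century, which is the scale of the Franklin story.
```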
01:15:05
Speaker
You mentioned that you might regret having spent all your money early. The other side of that coin, as I'm sure you're aware, is that you might regret having a bunch of money if you've lost control of the future because AI has taken over. And if this really is a crucial time to deploy capital,
01:15:25
Speaker
how do you make that decision? I know this is a massive question, but it's interesting not only for the Patient Philanthropy Fund but for basically all philanthropists. How should Bill Gates and Warren Buffett and Mark Zuckerberg deploy their capital, in terms of expectations of technological progress and trying to predict how crucial the current moment is?
01:15:48
Speaker
Yeah, absolutely, and as you say, it is a massive question. My answer is that I'm very uncertain about it. You have to put some credence on these AI systems being built in the next five years; I also think it could be 50-plus years, and it's not just AI but also nuclear threats and biological threats, and we don't know whether war is going to get worse or better. Given that uncertainty, it makes sense to hedge across a bunch of those different worlds. So you put some funding and resources into threats today, some into the next 10 to 30 years, and some into 50-plus years, and having that diversified portfolio is quite important. The patient fund itself is a very small fraction of the total amount of money in the AI and GCR space, but it is also the only fund focused on this. So even if you're only putting aside a couple of percent towards 50 or 100 years or beyond, that is still probably worth doing versus having literally zero percent.
01:16:39
Speaker
And there's no argument in principle against deploying much sooner if the time is right. If in the next 10 years we really were in a crisis, the fund would deploy then; it would just have to meet a very high bar for giving out all that funding. You would have to see all these other major philanthropists also donating a lot before it makes sense to add in that extra capital.
01:16:57
Speaker
You could talk about a sort of compounding in terms of impact as well. Say you're funding the UK AI Safety Institute: you fund it today, the efforts there compound over time, and so the impact compounds over time. How would you compare the compounding, so to speak, that's happening on the impact side versus pure monetary compounding?
01:17:23
Speaker
Yeah, absolutely. Ultimately we care about the impact side of things, of course. Financial returns are one component, but there are others, for example learning gains. You could spend your money today on your best guess, but maybe one, two, five, or ten years later you'll have more wisdom or foresight about what the most promising opportunities with AI actually are. That seems to be the case already: previously we didn't really know what we were doing, and now I think there are much clearer policies and interventions to fund, and that seems only possible because of those learning gains. Putting a number on it is very hard; is it 5% more a year, 10% more a year? But I think there's some kind of learning rate we should account for, which
01:18:04
Speaker
adds to the case for being patient. Even if you think these risks are in the not-too-distant future, you might at least spend five or ten more years thinking about where it's best to give, and then give, rather than giving today.
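As a rough, purely illustrative sketch of how a learning rate could stack on top of financial returns (the 7% return and 5% learning rate below are assumptions for the example, not figures from the report):

```python
# Hypothetical illustration: if waiting earns both a financial return and better
# knowledge of where to give, the two multiply together each year.
# Both rates below are assumptions chosen for the sake of the example.

def value_of_waiting(years: int, financial_return: float = 0.07,
                     learning_rate: float = 0.05) -> float:
    """Relative expected value of a delayed donation, per dollar's worth given today."""
    return ((1 + financial_return) * (1 + learning_rate)) ** years

for years in (5, 10):
    print(f"Waiting {years} years: ~{value_of_waiting(years):.1f}x per dollar")

# ~1.8x after 5 years and ~3.2x after 10, before discounting for the risk
# that the crucial moment arrives before the money is deployed.
```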

Recommendations and Resources

01:18:14
Speaker
As a final and perhaps less serious question, are there any books, concepts, or podcasts that you would recommend to listeners? Perhaps something that's fairly small and not well known.
01:18:29
Speaker
Yeah, it's a good question. My colleague Christian Ruhl, who came on the podcast previously, has recommended this book to me many, many times, and I'll recommend it again now: The Three-Body Problem. I'm a big fan of it. I've just been getting into the trilogy, and I think it has some interesting overlaps with the questions here, around future intelligences and aliens, and interesting implications there. I'm not a massive sci-fi fan, but that one I really enjoyed recently. There was also a recent Netflix show, which I wasn't as big a fan of, but it's worth checking out. That's the fiction side. Do you also have a nonfiction recommendation, perhaps? I recently read a book that's, I would say, semi-biographical, about John von Neumann, which is very interesting. It's about intelligence, with history weaved into analogies about AI. So I thought it was a very interesting read, and I would recommend it. The name escapes me, so I'll have to find it.
01:19:22
Speaker
I'm guessing it was the one written by Bhattacharya; the author's name is Bhattacharya. I might actually have it on my bookshelf behind me. As for listeners, I do recommend reading about von Neumann, who was a truly extraordinary person. Absolutely, yeah. All right, Tom, thanks for coming on the podcast. It's been a real pleasure. Thank you so much for having me.