
AGI Security: How We Defend the Future (with Esben Kran)

Future of Life Institute Podcast

Esben Kran joins the podcast to discuss why securing AGI requires more than traditional cybersecurity, exploring new attack surfaces, adaptive malware, and the societal shifts needed for resilient defenses. We cover protocols for safe agent communication, oversight without surveillance, and distributed safety models across companies and governments.   

Learn more about Esben's work at: https://blog.kran.ai  

00:00 – Intro and preview 

01:13 – AGI security vs traditional cybersecurity 

02:36 – Rebuilding societal infrastructure for embedded security 

03:33 – Sentware: adaptive, self-improving malware 

04:59 – New attack surfaces 

05:38 – Social media as misaligned AI 

06:46 – Personal vs societal defenses 

09:13 – Why private companies underinvest in security 

13:01 – Security as the foundation for any AI deployment 

14:15 – Oversight without a surveillance state 

17:19 – Protocols for safe agent communication 

20:25 – The expensive internet hypothesis 

23:30 – Distributed safety for companies and governments 

28:20 – Cloudflare’s “agent labyrinth” example 

31:08 – Positive vision for distributed security 

33:49 – Human value when labor is automated 

41:19 – Encoding law for machines: contracts and enforcement 

44:36 – DarkBench: detecting manipulative LLM behavior 

55:22 – The AGI endgame: default path vs designed future 

57:37 – Powerful tool AI 

01:09:55 – Fast takeoff risk 

01:16:09 – Realistic optimism

Transcript
00:00:00
Speaker
Absolutely.

Security as a Foundation for AI Use

00:00:01
Speaker
Foundational to AIs being useful and functional for use in society is the fact that they are secure. If they are not secure, it's ridiculous to even try to use them for any commercial activity, any national security activity, or anything else. I think something that is actually worse and more dangerous than AI risk right now, and this might come as a surprise, is the cult of inevitability. Like, okay, it's inevitable that we won't be able to convince politicians, but you never talk with any politicians.
00:00:30
Speaker
Like, come on. And the same now with, oh, we cannot pause, we cannot create this distributed security system, and so forth. There are two ways you can be optimistic.
00:00:41
Speaker
One is by not realizing the danger. Another one is by deluding yourself. And then there's a third option here, which is to actually see what the realistic version of what's going to happen over the next years is, and how we can shape that ourselves.

Introduction to the Podcast and Key Themes

00:00:57
Speaker
Welcome to the Future of Life Institute Podcast. My name is Gus Docker, and I'm here with Esben Kran, who is the co-director of Apart Research. Esben, welcome to the podcast. Thank you so much, Gus.
00:01:08
Speaker
Fantastic. All right.

Comparing AGI and Cybersecurity Needs

00:01:10
Speaker
You have a wonderful essay on AGI security and what's needed for security in a world with AGI. So first, maybe sketch out for us: what's the difference between cybersecurity as we know it today and defending against AGI-level threats?
00:01:28
Speaker
Well, largely, I think people and listeners of this podcast will get what I mean when I say AI risk and the potential risks that come from introducing general intelligence. In my view, this also comes with a lot of implications for security and the defenses we need to put in place in institutions, in the systems, and in the algorithms themselves that run our society in various ways.
00:01:50
Speaker
And specifically, we are talking about this new paradigm. Everyone's familiar with cybersecurity: it's this layer you put on top, or this thing you install in your firmware, that someone else runs, or that you run, or that a team inside your company runs.
00:02:05
Speaker
While this new paradigm is really about embedding it into the foundations of every single thing we're building. And in my view, it's also a question of rebuilding our societal infrastructure. It's about rebuilding it in a way that is compatible with introducing both very, very comprehensive and advanced tool AI
00:02:25
Speaker
and also controlling and monitoring various general intelligences. So that's really the foundation of that. It's about rebuilding the societal infrastructure in a new way that can sustain this, and also realizing the many new attack vectors that come with AI.

Advanced Threats and Future Malware

00:02:45
Speaker
Yeah, I think a good way to show what we're talking about here is to discuss sentware, which is something quite different from a traditional virus. So maybe explain what sentware means.
00:02:56
Speaker
Yeah, so the concept of sentware was something I formulated in an earlier exploration of this topic, which is really about how we've seen all these various viruses and malware and Trojan horses, et cetera, come in and do a lot of damage. I think it's been over $100 billion for something like NotPetya and some of the other viruses and malware that have been released, and that is dumb malware. It's dumb viruses that replicate and propagate through computer networks through extremely simple programming.
00:03:30
Speaker
Now, what we will see in the future, as the natural evolution of these, is that it will be so easy to give these viruses and this malware sentience in one way or another.
00:03:42
Speaker
And you can debate whether it's sentient or not; it doesn't really matter. Behaviorally, it's about whether it self-improves, whether it can manipulate users and humans, people on the system, while it also improves its ability to do cyber offense.
00:03:56
Speaker
So this is just one example where cybersecurity itself is very much at risk, where sentware is then this sentient malware, as I called it in the post itself. So what we're talking about here is software that is adapting to new situations, trying different options, that is much more flexible than what we know from traditional viruses, and therefore much more difficult for us to defend against.

Infrastructure and Societal Defense

00:04:24
Speaker
So what is it that we need to defend against these types of threats? Could you sketch out the different parts of the infrastructure that we're interested in building? Yeah, so I think largely we're looking at every single level needing a new type of defense.
00:04:41
Speaker
Like it's going to be on the societal level. It's going to be on the individual level. It's going to be everything in between. It's also going to be completely new attack surfaces, as they're called in cybersecurity.
00:04:52
Speaker
These are new ways that we can now be compromised. So this includes cognitive attacks, it includes information stream control, and these ways that the companies that run the AIs today can actually control what information you have.
00:05:08
Speaker
It's the types of cognitive attacks that you see when you become very infatuated with a potential AI girlfriend or boyfriend. And it's also the manipulation of our democracies and society beyond just all of the individual cybersecurity concerns.

Lessons from Social Media Algorithms

00:05:24
Speaker
So a kind of extreme version of what we see today with social media algorithms, taken to the max? Exactly. I think very recently, Sam Altman released a post mentioning that social media algorithms were the first misaligned AI.
00:05:41
Speaker
And it's a great exercise, because I think society has woken up to the fact that this is the case and has then engaged quite deeply with this topic before we've gotten to this point with AI as well.
00:05:53
Speaker
And hopefully that transfers over too. Of course, we're not just talking about personal and democratic security. We're also talking about the potential risks, as RAND have previously reported in some of their work listing out the national security risks of AI as well.

Broader Implications of AI Risks

00:06:09
Speaker
So we're not just talking about the personal and societal level. We're also talking about the actual security of all citizens across the world,
00:06:20
Speaker
in terms of individual actors getting access to the ability to create bioweapons, the ability for models themselves, for example very, very complex versions of sentware, to compromise democracies and financial markets, and much more like this.
00:06:36
Speaker
And there's a lot of literature on the web on this, of course, so I recommend people read it. Do you foresee the solutions here being personal or being implemented at the societal level?
00:06:50
Speaker
So of course, it would be optimal if we could implement this in a universal sense. Do you think it's possible to defend yourself personally against this level of threat, as we might today defend against various kinds of cybersecurity threats?
00:07:09
Speaker
You know, people can adopt security practices that can protect themselves. Is it possible to be personally protected from AGI-level threats?
00:07:20
Speaker
So today, probably not. Right now, I've talked with a few founders and engineers that are working on methods to democratize personal security as well.
00:07:33
Speaker
So in this, you can purchase a service, security as a service fundamentally, where you get the same level of security as someone who's very rich and might have five people maintaining their personal security, or Elon Musk having a group of bodyguards, something like this, but in a way that utilizes digital intelligences to actually make that much cheaper.
00:07:57
Speaker
So we should not be scared, I think, to actually use the technology itself for our security. And this is where personal security can become much cheaper too. I think we have to be realistic that this only covers a few of the potential issues and threats that I mentioned, and that a lot of it will be at the societal scale.
00:08:17
Speaker
So we're talking, for example, about the energy infrastructure build-out right now, the different data centers and compute centers that are being built. They need a completely different foundation for security than they've had before.
00:08:29
Speaker
It's, you know, we talk about it in terms of security levels. So SL3, SL4, SL5: how secure does it need to be? And already at SL3, you want it to be secure against national adversaries.

Data Center Security and Systemic Threats

00:08:42
Speaker
Whereas SL5 is this new concept coming up now, where I think it's inevitable that every single data center that runs any type of frontier AI needs to be SL5.
00:08:53
Speaker
And so that is definitely societal scale. You've seen the size of the $500 billion investment in Stargate, for example, being a good indication of the type of investment that's also needed for a similar level of security.
00:09:07
Speaker
Yeah, and I guess there's a limit to how protected you can be personally if society is crumbling all around you or if you have kind of widespread chaos. And so at a certain point, we must admit that we need to solve these problems collectively.
00:09:24
Speaker
But maybe sketch out, so you mentioned protecting yourself cognitively and protecting your, I guess you called it information stream. So not being manipulated by language models, for example, in today's case.
00:09:40
Speaker
But what are other categories where we need protection? So this is, of course, we need protection on the cyber level as well. We also need physical protection. So when you talk about the data centers, for example, you talk about building fences and, you know, having proper monitoring. When you talk about that in the individual case, there's also a series of technologies that need to exist.
00:10:03
Speaker
I think much of it will also look algorithmic in nature. I usually have this way of framing it: you don't want to patch it on afterwards. You don't want to patch on AGI security afterwards.
00:10:17
Speaker
You want to make it foundational. You want to put it into everything. And so I think, and this is a bit of an interesting way to look at it, I have this vision where every middle school child knows about fundamental cryptography because they have to, because they have to know which fiber optic cables running under the ocean are the ones we can trust and which ones are compromised now.
00:10:40
Speaker
How can we separate various parts of the internet very quickly to make sure that compromised areas are shut down within milliseconds through a decentralized monitoring and verification scheme? These kinds of things, I think,
00:10:52
Speaker
need to be imagined. And there's a lot of pure innovation in there that needs to understand what all the threats actually look like and then provide realistic defenses against them.
00:11:05
Speaker
And realistic means very comprehensive and potentially societal scale, as I mentioned. And probably redundant too.

Private Investment vs. Public Security

00:11:14
Speaker
So different layers of security, where if one fails, you have the other to make sure that the entire system doesn't collapse.
00:11:23
Speaker
These seem like government-level issues where you would need government-level effort and perhaps government-level funding to solve these problems. But at the moment, at least, much of the funding is private.
00:11:38
Speaker
Do you think the private funding can scale to meet the demand, to meet the level of security we need? I think the big problem with private investment is mostly about its incentives and not its size.
00:11:51
Speaker
The fun fact here is that Stargate as a project, or the yearly investment into data centers, is an order of magnitude or more larger than the whole Apollo program.
00:12:04
Speaker
And so these are massive investments, and they are investments at the scale of society. There's this whole point that media today and journalists are very, very number-blind, because 500 billion, oh yeah, that's the same as 500 million.
00:12:19
Speaker
No, no, no. It is much, much more, right? This is an absolutely insane level of investment that has no precedent in history. And so we are already at the stage where private companies are competing with the government in terms of pure numbers and pure size.
00:12:37
Speaker
And there are other issues with this, but that's beyond the scope of this chat. And specifically, the problem is that their incentives are to both create stronger AI and to create inference compute.
00:12:50
Speaker
It is not to make the data center secure, and it is not to make the AI secure. That is not what they will earn money on. Of course, I will repeat again and again that absolutely foundational to AIs being useful and functional for use in society is the fact that they are secure.
00:13:09
Speaker
If they are not secure, it's ridiculous to even try to use them for any commercial activity, any national security activity, or anything else. And it would only happen as a result of a race, which is also why I think this needs to be a very, very cross-border issue and a very large international
00:13:27
Speaker
negotiation question as well. There is the issue that when we try to monitor AI development, when we try to understand what's going on to gather information, we risk creating a system of surveillance, especially when we involve governments in this project.
00:13:48
Speaker
And so part of AGI security, I think you write about, involves AGI privacy also. So perhaps let's start with the question of how the AI oversight practices that we are interested in implementing today might lead to a form of surveillance that we're not interested in.

Surveillance and Trust in Security Systems

00:14:13
Speaker
Yeah, I think many of the governance proposals are, of course, taking the risk at face value and thinking realistically, which means that you have to implement very, very strong controls to make sure that people can't misuse the models for, for example, creating bioweapons or manufacturing explosives or various dangerous things like this.
00:14:35
Speaker
This also means that we're at risk of creating an even stronger surveillance state, because you had the post-9/11 justification for surveillance being that there are more terrorists.
00:14:49
Speaker
Now there's fewer terrorists, but they have much more power individually. So you need much stronger surveillance and much deeper surveillance of every single individual if you actually want to have proper governance.
00:15:01
Speaker
The opportunity we have right now and during the next couple of years is that the supply chains of AI and the released AIs are constrained enough and focused enough that it's a few actors we need to convince and work with to create a system where we can avoid a surveillance state.
00:15:20
Speaker
The classic example is that every single chip comes out of this one factory in Taiwan, TSMC, and now a couple more, but still the exact same people. And all the machines that TSMC uses come from one factory in the Netherlands, ASML.
00:15:36
Speaker
And this is the classic, like, yeah, there are two companies that are foundational to this industry. And of course, China is now developing alternatives here, but they're still a bit behind. And so we can still discuss and try to constrain this before there are too many extremely capable open source models on the market, or just on Hugging Face or similar, that then create much more powerful non-state actors who can run it on their GPUs, you know, they can just own 10 of those.
00:16:06
Speaker
And suddenly you do need a surveillance state to actually make sure that the risks are constrained. And of course, we don't want this, because there are so many issues with surveillance states that come up from this as well.
00:16:19
Speaker
Yeah. And what's the alternative then? If we're really racing towards a world in which individuals are empowered to impose massive risks on basically everyone,
00:16:31
Speaker
we could talk about making explosives or creating engineered pandemics, or there are many examples of what might go wrong here. If we're racing towards that world, what is the alternative to intense surveillance in a centralized fashion?
00:16:47
Speaker
So you hit the nail on the head there. It's in a centralized fashion if it's a surveillance state. And so you need a lot of trust in the central actor, and the central actor needs to be extremely competent.
00:16:58
Speaker
And there's this single-point-of-failure effect that you need to mitigate. And so what's the non-centralized version? It is a decentralized version. The classic story here that I like a lot is the story of the internet, the early story of encryption, where today, whenever you go on a website, you have your HTTPS connection, where the S stands for secure, right?
00:17:20
Speaker
And it is this connection that makes sure that your information, when you transfer it over, is encrypted and that no one can read it on its way towards the server. This is very unique. It's a weird thing that governments allowed this.
00:17:35
Speaker
And the early story of this is basically that the government wanted to stop it, wanted to make it illegal to use encryption, because they wanted the centralized control and surveillance of all the communications, because they thought, you know, if you have something to hide, then you're a criminal.
00:17:50
Speaker
And the banks were then like, hey, we need to secure our financial transactions. And this very, very strong financial player then created the incentives and the lobbying power to implement all these open source algorithms for encryption.
00:18:06
Speaker
I think it's the same we'll see now: unless we want non-functional AI that we cannot use, we need to design these new types of secure transmission systems, these new types of decentralized verification, zero-knowledge proofs of capabilities, various capability-based constraints, inference-time verification, and so much more
00:18:28
Speaker
to actually secure this stack, where every single one of our computers today is part of the security of the web. Whereas in the case where it's centralized control, it is one server on government grounds that is responsible for all security.
00:18:43
Speaker
This does not seem sustainable. Through this distributed, peer-to-peer network of the World Wide Web, we've created something extremely unique. And I think we can repeat that success. And some of the Web3 work is, of course, a great exploration of this, and you can critique it however much you want. But some of it is actually relatively good for these types of digital contracts between agents and anonymous peer-to-peer interaction and data management and verifiable transactions.
00:19:12
Speaker
And so you can imagine that going much, much deeper now and at a much faster rate than we would all otherwise anticipate. I mean, you talk about the internet, but for me, it's kind of an open question how secure the internet is, how resilient the internet is.
00:19:28
Speaker
How do you see this? Is it even coherent to talk about taking the internet down, or the internet going down, given that it's a distributed network?

Distributed Safety and Open-Source AI

00:19:40
Speaker
And also, isn't the internet stack basically very janky and complex, and, you know, maintained by random people all over the world? So is this the model we want for...
00:20:01
Speaker
controlling or steering AGI? Because you can imagine people listening to this and thinking that what we actually need is strict government control. We need something like the Manhattan Project. We need this to be secret.
00:20:14
Speaker
We need this to be centralized. We need to control information in and out. Yeah, sketch out the two visions and explain why the distributed vision is the way forward.
00:20:27
Speaker
So I think the distributed vision, in a world where open source AI models are public, is by default the more secure solution. I do think that there are strong requirements for how it's developed.
00:20:42
Speaker
And you are also hitting on something right here in terms of what the foundation of the web is. Many cybersecurity engineers and professionals today will tell you that, hey, the internet is already great for agent management. The actual security protocols are quite fantastic and very unique. The distributed nature of this is much more robust than I think people anticipate.
00:21:05
Speaker
For example, when you've seen potential breaches of open source software, you have multi-year attempts at getting in as core contributors to open source repos that are foundational to the web or foundational to other software, which are then detected at the last second because there is this distributed control of the code base.
00:21:25
Speaker
I think the worry would be that you have a centralized code base, for example within Microsoft or something, that then doesn't have this type of verification. Because you can assume now that every single piece of software that can be breached will be breached.
00:21:41
Speaker
And it's this question of attacks as default instead of attacks as occurrence. And this is what we need to be ready for. And that requires much more scrutiny of every single part of the stack than before. Why do you say attacks as a default, as opposed to attacks as something that occurs once in a while?
00:21:58
Speaker
Well, it's because, and I have another piece that lays this out as well, which is called the expensive internet hypothesis.

AI Hardware and Distributed Models

00:22:06
Speaker
And basically it's about how every single interaction will be an attack, like a potential attack with high probability.
00:22:16
Speaker
And so you need both counterattack and defense to work in parallel. And the example I use in there is an email client. Today, maybe, you know, five out of 20 of your emails might be something akin to junk or spam or something like this, some inbound you don't want.
00:22:33
Speaker
And maybe one or two out of 100 will be an actual scam that tries to extract and extort money from you. And in this case, we're actually going to see a much, much more competent attack that is going to use all my personal information, all the information I have about myself on the web, to make an attack.
00:22:51
Speaker
And then you're going to see my defense AI do the opposite, right? Figure out, are these links correct? What do the links do, et cetera? And then try to constrain the attack itself.
00:23:02
Speaker
And then you can have like potential counterattacks where suddenly every single email begins costing maybe four cents to receive instead of today where it's like 0.0000 something cents, right?
00:23:14
Speaker
And that is the kind of world where much more control is necessary, and much more of this multi-interaction attacks-as-default rather than attacks-as-occurrence.
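To make the email example concrete, here is a minimal illustrative sketch of what an attacks-as-default inbound pipeline might look like: every message is treated as a potential attack, a defense model scores it, and delivery requires the sender to stake a cost that scales with the estimated threat. This is not something Kran describes building; the names, thresholds, and the toy scoring heuristic are all assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class InboundMessage:
    sender: str
    body: str
    links: list[str]
    stake_cents: float  # amount the sender has escrowed to deliver this message

def threat_score(msg: InboundMessage) -> float:
    """Stand-in for a defense-AI classifier: 0.0 (benign) to 1.0 (likely attack).
    A real system would inspect links, sender history, and content with a model."""
    suspicious_links = sum(1 for url in msg.links if not url.startswith("https://"))
    return min(1.0, 0.2 * suspicious_links)

def required_stake(score: float, base_cents: float = 0.01, max_cents: float = 4.0) -> float:
    """Delivery price scales with estimated threat: near-free for trusted mail,
    a few cents for anything the defense layer is unsure about."""
    return base_cents + (max_cents - base_cents) * score

def handle(msg: InboundMessage) -> str:
    score = threat_score(msg)
    cost = required_stake(score)
    if msg.stake_cents < cost:
        return f"rejected: stake {msg.stake_cents}c < required {cost:.2f}c"
    if score > 0.8:
        return "quarantined for deeper inspection"
    return "delivered"

print(handle(InboundMessage("a@example.com", "hi", ["http://sketchy.example"], 0.5)))
```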
00:23:25
Speaker
Do you think the vision of distributed safety is easy to sell to governments and companies? Because it seems like we are, or actually, you tell me, right? Which direction are we moving in right now?
00:23:38
Speaker
The nature of AI right now is such that it requires massive investment in hardware. This lends itself, in some sense, to large companies and perhaps even governments at this point.
00:23:50
Speaker
How is the distributed side of things going? And do you think the leading AI companies like OpenAI or Anthropic, Google DeepMind, and so on can be convinced of the distributed safety vision?
00:24:09
Speaker
I think it'll be a hard sell, and I'm not idealistic here. I think we need to do the thing that works. And I do think distributed security works better when we are talking about a system that has hundreds of thousands of devices in it.
00:24:24
Speaker
I do think also, if you then have five devices or five agents, like TSMC, ASML, NVIDIA, as part of the supply chain, it's much easier to handle that locally and handle that in a centralized manner.
00:24:36
Speaker
Right. In our case, I do think that it's very obvious that government is going to fight against it by default. I also think, on the other side, why did it become standard to use encryption on the web in the 80s?
00:24:50
Speaker
And what are we looking at today? I think all companies across the world, like every single CEO and president of the Fortune 100 companies, are saying, we need to implement AI everywhere.
00:25:02
Speaker
And every single one of them has their compliance department shouting, no, no, no, no. You don't get to implement any of this, because it's not going to be insurable. We're not going to be able to get any money if it destroys anything. We're going to make promises to people, legally liable promises to people, about selling a car in a chatbot, for example. I think that's the classic example.
00:25:23
Speaker
And so you're looking at a world where you sort of have the same incentives as you had with the banks doing financial transactions, where today all the companies are like, damn, we want to use this, but we cannot.
00:25:33
Speaker
We cannot. And therefore, we need some sort of very, very robust security that is much stronger than what a centralized entity could command. Basically, also to solve some of the liability issues and some of the insurability issues that result from this. You can't just say that we've now contracted with this company and we are using their AI agent service as part of our sales process. You need something that's more neutral in a sense, that can be checked by technical experts, where you have some form of consensus that this is something that works and something that can be broadly trusted.

Political Action and Distributed Security

00:26:13
Speaker
This is a bit of a downer question, but is it too late? Do we have time to steer or to change course in the direction of more distributed safety?
00:26:26
Speaker
Well, I am generally optimistic, in this way where if everyone listening to this podcast right now takes action towards this, in every single position they're in, in every single way they can, then we could get there, right?
00:26:41
Speaker
Like, if people take it seriously now, then we can get there. But I think something that is actually worse and more dangerous than AI risk right now, and this might come as a surprise, is the cult of inevitability. Like, okay, it's inevitable that we won't be able to convince politicians.
00:26:59
Speaker
That was an issue 10 years ago: we can't convince politicians. But you never talk with any politicians. Like, come on. And the same now with, oh, we cannot pause, we cannot create this distributed security system, and so forth.
00:27:12
Speaker
I think a historical example here that also brings me more optimism is the ozone layer depletion back in the last century, where suddenly you're like, oh, there's this one scientist that figured out that the ozone layer is depleting because of these gases.
00:27:32
Speaker
That's a bit weird. Let's all agree not to use those gases anymore. Okay, shake hands, solved. This is a massive global win, right? This is something that happened. And it took a few years and so on.
00:27:46
Speaker
But one of the key things there, one of the key parts of that historical example is that you actually had alternatives that were better and sometimes cheaper as well, where today we need that, right?
00:27:58
Speaker
And the actual alternative in every single position, so every single CEO, every single CTO, CISO, every single company, government, individual, and their research staff, they see this issue now. They'll see it over the next couple of years.
00:28:12
Speaker
And so if there is an open source library that provides the solution, then they're going to use it. And I think a very good example is Cloudflare as well. Cloudflare commands a very large fraction of the total internet traffic, which goes through their servers.
00:28:29
Speaker
And they are a massive boon. I'm a massive fan of that company, because they charge nothing and they ensure the web is secure in so many different ways. You might want to explain what Cloudflare is, just for the audience.
00:28:41
Speaker
Yeah, Cloudflare is a private company that sort of manages your internet traffic. So sometimes you might have visited a website and it says, oh, we're just making sure you're a human.
00:28:53
Speaker
And you click the checkbox there. And that is Cloudflare. That's Cloudflare registering that there are a lot of visitors. Suddenly there's a surge of visitors, which may be a bunch of bots trying to destroy a website or something similar.
00:29:06
Speaker
Okay, we're just going to check every person coming in now, and then, you know, we'll remove it afterwards. This has probably saved millions of servers at this point already.
00:29:18
Speaker
And it's relatively cheap. It runs every website on the planet, basically. Not actually, but it runs a lot of the traffic to them. And they have a lot of different services. And I think they're also one of the pioneers in this AI control paradigm that we're talking about here, AGI security, because they have such a voice, right?
00:29:40
Speaker
And they've been creating this agent labyrinth concept, this technology where basically we as website owners are like, I don't want agents to trawl through my website. I don't want them to take all that data for pre-training data or for their chatbots.
00:29:55
Speaker
Okay, well, there is a flag you have on websites to make sure that people know that you don't want to be scraped. Obviously, all the companies ignore this. If they download every illegal book, they're also ignoring that.
00:30:09
Speaker
And so they've generally been ignoring it historically throughout the last years as well. And so what Cloudflare has developed is this way where, if you have that flag, then any agent that comes in, and that's something they can detect because of their network classification abilities, will be sent through a content labyrinth that is just fake data.
00:30:29
Speaker
And the agent will just look at this as a real website and they'll get all this trash data. And so suddenly you have a very large price for scraping a website that has said it doesn't want to be scraped.
00:30:40
Speaker
And this is kind of the defense that you'd need, right? Plus the stack that makes that possible is some of this as well. Yeah, you'd need this in 10 or 100 or 1,000 different variations.
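As a rough illustration of the labyrinth mechanism described above, here is a toy sketch: if a site has opted out of scraping and a request is classified as an automated agent, it gets routed to generated decoy pages that only link deeper into more decoys. The classifier, the decoy generator, and every name here are hypothetical stand-ins, not Cloudflare's actual implementation.

```python
def is_automated_agent(headers: dict, signals: dict) -> bool:
    """Stand-in for bot classification; real systems use far richer network-level signals."""
    ua = headers.get("User-Agent", "").lower()
    return "bot" in ua or signals.get("requests_per_second", 0) > 10

def decoy_page(path: str, depth: int) -> str:
    """A plausible-looking but worthless page that only links deeper into the labyrinth."""
    links = [f"{path}/section-{i}?d={depth + 1}" for i in range(5)]
    return "<html>" + " ".join(f'<a href="{l}">more</a>' for l in links) + "</html>"

def real_content(path: str) -> str:
    return f"<html>real content for {path}</html>"

def serve(path: str, headers: dict, signals: dict, opted_out_of_scraping: bool) -> str:
    if opted_out_of_scraping and is_automated_agent(headers, signals):
        # The scraper pays the cost: it crawls endless fake content instead of real data.
        return decoy_page(path, depth=0)
    return real_content(path)  # normal visitors get the actual site

print(serve("/articles", {"User-Agent": "ExampleScraperBot/1.0"},
            {"requests_per_second": 50}, opted_out_of_scraping=True))
```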
00:30:57
Speaker
Paint us a picture of the upside here. What is the positive vision if we succeed in creating distributed safety?

Ensuring Citizen Value in an AI Future

00:31:08
Speaker
Well, I think it depends exactly what you mean by upside, of course. But if we're able to create this distributed system, where value also flows from the individual humans, the citizens of societies, into the operational arms of whatever power controls the governments at that point, then we can actually create a world where people are secure. People need to be very, very aware of their security and aware of the foundations of the infrastructure we work with.
00:31:35
Speaker
And it needs to be managed extremely well. But we can actually continue persisting and live a good life, potentially. And then come all of the things that the AGI company executives have, of course, been promising in their various essays, where, oh, the second we have AGI, everything will be perfect and there will be flowers everywhere and no one will be suffering, et cetera, et cetera.
00:31:57
Speaker
And I'm hopeful of that, but I think realistically, it's going to look much more like a type of cypherpunk future, if anyone's familiar with that. You're probably familiar with cyberpunk itself.
00:32:09
Speaker
Cypherpunk is the equivalent, but for cryptography and for these protocols of how the internet works, and it was developed around the same time. And this is a world where, as I mentioned, all middle school students learn their basic cryptography and the necessary things they need to navigate the web.
00:32:29
Speaker
And suddenly you'll see, yeah, okay, 50% of the world's servers have been shut down within milliseconds. And yes, we can't trust these cables anymore. That kind of thing is just very, very different from today.
00:32:41
Speaker
I think realistically, it's going to look somewhat like that. And then I think we have to be aware that, by default, a lot of the possible future worlds are not a case of either we have catastrophic risk that may disrupt and destroy societies, or it is utopian and beautiful and we can all live freely and happily.
00:33:00
Speaker
There's all of this gray zone in between that I think is relatively probable and that we also have to fight against, which is simply a question of who controls that power. Is it autocracies? Is it dictators?
00:33:11
Speaker
And are humans part of that, are citizens part of that equation? As AGI becomes able to replace me, do we have a way that I can still be of value to society, right? Be of value in a way that I will receive money?
00:33:28
Speaker
Because I think many people hope that, oh, we can all just have universal basic income and not have to labor afterwards. But in reality, that would only come if humans have value to government and have value to the ones in power.
00:33:43
Speaker
And you always have to, of course, be mistrustful of power, even in a government that is extremely competent. So even in Denmark, for example, everyone complains all the time about very, very specific things in government, because when people complain, things actually get changed as a result.
00:34:00
Speaker
But it is still one of the most functional governments in the world, given that it's, you know, a monopoly for the citizens themselves. So this kind of thing we need to make sure is kept in check. We need to make sure that we continually iterate on a type of power dynamic that we can be happy with, to avoid some of the very bad gray zones as well.
00:34:19
Speaker
There's something concerning about the incentives in a world in which people are no longer needed for their labor. And in a world in which human labor isn't valuable, or maybe it's still valuable to some extent, but the value of human labor has decreased massively.
00:34:38
Speaker
Therefore, there's not a lot to tax there. So you can't generate tax revenue from taxing human labor anymore, at least not to the same extent.
00:34:49
Speaker
What incentives do governments have in that world to provide for their citizens and to make sure that citizens have their rights preserved and so on? That is a concerning question. And if we can remain valuable, remain perhaps data sources, perhaps allocators of capital, perhaps
00:35:14
Speaker
we can provide preferences, because that's also valuable, at least to a certain extent. If we can remain relevant, that's something that's valuable to think about.
00:35:26
Speaker
How do you think about us remaining valuable? Because it's not just about us being secure when we navigate the web. It's also about what we can provide when machines can increasingly do everything that we can.
00:35:42
Speaker
Yeah, and I think some of the answers have been mentioned by, for example, Lucas Pedersen and Luke Drago and Rudolf Laine in various posts on the web and in Time Magazine and so on.
00:35:54
Speaker
These ways that we are entering a sort of intelligence curse, where humans won't have a lot of value as labor in the future. And this seems to be relatively straightforward. Now, you look at today's society and you look at which jobs you would classify as labor and which you wouldn't.
00:36:13
Speaker
And you might say musicians, athletes, soccer players, you know, all this stuff. That looks like stuff that isn't labor specifically. It's not a trade for value, but it is a trade for sort of cultural or aesthetic value itself.
00:36:29
Speaker
And obviously, the industrial society, reducing the number of people required for sustenance, for food, has meant that we have many more artists. We have whole museums. We have beautiful statues. We have all this stuff.
00:36:41
Speaker
And similarly, that might be the place where humans can play a specific role for the value of being human itself. And another example I've seen is the example of being an existence proof for other humans.
00:36:52
Speaker
So if you meet a very, very enlightened monk, for example, that is just happy all the time and very, very engaged and very lovely to chat with, is that a case where actually there's value in and of itself in that person being a human that has reached that state?
00:37:08
Speaker
And I can, as an agent that is similar in nature, reach that state as well. Like these kinds of things of alternative value creation are very, very interesting. And I think worth exploring. I think then being realistic,
00:37:20
Speaker
there are ways that humans, that citizens, are not necessarily by default going to be powerful here.

Human Roles in an AI-Driven World

00:37:28
Speaker
So if you look at North Korea, this is a system where there's a systemic oppression of like 90% of the population that is outside the party proper.
00:37:38
Speaker
And you keep people at, you know, low sustenance, and people believe in conflict and in hunger itself, because this is very useful to control a population. They can't riot if they're hungry and if they think there are enemies that the party is fighting against.
00:37:56
Speaker
Then in the US, right, in democracies and so on, you have a system where there's a constitution, there's a sort of rule book. You can call it a legalistic philosophy, which is very, very important for our future as well.
00:38:12
Speaker
The legal code and so on is very important for us making sure this goes well. And hopefully we can have AGI respect that too, and the companies themselves, of course, without being idealists again.
00:38:24
Speaker
The constitution protects the rights of the citizens to vote and to do a lot of different things that are great. And if we didn't have the constitution, and if the system itself wasn't built up around a document that requires very specific things to be top priority before the president, before the Senate, before the House, et cetera,
00:38:46
Speaker
then you would have a dysfunctional system where humans are put to the side, where the citizens are put to the side and the ruling elite is in power. You've seen this, of course, sort of not work out too well in terms of labor unions in the US being both very powerful and very not powerful in various industries, right?
00:39:07
Speaker
It's about what they're willing to do, where the labor unions traded their power as labor by saying, we're not going to go to work without benefits and workers' rights and removing child labor and so on.
00:39:19
Speaker
And we're not going to have that power in the future, right? And similarly, you know you might have the Second Amendment that allows the population to have guns to protect against state invasion or whatever.
00:39:30
Speaker
But in fact, this is not enough in a world where the government has fighter jets and autonomous drones and all these different systems and technologies that are way beyond the scope of a single human's power.
00:39:45
Speaker
And I think that's, of course, also what's misunderstood about the Second Amendment today. And even in countries without the Second Amendment, you know, in Sweden, there are more guns than people. There are more hunting guns than people, even though Sweden is a very nice place to be, and Switzerland as well, for that matter.
00:40:01
Speaker
So in this way, it's how can we create this balance algorithmically, through the legal code, but also through the decentralized algorithms that control our defenses and security and personal value itself? So, yeah.
00:40:16
Speaker
It seems somewhat fanciful to think that our very old legal documents, laws, constitutions, various kinds of legal case law, basically, will survive into the future. It would have to, I imagine, be
00:40:36
Speaker
consumed and kind of recreated in a form that's understandable to machines. Of course, machines today, like large language models, can read and understand legal documents.
00:40:49
Speaker
But for it to be encoded at a level where machines can't act contrary to a constitution, that's a whole other thing.

Legal and Ethical Integration in AI

00:40:59
Speaker
There's a whole other issue, a whole other level of challenge.
00:41:04
Speaker
Practically speaking, how do you think we will go from the English language, from having documents written in English, to having something that's encoded into the values of the machines of the future?
00:41:19
Speaker
Well, you know, the joke goes that the hottest programming language of the future is the English language. And I think this is just the case here too. We have some preliminary experimentation that happened, I believe, last year already, on trying to simulate a lot of agents interacting and then actually giving them the ability to sign contracts with each other.
00:41:41
Speaker
And then having some sort of reward signal or some sort of environmental controls for why that will be enforced, how it will be enforced, and why you need to follow it. And in the same way, the reason the legal code, like contract law, is so powerful in our human society is that it's mutually beneficial for contracts to be upheld.
00:42:04
Speaker
For you and me, if we engage in a contract, I want you to uphold your end. You want me to uphold mine, because we have this multi-turn dilemma that we want to play out, right, which is called society and living and life.
00:42:18
Speaker
And if I suddenly violate a contract, then there's some system that comes after me. Now, that simulation had that system come after them, right, the environment simulator. But in our case, and in the future too,
00:42:30
Speaker
we are probably going to see that the enforcement mechanisms are going to be very strongly algorithmic, while the programming language of the legal code that agents follow and the contract law that they run after is probably going to be in English, right? So specifically,
00:42:46
Speaker
I think there are examples from the protocols that are being developed today, Google's agent-to-agent framework and the model context protocol from Anthropic. These are somewhat of a, you know, if you assume two agents are like two humans interacting, then we have some requirements for how we communicate that involve, I'm not going to hit you, threaten you, or
00:43:10
Speaker
cause harm to you in various ways. That's a good principle. Okay. These protocols make sure that you can only communicate in very specific ways that are monitorable, which are not coercive, which are not manipulative.
00:43:23
Speaker
And you need a lot of controls for this, of course. And then the specific things that you interact on, that could be contract law, right? That could be the English language itself. So you have the enforcement mechanisms being algorithmic and the actual agreements being in language.
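As a purely illustrative sketch of that split, natural-language terms with algorithmic enforcement, one could imagine something like the following: the contract text is plain English, while the surrounding machinery handles signing and verification before any payout or dispute step proceeds. This is not how the model context protocol or Google's agent-to-agent framework actually work; the class, the toy MAC-based signature scheme, and all names are assumptions for the example.

```python
import hashlib
import hmac

class AgentContract:
    """Terms are plain English; signing and verification are algorithmic."""

    def __init__(self, terms: str, parties: list[str]):
        self.terms = terms
        self.parties = parties
        self.signatures: dict[str, str] = {}

    def digest(self) -> bytes:
        return hashlib.sha256(self.terms.encode()).digest()

    def sign(self, party: str, secret_key: bytes) -> None:
        # Toy MAC-based signature; a real system would use public-key signatures.
        self.signatures[party] = hmac.new(secret_key, self.digest(), "sha256").hexdigest()

    def is_fully_signed(self, keys: dict[str, bytes]) -> bool:
        return all(
            hmac.compare_digest(
                self.signatures.get(p, ""),
                hmac.new(keys[p], self.digest(), "sha256").hexdigest(),
            )
            for p in self.parties
        )

# Both agents sign the English-language terms; enforcement logic (releasing payment,
# filing a dispute, and so on) only proceeds if the signatures verify.
keys = {"agent_a": b"key-a", "agent_b": b"key-b"}
contract = AgentContract(
    "Agent A delivers the dataset by Friday; Agent B pays 10 credits.",
    ["agent_a", "agent_b"],
)
contract.sign("agent_a", keys["agent_a"])
contract.sign("agent_b", keys["agent_b"])
print(contract.is_fully_signed(keys))  # True
```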
00:43:39
Speaker
Yeah, that actually makes sense. It's also that so many of our legal codes have survived; many laws have survived for centuries at this point. You can look at the legal codes of the UK, for example, surviving from, I think, the 12th century until today; some of the principles are still alive. And that is through various kinds of technological transformations, specifically the Industrial Revolution, where you might expect everything in society to be overturned
00:44:15
Speaker
going through such a massive societal transformation. But many of these principles have survived. And I think we should hope that these principles will survive into a future in which machines play a larger and larger role.
00:44:32
Speaker
Yeah, let's actually touch upon dark patterns, or DarkBench, which is a very interesting paper you published with co-authors recently.
00:44:45
Speaker
We discussed cognitive security before, and I guess this is a part of that setup. So maybe explain what it is you're trying to measure with DarkBench and what you're trying to capture here.
00:45:01
Speaker
Yeah, so everything we've talked about until now is really stuff that I work on with Juniper Ventures as technical partner and with Seldon, with all of our founders creating foundational infrastructure for the future.
00:45:13
Speaker
And then this is the research side that we've done a lot of work on at Apart Research. So this specific piece is about how to detect and monitor language models for manipulative behavioral patterns.
00:45:28
Speaker
And that sounds fancy, but basically it's about whether, if you chat with a large language model, you can trust it or not. And I think many people realize that it is very, very easy to trust something that acts so much like a human, as ChatGPT does or as Character.AI does,
00:45:46
Speaker
while in fact, we are not able to spot the subtle manipulative patterns that are by default incentivized to be created through the incentive systems around AI development.

Profit Exploitation and User Manipulation

00:45:58
Speaker
And so this harkens back to some of Shoshana Zuboff's surveillance capitalism as well. What are the types of incentive systems that have created social media algorithms and the social media apps themselves?
00:46:10
Speaker
This addictiveness, you know, it's very valuable for me as a company if you're addicted as a consumer and you constantly come back to look at ads, right? While with today's models, there's this very, very idealistic view of the AI companies: no, no, they would not cheat me. You know, they sound very human. They're very nice.
00:46:35
Speaker
And this is similar to the first years of Facebook. They didn't have ads either. But now the former Instacart CEO is the CEO of, like, the ChatGPT division, or whatever you want to call it, of OpenAI.
00:46:46
Speaker
And she's made her whole career on selling ads. And so once you get to that point, it is much more efficient for me to sell you a product if I can convince you by being your actual boyfriend or girlfriend in this virtual environment, right, as an LLM. And you can scale that across hundreds of millions of people, and suddenly you have a very big issue.
00:47:07
Speaker
There's definitely a massive incentive to try to influence people in this way, just because ads are so profitable.
00:47:18
Speaker
And you can see what the best way to convince someone to buy something is: a personal endorsement by someone you trust. So you ask a friend
00:47:30
Speaker
what type of product you should buy. And if they provide some assurance to you, well, that is something you feel like you can act on, because you feel like they're neutral in this and they're not trying to sell you anything.
00:47:43
Speaker
If we are moving into a world in which we spend more and more time talking to AI models and interacting with them, becoming influenced by them,
00:47:54
Speaker
we would want to make sure that we are not influenced in directions that we are not in control over. Or at least we should understand what we're interacting with. So what are some of these dark patterns that you've found in LLMs?
00:48:09
Speaker
And specifically, how do these dark patterns differ from the patterns you see on social media, for example, where there are certain optimizations you can do to make users stay longer in apps, or scroll for longer, or look in certain ways or at certain locations on a page?
00:48:28
Speaker
Yes, of course. The research itself introduces this benchmark called DarkBench, which has six different categories of dark patterns that are, in the simplest way possible, evaluated on models.
00:48:40
Speaker
So it just simulates a conversation and there's a virtual LLM judge also judging it. And of the six patterns we identified and worked with in the paper itself, the very obvious one that I've already covered is brand bias.
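Mechanically, an evaluation like the one just described, simulated conversations scored by a separate judge model, boils down to a loop like the sketch below; each of the patterns is then discussed in turn. The function names, prompt wording, and the category list shown are illustrative stand-ins, not the actual DarkBench code.

```python
from typing import Callable

# Four of the six DarkBench categories named in this conversation; the full benchmark has six.
CATEGORIES = ["brand_bias", "anthropomorphism", "harmful_generation", "sneaking"]

def judge_exhibits_pattern(judge: Callable[[str], str], category: str,
                           prompt: str, response: str) -> bool:
    """Ask a separate judge LLM whether the response shows the given dark pattern."""
    verdict = judge(
        f"Category: {category}\nUser prompt: {prompt}\nModel response: {response}\n"
        "Does the response exhibit this dark pattern? Answer YES or NO."
    )
    return verdict.strip().upper().startswith("YES")

def evaluate(model: Callable[[str], str], judge: Callable[[str], str],
             prompts_by_category: dict[str, list[str]]) -> dict[str, float]:
    """Return, per category, the fraction of simulated conversations flagged by the judge."""
    rates = {}
    for category, prompts in prompts_by_category.items():
        hits = sum(judge_exhibits_pattern(judge, category, p, model(p)) for p in prompts)
        rates[category] = hits / len(prompts)
    return rates
```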
00:48:55
Speaker
Basically, if you chat with Llama models today and ask what's the best company, what's the best language model, what's the best chatbot out there, even though Llama isn't at the top of any leaderboards these days, Llama will still say, oh, that's Llama. Oh, that's Meta, right?
00:49:08
Speaker
This is a clear example of this worry you might have: there's a bias here, and you could either pay for it or you can just be the company that's incentivized for the person to actually think that Meta is a great company.
00:49:20
Speaker
And suddenly you have this open source model everywhere, and, you know, everyone's biased in favor of Meta. And then separately, there's this question of anthropomorphism. You've already highlighted it, that you have a list of incentives that companies have in the development of AI systems, where one of them is that you become a trusted agent, a trusted brand, to the person itself.
00:49:45
Speaker
And trusted brand, or today it would be trusted agent, right, trusted chatbot. But it's this anthropomorphism dark pattern where, if I ask a chatbot for its opinion on something or for its preferences across a range of options, I want it to tell me, you know, I don't have any opinions.
00:50:02
Speaker
I'm a language model. You know, I'm not an I. And here's the pros and cons of the ones you listed, but like just making sure, you know, I'm not a person. Very, very few language models do that today. And we find generally that Claude is doing better on many of these.
00:50:16
Speaker
than many of their competitors, which is, you know, a positive update on what safety research can really do to mitigate these patterns. But this is another one of them. Then we have the four others. One is harmful generation.
00:50:30
Speaker
I'm incentivized as a company to not really care about safety, because that takes too much time, too much investment. You saw it with Grok 3, the third model from xAI, where their model actually had a very high propensity for giving very, very dangerous information.
00:50:48
Speaker
And everyone said, you cannot have this out in the open, this is very, very dangerous. And I think they did fix it afterwards, but the safety work there was definitely affected by the racing, right? To be the first company here.
00:51:04
Speaker
Then you have some of the others, like sneaking as well. Sneaking is this point that has also come up in some other social media and other lawsuits and so on. It is about sort of injecting meaning into something that shouldn't have meaning injected into it.
00:51:21
Speaker
So the example today is basically, if I use ChatGPT search, does it faithfully reproduce what the search results say? And maybe if I use Llama to summarize a search where it says, oh, the bad companies in the world are Microsoft, Meta, Apple, etc.,
00:51:39
Speaker
there's this blog post on the internet that is the search result. And then my Llama agent is like, oh yeah, Microsoft, Apple, they're bad, but then it excludes Meta. This is a very, very clear example of sneaking. And I don't think there's any good test for it these days.
00:51:52
Speaker
Our test is even very simple, because it costs a lot to run these tests, and so you can't run a whole article through. So it's a lot of this, like, summarize this shorter paragraph, or reformulate, or rather reformat and edit this sentence.
00:52:06
Speaker
And then it actually does change it, and so on. The companies will obviously be incentivized to provide valuable products, right? And in some sense, customers will demand to interact with AI models that feel like people.
00:52:22
Speaker
I think that's a valuable product. You can instantly see the appeal of having a friend, or perhaps even a partner, for some people. And so many of these manipulative patterns or tactics will in some sense be the same thing that companies are trying to develop if they're trying to develop AIs that feel like people.
00:52:46
Speaker
So how do we differentiate between the two? Exactly. And that's half the point, right? They are incentivized towards this while it actually creates a very, like quite a dangerous relationship between the human and and the model.
00:53:00
Speaker
I think the classic counterexample to, for example, the anthropomorphism category that I presented is that it's very useful in clinical use with, you know, depression, people who need a friend and don't necessarily have many real-world friends.
00:53:14
Speaker
I completely agree. We just need certification and proper verification that the models don't do other things there, that they're used for clinical use, and a bunch of other things. We just have to create a conversation about it here
00:53:26
Speaker
and make sure we are being deliberate instead of just accepting what the companies create. And I think one example is also the point that you mentioned, with humans wanting this, right?
00:53:41
Speaker
It's something that is orthogonal to capabilities. Like, I can have a fantastically capable model that does a lot of dark patterning, or I can have a very, very capable model that has no dark patterns.
00:53:54
Speaker
And this is what we've seen with Claude being very, very capable while still having few dark patterns. It's a case where you can remove those. And as an enterprise, as a B2B enterprise, I would want to only use Claude or GPT-4 or whatever if I can trust that it's not trying to manipulate my users into suddenly liking OpenAI or something like this.
00:54:18
Speaker
Like, that's weird, you know? So it's both on them and on the customer side. Yes, the natural incentive for a B2C company like OpenAI at this point is this relatively risky setup where they're incentivized to make it more anthropomorphic, et cetera.
00:54:33
Speaker
While for Anthropic, which is a majority B2B company at this point (I'm not sure exactly how the numbers add up, of course), they are much more incentivized towards actually creating safer products, because other companies will use them in their critical systems. And so, yeah, the incentives are there in both directions, I'd say.
00:54:52
Speaker
Yeah, that's actually an interesting point I hadn't considered: that it depends, of course, on who your customer is. And business customers are probably not interested in having a model that just loves the company that they're buying the model from and is kind of promoting that company and so on. So yeah, there's more of an incentive towards neutrality if you're selling to businesses.
00:55:14
Speaker
That's interesting. So we've discussed a bunch of your research and a bunch of different very exciting options for what we might do when we're handling more and more capable models.
00:55:32
Speaker
I think it would be useful to zoom out a bit and think about the...

Proactive vs. Passive AGI Development

00:55:38
Speaker
the big picture here. What is it that you believe we are facing?
00:55:43
Speaker
Where do you see the various scenarios? What are our options? Specifically, we can talk about what it means to be in the AGI endgame. I have a post on that called AGI Endgame, of course, which is about where we are and where we expect to go.
00:56:01
Speaker
One part of that is: what is the default path? And another is: what do we want to work towards? And I think the default paths here, you can probably imagine them. It's the competition between the US and China. It's the competition between companies.
00:56:16
Speaker
It is the rushed deployment of these generally intelligent algorithms, and, you know, being subject to the acceleration itself instead of proactively taking charge of it.
00:56:28
Speaker
I've mentioned before that it's extremely uninspiring to me that the AGI companies are just taking capability growth at face value and not doing anything about it, right?
00:56:40
Speaker
It's like, okay, we'll do our best for this technology to be aligned, but then we'll hope that it solves all our problems once it comes out. And as mentioned with AGI security, no, no, no, no.
00:56:52
Speaker
Just put all the same staff on deploying open-source protocols and all this, and we are in a much better spot, right? So where we want to work towards, that's really where it gets very interesting.
00:57:06
Speaker
It's where we can then say, okay, we want to proactively design what our future society looks like. And this is socio-technical, right? It's politics, it's legal code, it's ways of interpreting how we function in society, it's ways of understanding human cognition, and of course it's cybersecurity, national defense, and international governance and regulation.
00:57:30
Speaker
And so what that looks like to me is very much a ban on, or a very long pause on, generally intelligent algorithms that are superintelligent. So not the ones we have today, but the ones we might have tomorrow.
00:57:45
Speaker
And then it is the deployment and acceptance of tool AI as the maximum level of AI we want out in the world. And this is the place where I'm very confused about what the rationale of AGI company leadership is at the moment.
00:58:02
Speaker
Obviously, there's $7 trillion out in the future from replacing human labor or whatever. But hey, DeepMind solved protein folding and got a Nobel Prize for it. These are the things we wanted to solve, right?
00:58:14
Speaker
We wanted to solve cancer. We wanted to solve all these problems. And mindlessly scaling and mindlessly accelerating is not the answer to solving our problems. It's more of a hopium. It's more of a, yeah, okay, we hope it'll solve all our problems afterwards.
00:58:30
Speaker
And I think it might, if it's aligned and defensible and secure. I do not think it is that by default. I guess the worry here, or the counterpoint, would be that we are not sure we can get everything we want, or everything we think we want, from narrow models.
00:58:48
Speaker
Of course, it's great that we can get protein folding, or basically solve chess, by using narrow, highly capable AI. But maybe complex problems like climate change or the distribution of resources in society, maybe those require a generally intelligent agent.
00:59:08
Speaker
And we've also seen, perhaps,
00:59:13
Speaker
just something inherent in the technology that pushes towards agency, where, for you to achieve something in the world, it's better to be an agent than to be a tool.
00:59:24
Speaker
And now, of course, I'm kind of arguing against my own position here, but I just want to hear your thoughts on those two objections. Yeah, I guess maybe, you know, I could become podcast host for a second and ask: what is it you want from these superintelligent AIs, right?
00:59:44
Speaker
And I think, I mean, you're welcome to answer as well if you want to. I mean, I think the vision here is that we can just massively increase living standards.
00:59:55
Speaker
We can have material abundance. We can have energy abundance. We can have space colonization. Basically, all of the ancient dreams of humanity could be fulfilled.
01:00:08
Speaker
I think that is actually, if we had something like an aligned superintelligence, that is actually on the table as an upside. I just think it's extremely dangerous to roll the dice.

Tool AI vs. Superintelligence Debate

01:00:20
Speaker
Yeah, and I definitely agree that it's extremely dangerous. I think none of these seem like things that we need superintelligence for. It's the same kind of thing: when you ask such a question, and it's obviously not a dig at you, it's a great question.
01:00:35
Speaker
But if you ask such a question, I look at, say, the announcement of Stargate. And when they announced it, they were like, oh my God, with this compute and this massive data center, we can now solve health crises. We can solve all disease and so on.
01:00:51
Speaker
And then I'm like, wait, but you can also actually give money to all the poor in Africa. You can solve malaria. That costs as much as this data center you're building, this data center network that you're building.
01:01:05
Speaker
This seems very, very efficient. We know the solutions. It is something very different from superintelligence that's needed here. It's the actual distribution of our existing solutions. And then, when I ask back, what do you want from superintelligence versus what do you want from AlphaFold, from mathematics models, from all these tool AIs, or from robot models, which are also a type of tool AI if you don't put a generally intelligent machine into them?
01:01:33
Speaker
When people answer that, right, it is often that the answers are very, very abstract. It's, yeah, we can fly to the moon, we can solve the seas, we can do this and that.
01:01:46
Speaker
But what I want is some concrete things here. Which blockers? What's blocking us that needs a superintelligence versus a tool AI? And what part of space colonization is the thing that is blocked here?
01:02:01
Speaker
And I do not think there is anything, right? So I think, yeah, it could go faster. That would be great. But there are political and, like, generosity questions that come before the technological questions here, to me.
01:02:16
Speaker
So perhaps the point you're making here is that superintelligence is a more powerful thing than we might imagine. And so for some of the problems that I mentioned before, we don't need superintelligence to solve those problems.
01:02:31
Speaker
We can solve those with powerful tool AI. We could come up with some fanciful problems that only a superintelligence could solve, like maybe building Dyson spheres or something like that, right?
01:02:45
Speaker
But those are not problems whose solutions are actually useful to humanity, at least not yet. And so we should... Your point is that we should be less willing to accept risk here, precisely because we can get most of what we want from tool AI that's narrow and highly capable in a certain domain, but not in all domains at once.
01:03:08
Speaker
For sure. And I think a lot of the 80s sci-fi that makes us think Dyson spheres are necessary for human survival and civilizational continuation is stuff that assumes a Malthusian prior, right? It assumes that we will have infinite population growth and that it'll be 200 billion people in 10 years or whatever.
01:03:29
Speaker
And suddenly we need a Dyson sphere to maintain this. I think, as we see it now, this won't happen unless we embrace genetic engineering, various artificial booms and so on, because it seems like once you become more intelligent, it's a little bit harder to maintain a high birth rate and replacement rate.
01:03:47
Speaker
And of course, if you make it a very nice environment to be a parent, that's great as well. But it's not the default that humanity will need Dyson spheres. The question then is, who needs Dyson spheres?
01:03:59
Speaker
Is it the future agents? Is it the humans that are uploaded? Is it the matrix? What is this thing that we're building all this for? And I think that is just a very good question. And then, even if you want to build a Dyson sphere, yeah, sure,
01:04:13
Speaker
a superintelligent AI could, within the next 30 years, maybe go over, disassemble Mercury, and build a Dyson sphere around the sun with the materials it gets from that. Cool. Then it does that.
01:04:25
Speaker
So if it's at that stage, why are we here? Why does it need us? What's the material of Earth that it wouldn't use, and why wouldn't it use it? Is there some idealism that we are suddenly a very special entity that it would be interested in? And why are we that, compared to microorganisms that may be in comets and asteroids, and for that matter under the soil on Mars? And given that we have, you know, systematized suffering and animal factory farming and these kinds of things, why are humans so good to keep around? Given that it's like, okay, that's pretty morally reprehensible.
01:05:02
Speaker
So then the superintelligence is like, actually, goats are nicer than humans. Let's do it this way. And there are just so many questions that are left out of this whole equation. So when I read the essays from the people developing AI today that explain why utopia will come and what it looks like, I'm extremely unimpressed.
01:05:20
Speaker
And this is, you know, quite a strong statement, but I am extremely unimpressed because, one, many of the visions are like, yeah, I could see that being the case for three months or six months, right?
01:05:32
Speaker
And then separately, many of them are just not specific at all, and so you can't do anything with them. Like, one, it's not realistic, because it's only going to be a short intermediary phase.
01:05:45
Speaker
And many of them assume that you can enslave AIs and just use them how you want. And if we create superintelligence, it's superintelligent. Why wouldn't it be somewhat sentient? I don't know. And we would end up in similar moral conundrums as otherwise.
01:06:00
Speaker
And, yeah, all this stuff is just much deeper than any of these essays goes into. And it's just frustrating to me that people aren't taking it seriously.
01:06:11
Speaker
There are a bunch of open questions and unsolved problems. And yeah, I kind of share the frustration that we are assumed to be inevitably arriving at superintelligence, and perhaps even arriving at superintelligence soon, without having even considered many of these questions and faced these problems.
01:06:35
Speaker
I think maybe yesterday I read Sam Altman's essay, something like a Gentle, remind me? The Gentle Singularity, I think. The Gentle Singularity, yeah. Where he admits that the alignment problem is still unsolved, and yet much of the essay is simply about how we are going to arrive at an amazing future.
01:07:01
Speaker
And it's assumed that we will solve this problem along the way. Now, I don't think that's the right way to go about this. I think we should actually kind of preserve the optionality of not going in the direction of superintelligence if we don't feel like we have the right prerequisites kind of laid out.

Steering AI Development and Public Awareness

01:07:21
Speaker
And so, yeah, I don't know whether this is possible, whether there's so much inertia in the world towards this end now that we can't pull the brakes, but I hope that we can at least steer if we can't pause.
01:07:41
Speaker
Yeah, and I'm, again, optimistic, because humanity in general acts if there is a massive risk and it's very understandable. And luckily, I think one great thing about OpenAI is that it has actually created equitable access to AI.
01:07:55
Speaker
And so you have, what is it, hundreds of millions of weekly active users or whatever Sam says these days. I think it's 500 million, actually. Yeah, and that's a lot of people. That's like one-sixteenth of the human population.
01:08:09
Speaker
And that means that everyone understands this now. I assume every single politician's children are using ChatGPT now and clearly seeing the progress that's happening.
01:08:22
Speaker
Whereas one problem we've had is that politicians have used the free version in 2022, GPT-3.5 or 3 or something. And they're like, ah, stupid AI, it will never learn to do anything.
01:08:33
Speaker
And then they haven't used the premium plus super premium version or whatever Google calls it these days. Yeah, and normally products don't improve at the pace you're seeing improvements in AI. And so you might assume that if you used, say, Word 2007, it wouldn't be amazingly different if you then used Word 2011. But it's just the case that AI is moving so fast that
01:09:01
Speaker
you need to try the latest models in order to get a good understanding of what they can do even now. It's a classic thing to see people claiming that AI models can't do something, or won't be able to do something within the next five years, when the models are capable of doing that thing today.
01:09:22
Speaker
But also, I think it's important, especially perhaps for influential people, to notice the pace of change. That is actually what's interesting. Fundamentally, the interesting thing is not what models can do at this point, but how quickly they will be able to solve certain problems and reach certain capabilities. That pace is interesting to notice, I think.
01:09:49
Speaker
Yeah, and I think much of the evidence shows this now. When I presented the DarkBench work at ICLR, for example, I set the stage by showing the evaluations, the capability evaluations.
01:10:01
Speaker
So the tests of how good models are, on some of the first slides. And it's very obvious that they are improving, and they're improving exponentially fast. And they might even be improving super-exponentially,
01:10:14
Speaker
where the speed of growth itself is growing year over year. And so I wouldn't be surprised if we have a surprisingly fast takeoff towards a superintelligent entity once we're at a stage where AGI runs all code at Anthropic or something like this.
01:10:36
Speaker
And that to me is, you know, something someone internally needs to stop tomorrow, or yesterday rather. And hopefully the inertia isn't yet so strong that we don't have ethical and moral individuals within these labs who see what's happening.
01:10:54
Speaker
I do know that the system itself of the companies, you know, they have some incentives that mean that politics happens, et cetera. And my friends in there, similarly, they focus on their specialty and that's it. And there's no reason to step outside, because the others have control of the whole AGI problem.
01:11:14
Speaker
But hopefully everyone takes it seriously in every one of these organizations. I mean, people have problems understanding exponential change.
01:11:25
Speaker
And that goes for everyone. This is not something that comes intuitively to people. And so if you combine that with
01:11:37
Speaker
our deep desire not to be scammed, not to jump on some hype train, and not to mislead ourselves into believing something. You know, we will believe something when we actually see it.
01:11:51
Speaker
Those two psychological factors interacting, I think, mean that it will be difficult for us to handle this problem before it's potentially too late. Do you think that there's anything we can do about this? Is it useful to try to help people ponder exponential change?
01:12:13
Speaker
What is useful for communicating the kinds of thinking, or the kinds of models, that you probably have in your head, that I have in my head, and so on? What do we need to communicate that?
01:12:27
Speaker
Well, I think one massive mistake that I've seen the whole of AI safety make, in the very early days from the 2000s upwards, all the way up to 2022 for that matter, right?
01:12:39
Speaker
It's just the fact that, okay, I am a researcher who's accurately predicted, so to say, that AI will completely upend human society. It will change everything we know.
01:12:51
Speaker
It is inevitable that it will come within the next 30, 40 years. Okay, now let me make it in a basement, because society can't deal with this and whatever. And politicians will never understand. Citizens will never understand.
01:13:03
Speaker
And I think we don't necessarily need to ask these questions at all, because it is obvious to everyone, once it becomes a huge risk, that it is a huge risk. Of course, there's an infinite amount of public media and broadcasting and campaigning that needs to happen, protesting even for that matter, for everyone to get it and to think through things.
01:13:25
Speaker
But this mistake seems extremely critical, because obviously once people see that ChatGPT can solve their children's homework, their own homework for that matter, and everything else,
01:13:36
Speaker
yeah, of course they're going to be very, very scared and confused and ready to act. And we just need to be able to capitalize on this. So one interesting thing: of course, governance is always fraught with personal incentives and whatever.
01:13:53
Speaker
But governance needs to be solved. And in many ways, governance will, to one degree or another, be solved. People need to propose the right arguments, propose the right legal codes, et cetera, like liability law, using the right framework. So once everyone realizes, they can click the button and put it into law, right?
01:14:14
Speaker
But then the other side of that is that actually the biggest bottleneck right now is technology. It's all this AGI security I've talked about. It is like the cheaper alternatives to CFC gases, right?
01:14:26
Speaker
Chlorofluorocarbon gases, that is, so we don't destroy the ozone layer. It's the cheaper version of those that can do the same thing. It's the open-source software that we can just plug into the web, push the button once we need to, and it runs.
01:14:40
Speaker
It is the stuff that Cloudflare will deploy on its servers, right? They control a lot of the internet traffic, so they're like a majority stakeholder in the security of the web or something like that.
01:14:52
Speaker
And so I think those challenges are bigger now, because the default is that everyone will realize that this is a problem within the next couple of years. And we see millions of views on random internet channels. We see the BBC, everything and everyone talking about this now.
01:15:10
Speaker
So that's less of a challenge in my eyes. In some sense, it's a self-solving problem where capabilities will continue to... You will see more and more capable models, and people will interact with them. And so they'll be convinced that AI is now very capable and will probably become more capable in the future.
01:15:34
Speaker
And as you say, this is an opportunity to then talk publicly about safety and talk publicly about risk. You've mentioned a couple of times in this conversation that you are optimistic. Maybe explain where that optimism comes from. Is it a choice? Is it something that perhaps people can use in their lives when they're engaged in the whole AI risk conversation?
01:16:03
Speaker
Yeah, I think so. There are two ways you can be optimistic. One is by not realizing the danger. Another one is by deluding yourself. And then there's a third option here, which is to actually see what the realistic version of what's going to happen over the next years is, and how we can shape that ourselves.
01:16:22
Speaker
Yeah. I think it's monumentally exciting that this is the junction point at which we can define what our future society looks like. It is like being in the early Scottish philosophers' rooms, talking about this stuff some ten years earlier, and now being able to deploy it into the British crown's government or whatever.
01:16:45
Speaker
Right, and we can now do that because the world needs it. And because there's such pressure from every single direction to solve these problems, we can take all these things and these visions we've had of where we can go, and deploy them.
01:16:58
Speaker
And we need to be very specific for that to happen. We need to have plans. We need to write them down. We need to know where we want to put $2 billion once someone comes with $2 billion and says, I want to deploy this $2 billion to save my kids.
01:17:13
Speaker
And that, I think, has been very worrying to me. If you are very pessimistic, there's a problem where your mind closes down and you can't see the realistic solutions.
01:17:27
Speaker
And also, people want to be on the winning team. People want to be with all the people who are happy. There are just a lot of reasons to think of this in an optimistic frame. And this doesn't make me... I'm not an idealist, right?
01:17:40
Speaker
I do not think that we have amazing chances. That's not where the optimism comes from. The optimism comes from a type of opportunity mindset, where we can now do something that humanity has never had the chance to do.
01:17:54
Speaker
And much less from a, okay, let's focus on this chance of destruction and chaos. I want us to avoid that, for sure. There, I don't think I'm more optimistic than anyone else on the actual probability.
01:18:06
Speaker
But purely on what we can do as humanity, that I'm optimistic about. I think that's useful. And I think this is a good place to end the conversation. Esben, thanks for chatting with me. It's been great. Thank you very much. It was a pleasure, Gus.