Introduction to AI Alignment and Safety
00:00:01
Speaker
Welcome to the Future of Life Institute podcast. My name is Gus Docker. On this episode of the podcast, I talk with Connor Leahy. Connor is the CEO of Conjecture, which is an organization dedicated to scalable AI alignment. On this episode, we talk about AI safety, and Connor lays out his case for how AI could become dangerous to humanity.
00:00:25
Speaker
We then discuss two potential solutions to this problem, slowing down AI development and regulating it. And we talk about why these solutions might not be enough. Here is Connor Leahy. Great. Connor, thank you for coming on.
00:00:45
Speaker
Happy to be back.
Understanding AI Safety and Control
00:00:46
Speaker
What is AI safety? How do you frame this problem? Because there are a myriad of different framings of the AI safety problem, and there is different terminology. What do you find most useful here? So over time, I've become more and more pragmatic in trying to limit the scope of what we're talking about here, because if you define things too expansively, it just brings in a lot of baggage that people like to argue about.
00:01:14
Speaker
Currently, what I usually think about is the AI control problem, or call it the alignment problem, in the sense that I want an AI system to do what I want it to do, whatever that means. An even more pared-down version of it is the problem of controlling a stronger system using a weaker system,
00:01:39
Speaker
where the weaker system in this context is a human.
AI Infohazards and Unpredictability
00:01:46
Speaker
Perhaps a good way to introduce the topic is to think about how AI could go wrong. What are some concrete scenarios? I know there's probably a massive list in your head right now, but how could it go wrong?
00:02:03
Speaker
Again, if someone asked me, Connor, could you please write down the top 10 ways I could kill a million people for under ten thousand dollars? I'd be like, well, I won't confirm or deny that I could do that, but if I could, I wouldn't do that.
00:02:16
Speaker
So there's a similar thing here, where I can give various scenarios of various concreteness. But I don't know if you've ever read this great post on LessWrong. It's a great post, highly recommend it. I think Davis Kingsley wrote it; I'm sorry if that was not the author, I don't remember. It's called something like
00:02:37
Speaker
Tylenol, Terrorism, and Dangerous Information, I think. This is one of my favorite posts about infohazards. It points to an event, I think in the seventies or eighties, and I'm probably going to get some of the details wrong, but the gist will be correct. There was a string of poisonings where an unknown person, and to this day we don't know who did it, took open Tylenol bottles in shops and put poison into them.
00:03:03
Speaker
And so several people died. This was a truly terrible event, and obviously people freaked the fuck out. And there were a bunch of copycats as well. Copycats are another interesting topic that we don't have time to talk about, but that's a whole other memetic danger topic. But the interesting thing here is that that had never happened before.
Fragility and New Threats
00:03:24
Speaker
Pills of this kind have existed for a long time. Bottles didn't have seals back then; for decades, they didn't have these seals that they have now. Nowadays when you open a bottle of pills, there's a seal you pull off, and if it's damaged, you're not supposed to take the pills. Why is that, to a large degree? Because of these poisoning attacks that happened. So
00:03:45
Speaker
there was this massive change in culture overall. And after the first person did it, there were dozens of attacks like these, all these copycats, where people tried to poison medication and things like that in random acts. This was an extremely effective form of terrorism. I think this is super scary, a super scary attack. And no one ever took credit for those attacks, which is also very strange. We still don't know what happened there.
00:04:14
Speaker
But the interesting thing is, it happened once and then suddenly it kept happening. Now that we have seals, it's much less of a problem, but it really seemed like one person came up with the idea and then it spread.
00:04:31
Speaker
And so this is a great example of an infohazard. One of the things that I've really learned as someone who has a bit of a security mindset, who was a bit of a hacker as a kid: I would try to break things and think about how to get around things. How could I accomplish silly things? How could I get around security systems? How could I do things that maybe are not legal? And so on. I never did, of course. I was a good kid, of course. Never got into any trouble.
00:04:58
Speaker
You realize at some point that a lot of things that are obvious to maybe you or me are actually not obvious to a lot of other people, and especially not to bad people. Most bad people are shockingly stupid, truly, genuinely, shockingly unintelligent. Terrorists especially are often just shockingly uncreative and unintelligent. And
AI as a Potential Disruptor
00:05:24
Speaker
giving them ideas might seem harmless. You think, well, I've already come up with this, so surely someone else could too. But often they can't. So I'm not going to give my most likely scenario, because my most likely scenarios involve human actors doing stupid things. I can think of concrete human actors where I'm like, yep, they would do that, they're definitely dumb enough to do that. But let me give you
00:05:52
Speaker
the general class of scenarios. There are scenarios in my mind about superintelligence and takeoff and nanotech and all that crazy stuff, but none of those are my mainline. I truly think those are actually distractions from what the minimum viable catastrophe looks like. For me, the minimum viable catastrophe looks much more like this:
00:06:22
Speaker
if you talk to people who work in intelligence services or in security, and you talk to them about how operations actually happen and how you defend against things. Recently I was talking to a senior person in government, and I'm going to give a concrete example because this is a fixed problem. They have solved this problem, so it's no longer unsafe to talk about.
00:06:46
Speaker
This is five or ten years ago. They were in a government office, I'm not going to say which government or which person, and they were talking to several high-ranking officials, I think the head of state was there as well. And they looked at this big window, this massive open window, and asked: what's stopping some teenager from flying a drone strapped with dynamite through that window and killing all of us? Nothing. At first they all made excuses, like, oh, that couldn't happen, blah, blah, blah, but it turns out, nope.
00:07:16
Speaker
No one had any protection against that. No one had thought about it. Anyone could have done that. By the way, they do have anti-drone defenses now; that's why I'm saying it's not a problem anymore. It's still a threat, but they're aware of it. This was five or ten years ago.
00:07:38
Speaker
One of the fundamental parts of my model about AI danger is that the world is unstable. Actually, unstable is the wrong word; it's fragile. It's very resistant to small or medium shocks, but it is not at all resistant against big shocks. I don't think we've actually seen a big shock since World War II. And even World War II,
00:08:02
Speaker
it's a big one, but it's not the biggest you could imagine. There have been bigger ones in history, such as the Black Death, which I would consider an even bigger shock than World War II. But since then, I don't think the world has actually seen a truly big shock. Even the 2008 financial crisis was a medium at most. Sure, it was bad. And COVID, also medium. If you look at the historical context, a 2000-year context, COVID was about as bad as the Spanish flu; it wasn't as bad as the Black Death.
00:08:32
Speaker
We've had bigger ones that humanity got through just fine. So the world, I think, is fundamentally fragile, in that humans generally build defenses against, well, it's again the black swan thing: we trade volatility for blow-up risk. We cushion things enough
00:08:49
Speaker
that, from day to day, mostly everything's fine. Day-to-day volatility gets evened out, no problem. Year to year, it's mostly okay: some war happens in Africa, some terrorism in the Middle East, but for the most part, if you live in the Western world or in Tokyo or something, you're fine. You don't notice it.
00:09:10
Speaker
We don't buffer well against decade or century level events. An example of that: COVID hit and gave us a warning shot, as far as I'm concerned, of what a really bad pandemic could look like, a Black Death level pandemic. And governments are now cutting back on their pandemic spending and striking down bills to prepare for future events. Hmm, suspicious. So the world is very unstable. I think it is at the point where, well,
00:09:40
Speaker
I was talking to a senior official at a government who also has a lot of connections to the security services. And we were talking about this: what holds the world together? Why don't people do all these crazy attacks we were coming up with?
00:09:59
Speaker
Is it simply because people are too nice or too good? Most people are either nice or good or incapable, and so we're relying on the vast majority of people simply not attempting these horrific attacks? Exactly. I mean, it's A, goodness.
00:10:15
Speaker
Goodness is just that most people don't actually want to hurt other people. Not really, or at least not randomly. Most people want society to be stable. Most people want other people to be healthy and happy. Maybe you'll hate one guy and be like, fuck this guy, I want to kill this guy, but it's very rare for people to actually want to kill many people, or to harm or maim many people. That's actually rare. It definitely exists, but it's quite rare.
00:10:43
Speaker
And then the other one is agency, intelligence, optimization. We're actually pretty well protected against people who want to maim and torture, because they're not smart. They land in prison, you know? Low-IQ sociopaths. There's this meme that sociopaths are these suave, intelligent, manipulative vampires.
00:11:08
Speaker
No, those are just the only ones you see. Most of them just land in fucking prison and you never see them again, because they're just complete psychos. They start torturing animals as kids, start beating women by the time they're 14, start robbing people by the age of eight or whatever.
00:11:27
Speaker
If you really want to see the dark world as it exists, you should look up juvenile psychopathy. It's actually shocking. What I think a lot of people who don't work in law enforcement or psychiatry aren't aware of is just how bad it is. How there are some people who are just so
00:11:44
Speaker
incorrigibly evil, truly incorrigibly evil, that there is just nothing you can do but throw them in a cage and leave them there, because they just cannot be rehabilitated. This is really rare, a small percentage of the population, but they do exist, and ignoring that these people exist is, I think, actually very dangerous, because it gives you an inaccurate model of how reality works.
00:12:04
Speaker
And so the point here would be that AIs would be different, because AIs would be capable, and perhaps they would not share our human reluctance to hurt other people. Exactly. Imagine this hypothetical.
Moral Constraints in AI Systems
00:12:22
Speaker
We have a system, an AI system. Now, it's not really smarter than humans. No superintelligence. It doesn't even have to have human-level intelligence; not even human level, it's not that smart. But it's pretty smart, like 90 IQ, 100 IQ. It can do some thinking, it can do some planning. But also, it's read every book ever written, it has perfect memory, and it can be run at superhuman speeds in many copies in parallel.
00:12:46
Speaker
And you put this in the hands of... maybe this thing can't do planning very well, or it has some problems or whatever, but then you give it to a human or a group of humans. Now suddenly, imagine you had access to a group of perfectly loyal sociopaths: they never snitch, they never get tired, they never break, they never turn, they'll do anything you want. They're not evil per se. AIs wouldn't be evil per se. They wouldn't be sadistic,
00:13:16
Speaker
but they would be sociopathic. They would have no qualms. If you said, hey, maximize my chance of becoming president, and they calculate, oh, assassinating him is a good idea, why would the AI hesitate? It's just optimizing a goal function. So when humans think about
00:13:34
Speaker
optimizing for goals, it's extremely implicit that we have these constraints, that there are taboo areas we don't even consider. If you want to become president, you wouldn't even think of killing the president, because you'd be like, no, I'd never do that. You could also argue on consequentialist grounds that it wouldn't work or you'd get in trouble, blah, blah, blah. But to a large degree, you also just wouldn't do that. Purely deontologically, I would just not do that. But if you have a system that's only optimizing, well,
00:14:04
Speaker
So the danger I see from AI in the short term is not
00:14:12
Speaker
superintelligence or emergent things. Those might also happen, but those are strictly worse scenarios than the scenario I'm describing. The scenario I'm describing is the least bad scenario that still ruins everything. What I'm thinking about is systems that are perfect sociopaths that are just optimizing. There's a great post on LessWrong called Optimality is the Tiger, and Agents are its Teeth, which is related to this as well.
00:14:39
Speaker
These can be systems that are not agents. It's not necessarily that they're interacting with the world or whatever. These can just be very simple AI systems that are very intelligent, just optimizing for something. And to optimize for that something, they might just coldheartedly conclude: well, I've simulated
00:15:02
Speaker
how to kill the president. First, run this piece of code on an internet-connected device. And then that spawns some kind of agent, some kind of agentic process, that takes whatever actions in the environment are necessary to assassinate the president or whatever.
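A tiny sketch of the point about missing constraints (my framing, not Connor's): a pure argmax optimizer has no notion of taboo actions unless they are explicitly encoded, whereas human planning implicitly discards them before ranking anything. The action names and numbers below are made up for illustration.

```python
# Toy illustration: implicit human constraints vs. a pure goal optimizer.
ACTIONS = {
    # action: (estimated probability of winning the presidency, is_taboo)
    "run a better campaign": (0.30, False),
    "negotiate an endorsement": (0.25, False),
    "assassinate the incumbent": (0.60, True),
}

def pure_optimizer(actions):
    """Maximizes the goal function; taboos simply don't appear in the math."""
    return max(actions, key=lambda a: actions[a][0])

def human_like_planner(actions):
    """Humans implicitly discard taboo options before ever ranking them."""
    allowed = {a: v for a, v in actions.items() if not v[1]}
    return max(allowed, key=lambda a: allowed[a][0])

print("pure optimizer picks:    ", pure_optimizer(ACTIONS))
print("human-like planner picks:", human_like_planner(ACTIONS))
```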
00:15:19
Speaker
So in the short term, you're mostly worried about these tool AI systems, GPT-3 would be an example, used by humans with goals that conflict with those of broader society. So say, for example, we want to... No? It's worse than that. Okay. That's still one of the positive scenarios.
00:15:42
Speaker
There are even worse scenarios than that. That is a scenario, it's the minimum viable thing, the one most people agree with. I'm like, hey, imagine you have
00:15:52
Speaker
some intelligence service from a hostile nation, and they have access to a group of never-sleeping sociopaths. They'll be like, oh shit. This is the existence proof that the world is unstable. Every single person I've talked to from intelligence services or security, if I told them, hey, imagine your adversary had 100 perfectly loyal sociopaths that will do anything and are also as smart as von Neumann, how would you defend against that? They would be like, we're fucked.
00:16:20
Speaker
There is no defense against that. That is insane. So this is the existence proof that the world is unstable. Now things get even worse. I don't even think you need a human. My default outcome scenario doesn't even involve a human doing this. If we get to that scenario, we're already in one of the better timelines. My mainline prediction is that we die before we get to that scenario. What's going to happen instead is, as we build
00:16:48
Speaker
systems that continue to increase in intelligence and generality, we have them training themselves in the environment, gathering new data from the internet, playing video games, simulations, whatever. DeepMind just released that paper on an agent that teaches itself how to collect diamonds in Minecraft, that kind of stuff. And we just scale it up, scale it up. Oh, suddenly it has art and language and doesn't use hand axes anymore. And suddenly,
00:17:19
Speaker
I don't know, something weird happens. My true prediction, which is at a higher shock level, which might make sense to you or the audience but doesn't work as well on government officials, is: and then something weird happens. We have these systems that no one intended to do anything in particular, the same way evolution didn't intend for humans to develop culture, didn't intend for people to develop any particular ideology or preferences or opinions or whatever.
00:17:48
Speaker
But as they become more powerful, they're interacting with their environment, they have these discontinuous capability gains, those sums of S-curves, whatever. To be clear, I think all of this will look completely smooth from the perspective of loss. If you look at the loss graphs of these things, I think there will be no anomalies. I do not predict any anomalies.
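As a toy numerical illustration of this point (my own, not from the conversation): per-token loss can fall smoothly while a task that requires many correct tokens in a row flips from near zero to high accuracy within a few checkpoints. The numbers below are made up; only the qualitative shape matters.

```python
# Smooth loss curve, discontinuous-looking capability: a toy example.
import math

for step in range(0, 11):
    loss = 2.5 * math.exp(-0.6 * step)   # smooth, unremarkable loss curve
    p_token = math.exp(-loss)            # per-token success probability
    p_task = p_token ** 20               # task needing 20 correct tokens in a row
    print(f"step {step:2d}  loss={loss:.3f}  "
          f"per-token acc={p_token:.3f}  20-token task acc={p_task:.3f}")
```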
00:18:09
Speaker
I think things will go totally smoothly as predicted. And then at some point, the difference between GPT-4 and GPT-3 on loss is not that massive, but it can do a bunch of crazy new things that GPT-3 can't do. I think that's just going to keep happening. And then people are going to start using them for benign tasks. People are going to start automating writing:
00:18:33
Speaker
clerical work, coding, stuff like that, all this benign stuff. And I think this is all going to be completely benign until, very suddenly, it isn't. Very suddenly we have these systems that start taking actions, and we don't really know why or what they're doing, but we're like, eh, it's fine, use some RLHF, it's fine.
00:18:59
Speaker
And the term you just used, what does that mean? Sorry, reinforcement learning from human feedback. This is a commonly used technique at the moment, which I have strong technical disagreements with. Basically, the idea is you take language models or
00:19:15
Speaker
systems like this, and you train them to optimize a model of what humans like. Humans look at various outputs of the model and rate them, thumbs up, thumbs down, and you make the model output the thumbs-up ones more and the thumbs-down ones less, sort of. This is sometimes touted as an alignment technique. I do not think that is a very fair description of what the technique actually does.
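To make the thumbs-up/thumbs-down recipe concrete, here is a minimal toy sketch in Python of the general idea described above. It is not OpenAI's actual pipeline: the "language model", the stand-in human labeller, and the update rule are all hypothetical simplifications (a real setup uses a neural reward model and an RL algorithm such as PPO).

```python
# Minimal RLHF-flavored sketch: label samples, fit a reward model, nudge the policy.
import random

# Toy "language model": a distribution over a handful of canned replies.
CANDIDATES = [
    "Sure, here is a helpful answer.",
    "I refuse to answer that.",
    "lol whatever",
    "Here is a detailed, polite explanation.",
]
policy = {c: 1.0 for c in CANDIDATES}  # unnormalized preference weights

def sample(policy):
    total = sum(policy.values())
    r = random.uniform(0, total)
    for text, w in policy.items():
        r -= w
        if r <= 0:
            return text
    return text

# Step 1: collect human thumbs-up / thumbs-down labels on sampled outputs.
# Here a stand-in "human" prefers polite, helpful-sounding text.
def human_label(text):
    return 1 if "help" in text or "polite" in text else 0

# Step 2: fit a reward model on those labels. In real RLHF this is a neural
# network; here it is just a lookup of the empirical thumbs-up rate.
def fit_reward_model(samples):
    counts, ups = {}, {}
    for text, label in samples:
        counts[text] = counts.get(text, 0) + 1
        ups[text] = ups.get(text, 0) + label
    return {t: ups[t] / counts[t] for t in counts}

# Step 3: nudge the policy toward outputs the reward model scores highly.
# This stands in for the RL (e.g. PPO) step; it only captures the direction
# of the update, not the actual algorithm.
def update_policy(policy, reward_model, lr=0.5):
    for text in policy:
        reward = reward_model.get(text, 0.0)  # unseen outputs get no boost
        policy[text] *= (1.0 + lr * (reward - 0.5))

data = []
for _ in range(200):
    s = sample(policy)
    data.append((s, human_label(s)))

rm = fit_reward_model(data)
for _ in range(10):
    update_policy(policy, rm)

print("Reward model scores:", rm)
print("Policy weights after updates:", policy)
```

The point Connor makes next is that the only thing this procedure actually produces is a change to the model's parameters, with no guarantee about which goals got encoded along the way.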
00:19:39
Speaker
Because the problem here is that you don't really know which goals you're encoding in the agent. You don't know how the AI model is understanding your thumbs up or thumbs down. Exactly. This is encoding human preferences in the ontology of weight diffs, which is just such an alien way of doing it. I show you a list of 175 billion slightly changed floating-point numbers.
00:20:06
Speaker
And I'm saying, okay, cool, here are your preferences, encoded. Super legible. Well, no, of course not. What the hell am I supposed to do with this? This is not an ontology I understand. Why would you expect your ontology to fit into this? It's like some alien comes down from space and says, ah, I've been looking at you guys, you guys really like florval, don't you? And you're like, what the fuck is florval? And he's like, yeah, I got you guys. And then he goes off to do whatever that is. That's kind of what RLHF is like.
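To illustrate what "preferences encoded in the ontology of weight diffs" looks like in practice, here is a small illustrative sketch (mine, not from the episode). A tiny network stands in for a pretrained model, and a random perturbation stands in for a fine-tuning update; all you can directly inspect afterwards is a long vector of slightly changed floats.

```python
# What a "weight diff" actually is: a huge, illegible list of changed floats.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a pretrained model (a real LM would have billions of weights).
base = nn.Sequential(nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 512))
tuned = nn.Sequential(nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 512))
tuned.load_state_dict(base.state_dict())

# Stand-in for an RLHF / fine-tuning step: nudge every weight a tiny amount.
with torch.no_grad():
    for p in tuned.parameters():
        p.add_(1e-3 * torch.randn_like(p))

# The "encoding of the preference change" is just this list of numbers.
with torch.no_grad():
    diff = torch.cat([(pt - pb).flatten()
                      for pt, pb in zip(tuned.parameters(), base.parameters())])

print(f"{diff.numel():,} changed parameters")
print("first few entries of the diff:", diff[:5].tolist())
print("mean |delta| =", diff.abs().mean().item())
```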
00:20:36
Speaker
So you can't understand what you've encoded by looking at the weights of the network. And there's a research paradigm trying to interpret these weights and extract what information is encoded, but it's definitely lagging behind progress in the models themselves.
00:20:56
Speaker
Exactly. Exactly. So there's a fun little experiment we did at Conjecture, where we looked at two different models that were trained slightly differently. This is not actually RLHF, we thought it was RLHF, but it turns out it wasn't, at least assuming OpenAI is telling the truth. Basically, we looked at two GPT models. With one of the models, if you asked it for a random number and you looked at the output probability over the digits, it was pretty random, actually.
00:21:27
Speaker
Not perfect. 42 was slightly more likely than others. But overall, it was a pretty smooth distribution. But then if you look at this other model, this instruct model, and you asked it for a random number, it would put almost all its probability mass onto two numbers. It had two favorite numbers.
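A rough sketch of how one could run this kind of probe today, using a small open model (GPT-2 via Hugging Face) rather than the OpenAI models Conjecture compared; the prompt and token handling are illustrative assumptions, not their setup.

```python
# Probe a model's next-token distribution over numbers after a "random number" prompt.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "Pick a random number between 1 and 100:"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # [batch, seq_len, vocab]

# Probability distribution over the next token after the prompt.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)

# Collect the probability assigned to each number 1..100 (as " N" tokens,
# since GPT-2's BPE typically prefixes numbers with a space in this context).
number_probs = {}
for n in range(1, 101):
    token_ids = tokenizer.encode(f" {n}")
    if len(token_ids) == 1:  # only single-token numbers are directly comparable
        number_probs[n] = next_token_probs[token_ids[0]].item()

# A heavily skewed distribution (a couple of "favorite numbers") would be the
# kind of quirk described in the conversation.
for n, p in sorted(number_probs.items(), key=lambda kv: -kv[1])[:10]:
    print(f"{n:>3}: {p:.4f}")
```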
00:21:46
Speaker
So this is really interesting. Here's what I interpret is going on. I don't think OpenAI tried to tell the model to like these numbers. No, I don't think that's what happened at all. But I expect what happened is just that they were training it on
00:22:02
Speaker
being useful to humans, just showing it a bunch of examples, whatever. And for some reason, something about the thumbs-up data had some weird correlations or weird connections somewhere that just made it really like these numbers, really upvote these numbers. This was not intentional. Now, you might think, oh, it has favorite numbers. That's
00:22:25
Speaker
kind of funny, kind of harmless. And this has been solved in newer models; they have much more data and they don't have this problem. So it's more of an interesting anecdote. The interesting thing here is not the exact example, it's: what else are they encoding?
00:22:41
Speaker
What else is in those weight updates? Who knows? We don't know. It's an example of how seemingly random goals could arise in an AI model. And suddenly the model starts acting in a way that looks weird to us, but it's because we've encoded some goals that we don't understand. And then the AI safety problem, the alignment problem, the control problem, is
00:23:08
Speaker
when these goals begin diverging from the goals of humanity, or from the goals of the lab or company developing the AI. That is where things go off the rails. The way we've been talking about it here, with the framing of the fragile world,
00:23:25
Speaker
I think it's a great way to frame it because it underlines how serious this problem is and perhaps how intractable it is. Do you think AI safety can be solved by humanity given the fact that we have
00:23:42
Speaker
a mixed record of containing dangerous technologies. We have extremely strict standards for nuclear energy production and for biological laboratories, and still there are accidents; these have happened. So if the security has to be extremely high and the failure rate extremely low for this to succeed, can we succeed?
Challenges in Solving AI Safety
00:24:10
Speaker
There's no law of physics that forbids us from solving the problem, building an aligned superintelligence, and having a wonderful future. There is nothing whatsoever that forbids this from happening. This is completely allowed to happen by physics. This is a path that is open to us, and we have to be clear about this. It's quite interesting: we live in a timeline where we have not yet obviously lost.
00:24:40
Speaker
We're not in a timeline where everything's out of control, where AIs are already spinning up and taking over governments, or weird sociopaths are in power or whatever. That's not the case. The paths are clearly still open. We can still win. But we have to be realistic here. The truth of the matter is, I don't expect us to solve it. No. I don't expect us to rise to the challenge. In most timelines, I expect us to fail. This is a type of problem
00:25:08
Speaker
that humanity is especially bad at. I sort of call it a level-two epistemology problem. This is not a normal scientific problem where we can iterate over and over and failure isn't catastrophic, where it's not a problem if we don't get everything right on the first try, and people are mostly on the same page. That's not where we're at. Not at all.
00:25:37
Speaker
This is a much, much, much harder problem. This is a problem where if we get it wrong on the critical first try, that's it. No iteration. We might be able to iterate on proto versions of it, but this is less like the nuclear bomb and more like the nuclear bomb if it could ignite the atmosphere. And there was the thing at Los Alamos where they weren't sure
00:26:02
Speaker
whether it was going to ignite the atmosphere. There was a possibility that it could ignite the whole atmosphere, and they did some crazy mathematics, like three days before the test, and it still gave them something like a 30% chance it might ignite the atmosphere. And they still did it. Imagine being in that room and being like, well, all right, it's only 30% that we kill everyone forever. But I mean,
00:26:29
Speaker
the general is shouting, so let's do it anyway. This is the state of mankind. Imagine being in the state where you can simultaneously believe there is a 30% chance of the world ending, and also: the scary alpha ape is yelling at me, so I should do it. That's the state of mankind. Or perhaps you are too curious not to do it. That would be even more sinister in some sense.
00:26:57
Speaker
I think for a lot of scientists that was the case, for a lot of them. There's this truly darkly funny quote, I don't remember it exactly, but it's from von Neumann, where he says something like: the goal for scientists is to do science, no matter the consequences. It doesn't matter if it causes harm, it must be done, because we must do it. And I'm like, what? So in my personal life,
00:27:24
Speaker
one of the mantras I live my life by, one of the things I like to repeat to myself to orient myself through life, is the saying: don't be stupid. Now, this is a very different saying from "be smart." Because being smart is hard. I can't guarantee I'm smart. So, dear listener, I cannot guarantee to you that any of the things I'm saying are smart. I can't guarantee that I'll be smart in all scenarios.
00:27:49
Speaker
But I'm pretty good at not being stupid. I'm not perfect at it; I'm still stupid sometimes. But if you actually orient yourself around recognizing when you're being stupid and just not doing that, you can already outperform everyone by a crazy margin. Because most people do stupid things all the time. It's crazy how much time people spend just loading their guns and shooting themselves in the foot over and over again.
00:28:17
Speaker
And the same goes for humanity on a civilizational level, you think.
00:28:21
Speaker
Yep, absolutely. If humanity would literally just stop shooting itself in the foot. I'm not even saying become enlightened and ascend into a truly beautiful society, which, by the way, is also possible; everyone's just being stupid. But even if we would just not be stupid and not shoot ourselves in the foot over and over and over again, if we could just do that,
00:28:50
Speaker
man, would I feel better about the future. But prediction markets are still illegal in the US. We are shooting ourselves in the foot so hard every single day, all the time. It's just truly maddening. It's like if the alien came down from space and was like, ho ho,
00:29:10
Speaker
what the fuck did you guys do? What would it mean for humanity to not be stupid in terms of AI safety? What should we do? What is the scenario? How do we survive this if we were to not be stupid? That's a great question. I'm going to give an answer for what we would do if we were not stupid, and what we would do if we were smart. That's what I want to hear. Yeah. So first answer: what we would do if we were not stupid. Everyone would just kind of look at each other and go, huh,
00:29:37
Speaker
this AGI thing, you know, seems like it might ignite the atmosphere, huh, guys?
Slowing AI Research - Feasibility and Challenges
00:29:42
Speaker
Yep, we should probably not do it, right? Yep. And so they didn't. That would be the non-stupid version. Going back to Alan Turing, people have predicted this. This is not some crazy new thing; it's obviously predictable. Alan Turing already predicted all of this, basically. And, you know,
00:29:59
Speaker
we could have actually taken it seriously and acted on it. So let's... sorry, but let's pause here for a second. Do you think it's actually possible to slow down progress in AI capabilities research? Is it likely that we would coordinate?
00:30:19
Speaker
Or won't we just hand off the torch to less scrupulous companies or labs? If, say, DeepMind and OpenAI began collaborating on slowing down progress in their research, wouldn't the next most advanced company then just take the lead?
00:30:42
Speaker
So the original question you asked me was: what would we do if humanity stopped being stupid? You're now asking: okay, what happens if we continue to be stupid? Because obviously, if we were not stupid, the other labs would just also not do it. So those are two different questions, I just want to make that clear. So you're asking, okay, what do we do if we continue to be stupid? I can answer that question, I have many models here. I expect us to continue to be stupid, to be clear. So assume we continue to be stupid.
00:31:13
Speaker
So there's two parts of this, I guess, two main parts I'll put into this. First part is, will others catch up? How much will they catch up, et cetera, et cetera? Second, does this question even make sense? First answer.
00:31:26
Speaker
I actually genuinely think that no one other than the top labs is relevant. I don't think China's relevant at all. I think it's a terrible meme. They're not going to catch up. They're so far behind. Who cares? This is a really insidious, insidiously bad meme. It's relevant for longer timelines. If we're dealing with longer timelines, I mean,
00:31:53
Speaker
I recommend everyone just start solving politics, because, oh no. If we're in short timelines, it doesn't matter as much. So, China, Russia, whatever. Russia obviously doesn't matter; they obviously don't have the capabilities. And people have this really weird orientalist mindset around China, this boogeyman that has all these capabilities. If you actually look at China,
00:32:19
Speaker
really look at it, at how science is done there and the bureaucracy and so on, China is one of the worst places in the world to do science. Doing science in China is a nightmare. It's so bureaucratic, it's so slow, and ideology suffuses every part of it.
00:32:39
Speaker
So much of it is marketing: the Chinese government markets itself as being good at science, super, super hard. And they have a huge population, which includes lots of brilliant people who are succeeding despite the Chinese government.
00:32:52
Speaker
To be clear, a lot of the smartest people in the world are Chinese. There are a lot of truly brilliant Chinese people, and they succeed despite the Chinese government, not thanks to it, just because of the perseverance of the human soul and mind. These people can do great work. And most of them go to the US to do their good work, because if you're a smart, brilliant Chinese student or whatever,
00:33:17
Speaker
why would you want to stay there with all this bureaucracy and politics and all this bullshit? If you go to the US, you make tons of money and you can do your research much better. So if we can rule out China taking the lead if the US were to stop advancing AI capabilities, what about, say, the second tier of American AI companies, perhaps Facebook or Google Brain or something like this?
Competition in AI Development
00:33:44
Speaker
Yeah, so there are several factors that go into that. One is that they don't actually want to kill everybody, and they are mostly not insane. They might not be super convinced, and they might be normal amounts of irrational, but they're not politics amounts of irrational.
00:34:08
Speaker
You can talk to them. You can just talk to Mark Zuckerberg. He is weird and whatever, but he is a person you can talk to. And also, these are all people who do listen to the US government. If the US government said, no more AGI, there would be none.
00:34:30
Speaker
Of course, that's not what would actually happen. My actual model is more that government takes over and starts ruining everything, because governments aren't capable enough to do the actually non-stupid things. So this brings us back to not being stupid. In the real world, this is not impossible. Governments have levers to do this. It is a thing that is within their jurisdiction and
00:34:53
Speaker
power, but I think it's not within their executive ability, at least not on short timelines, or at least not effectively. I hope I'm wrong about this, and this is something we can figure out. Actually, it's worse than that. I'm very skeptical about governments trying to slow down AGI.
00:35:14
Speaker
I think this is not a good idea, or, well, it's tricky. The reason I think this is because of my model of governments and how incompetent and internally inconsistent they are. In a sense, having labs in control is not ideal, but they are more concentrated points of coordination. It's easier to coordinate with DeepMind than it is with the US government. And
00:35:42
Speaker
if we have timelines longer than five years, maybe even on a five-year timeline, it's not going to matter, because governments are going to get involved. Obviously so. There's no way intelligence services are just going to let this state of affairs go on. If we have 10-year or 20-year timelines, no way. I mean, I'm pretty sure all these organizations are already infiltrated by intelligence agencies.
00:36:07
Speaker
I wouldn't be surprised if intelligence agencies were listening to us right now. I've been cited in a CIA paper. Did you know that? I didn't know that. I have, yes. That's probably fine, right?
00:36:24
Speaker
One thing that government, especially the US government, is very good at is reacting to national threats. The only thing that makes the US government get its shit together is when the Pentagon says this needs to get done; then the US government can actually do things.
00:36:40
Speaker
If this continues to babble along as a futurist-nerd, maybe-economics kind of thing, then I don't expect governments to do much. If this becomes a national-threat type scenario, then I expect the behemoth to lumber into action. And I expect it to break everything by default, unless we somehow
00:37:03
Speaker
direct this and help the behemoth make a non-stupid choice. I don't think it can make a smart choice; I don't think it's genuinely possible for governments to be smart, it's just too complicated. But I think it's possible to make governments be not stupid. And would that be by a piece of legislation governing how governments would govern AI? Or what are you thinking of in terms of helping governments not be stupid? The truth is that the government is composed of people.
00:37:32
Speaker
It's composed of lots of civil servants, politicians and bureaucrats and generals and intelligence service agents, and so on. These are all people, and these are all people you can talk to.
Government Role in AI Alignment
00:37:46
Speaker
I've been surprised by how much, you know, I've just contacted some government officials and they were quite happy to talk to me.
00:37:54
Speaker
And I could just answer their questions, and that was quite nice, and I'd like to do more of this. This is something I've updated on very heavily over the last year. Again, trying to make these people act smartly in this scenario at scale isn't going to work; maybe individuals can. I've met some pretty smart people in government, actually. Some not. But you can't do that at scale. What you can do is help people be not stupid, because it's also in their interest to not be stupid.
00:38:24
Speaker
So this is, for example, one of the things that I am doing. If any of your listeners out there are working in government, intelligence services, whatever, and you're trying to understand these problems better and how to do this, my email is open. Please let me know, I would be happy to help. I think it's important to inform people, to give people better models of these kinds of things and how to reason about them.
00:38:50
Speaker
I don't think there'll be one piece of legislation that fixes all of this or whatever. I think the way government really works is not like that. It's a lot more backroom deals, dinners, who knows who, who owes who a favor, what are the lines of various parties, and whatever. And the ideal thing I would want is just to have
00:39:14
Speaker
everyone kind of be on the not-stupid side. Just be like, yeah, let's not be stupid about this, let's take this seriously, let's fund alignment research, Jesus Christ, please. Let's just have DARPA push $10 billion into alignment research. Why not? For a government, yeah, they can make this happen.
00:39:36
Speaker
The number of people that need to sign off to get a hundred million dollars onto a project is shockingly low. Depending on the government, it could be like two people.
00:39:49
Speaker
Let's just do that. There are maybe 200 people in the whole world working full-time on alignment. If governments even just said this is a national priority, even if they didn't put funds into anything, massive amounts of academia would just lumber into motion.
00:40:08
Speaker
Because it becomes high status, it becomes the thing to be working on. It becomes a legitimate, real scientific problem, rubber-stamped: you are a high-status person for working on this.
00:40:20
Speaker
Cool. And just that would cause, I think, a massive shift in how many people take this problem seriously and how seriously academics take it. There is a massive risk in doing all these things, because whenever you get lots of people involved, politics gets important. The ideal world would be, of course: oh, turns out alignment is super easy, we just solve it in our basements, here it is, and everyone's happy.
00:40:48
Speaker
I don't expect that's how things are going to go. So if we're realistic, if we inject some reality here, then what we need to do is accept that people will get interested. This is of interest to everyone: governments, intelligence agencies, academics, everyone. Let's help people not be stupid. Let's talk to them. Let's be friendly.
00:41:10
Speaker
Let's say that DARPA began funding alignment research to the tune of $100 million or whatever, pick a number.
00:41:21
Speaker
Could this be counterproductive, because the money would be used to increase capabilities? Isn't that a pretty live danger? For example, critics of OpenAI will tell a story in which OpenAI was founded with safety in mind but then increased AI capabilities and thereby increased AI risk. And perhaps the same thing could happen with government grants for alignment research.
00:41:51
Speaker
Of course, this is the default thing that happens. Everyone's stupid, remember. Being smart would be: the government nationalizes everything, creates the alignment Manhattan Project, solves the problem, we're done. That's the smart solution. We're never going to get that. Don't even think about that. It's not possible.
00:42:08
Speaker
Yeah. So it's not that interesting to discuss what we would be doing in a very smart world. So we're trying to avoid being stupid instead. Yes. Okay, so I'm thinking about Pareto improvements here, in the not-stupid frame. We are currently in a world where there is no government funding for alignment. This is stupid. It's not just not smart, it's also stupid. If we had, okay, $10 billion in alignment research and it all goes to
00:42:38
Speaker
something irrelevant or something dangerous, I'm like, okay, that's not good. But this is a less stupid world, and I expect a world that has already gotten to this point will be more amenable to interventions that help safety and alignment than the one we're currently in. Of course it can backfire. Yes, obviously, anything involving lumbering behemoths tends to involve
00:43:01
Speaker
blowback risk. When a giant monster is attacking Tokyo and you summon Godzilla, Godzilla could probably beat the monster, but there's going to be a lot of reconstruction costs by default, even if Godzilla is the good guy. How could intervention go wrong here? What are ways in which the US government could break things?
00:43:20
Speaker
Oh God, do I even want to give them ideas? Plenty of things. Obviously, as you just said, the obvious one is funding capabilities work. The second one, which I think is extremely likely to happen, is that they only fund military applications. They don't actually fund safety, they just fund: okay, how do we maximize military applications? And that obviously kills you.
00:43:45
Speaker
Another one, I actually talked to someone about this a while back. He said something very interesting. I was talking about interpretability research with them, which is a very common topic in AI alignment research. He basically said,
00:44:00
Speaker
So he worked for the military-industrial complex, and he said that the number one thing currently holding the military back from deploying AIs at wide scale is lack of interpretability and accountability. So every increase in interpretability increases the military adoption of AI.
00:44:20
Speaker
This is, I think, something that a lot of people in the safety world do not consider when they consider the cost-benefit analysis of interpretability research. I do. I still think it's worth it. But it is something that you should have in your calculus. So these are some obvious ways that the government can mess it up. Another obvious way they can mess it up is to politicize it. It becomes a left-wing, right-wing, red-blue team issue. That's another way things can be stupid.
00:44:49
Speaker
The non-stupid thing is: well, obviously this is not red or blue. It's in all of our interest to be able to control our technology. No one benefits from this not being the case. So, you know,
00:45:05
Speaker
let's just not be stupid about that. Unfortunately, this is the kind of thing humans tend to be really stupid about, like climate change or whatever. Let's go back to the question of government grants for AI alignment research. Another thing that could go wrong here is that AI alignment research becomes a buzzword and comes to mean something other than what it originally meant.
Funding and the Risk of Misallocation
00:45:27
Speaker
And it becomes a way to attract funding to your existing projects, and so on. Is there a way to avoid this? Is there a way to be strict about what you're trying to fund, without it drifting into becoming too broad and coming to mean something else? No, that would involve being smart. Is there any hope here? What could we do to constrain what we're trying to fund?
00:45:55
Speaker
Oh, there are lots of marginal things you can do here, but actually I think there's a massive mistake that a lot of funders have been making. This is a genuine critique I have of EA and AI safety funding: they are extremely risk-averse.
00:46:11
Speaker
They bill themselves as, oh, we're funding the crazy stuff no one else is funding, we're willing to do things, whatever, but they're not, actually. DARPA is way less risk-averse than, say, Open Phil. Which is understandable: DARPA has way more money. So it's understandable that Open Phil would be more conservative, because they have much fewer resources.
00:46:37
Speaker
DARPA can do all kinds of crazy things. When DARPA funds things, they fund a lot of crazy, stupid bullshit. But if Open Phil funds one person that turns out to be controversial or does something stupid or whatever, then people are like,
00:47:01
Speaker
you're using my donated money to fund this guy? That seems like an inappropriate use of funds. No good deed goes unpunished. If DARPA funds some guy making an invisibility cloak or whatever, well, it's DARPA, whatever, they do weird military tech stuff that doesn't work.
00:47:21
Speaker
But if a philanthropic organization, say the Bill & Melinda Gates Foundation, funds one company that turns out to be shady or whatever, everyone is of course immediately on their case. No good deed goes unpunished. It's really funny how,
00:47:34
Speaker
if you're a rich billionaire and you're just selfish, you don't really get criticized for that very much. It's like, yeah, obviously. It's called the Copenhagen interpretation of ethics: if you're a billionaire and you try to solve a problem and you fail to solve it, you get way more shit than all the billionaires that did nothing.
00:47:53
Speaker
This is, again, humanity shooting itself in the foot so abysmally hard. You should get credit for trying to solve a problem even if you fail, but the exact opposite happens. A large part of why people take so few risks is that trying and failing gives you more social punishment than not even trying. So this is a massive problem.
00:48:17
Speaker
So when I think about government grants, yeah, I expect most of it to go to bullshit, and I expect most of it not to work. But if I wanted government grants, the way I would want them to go is DARPA-type. DARPA is very different from how other grant-making works. If we do the other grant-making, also fine, I think there are ways that would help. But the ideal case would be DARPA-like: high risk, weirdness, because we don't know how to solve alignment.
00:48:45
Speaker
Alignment is not low-risk, like, man, we just need this amount of money to build the alignment machine and then we're fine. No, no, no, this is blue-sky research. If you look at current alignment approaches, some are pretty simple and reasonable; others involve retrocausal, acausal decision-theoretic, multiversal decision theory simulations or whatever. And you're like,
00:49:15
Speaker
is either of these going to work? I don't know, probably not. But sure as hell someone should fund them. Someone should try. Is there a way to earn money by solving alignment? I mean...
00:49:29
Speaker
Depending on what you mean by that, obviously, yes, in the sense that if you solve alignment, you make infinite money. Obviously, you've solved everything. The thing holding back humanity's progress, the limiting factor, the bottleneck on human economic progress, not the only bottleneck, but the biggest bottleneck, is intelligence. If everyone was just twice as smart, oh man.
00:49:58
Speaker
Could you imagine? Could you imagine the median human at, like, 200 IQ? Could you imagine what society would be like? All the policies, the efficiency, all the science, the social systems we could build, the coordination technology we could implement. Could you imagine? And this is just a modest increase in intelligence.
Impact Grants for AI Alignment
00:50:24
Speaker
You know, if everyone was, and this is still in the human range, 200 IQ is still within the human range, there are people that are that smart, you know?
00:50:31
Speaker
And that's still not anywhere near as good as things could be if we had AI or superintelligence running things and developing technology and coordinating and doing economic activity and whatnot. We could have everything. If you have an AGI and it does what you want it to do, I mean, please, tell it to cure cancer. Tell it to trade infinite money on the stock market. Tell it to just...
00:50:57
Speaker
No more wars, please. Just, like, go negotiate with everybody and solve all politics.
00:51:02
Speaker
Yeah, I was simply interested in what the best funding model for solving this problem is: whether it's nonprofits or government grants, or whether it could be a for-profit company or some new legal construction, like capped-profit, OpenAI-style. Great question. This is something I've obviously thought about a lot. If we were a smart society, we would have impact grants. What is an impact grant?
00:51:33
Speaker
I don't know all the exact details. It's a new funding instrument where basically you sell impact certificates, I think that's what it's called. Actually, maybe I'm thinking of impact markets, not grants, one or the other. Basically, you're creating a new charity: it will work the following way, it involves the following people, and now you sell shares in the impact.
00:51:55
Speaker
You say, I'm selling 100 shares, or a million shares, in the philanthropic benefit of this existing. Then people who think this is good and should exist can buy the shares to fund your operation. This would be an example of something a smart society would have; this would be super common. And then the price of these certificates rises when there's more impact from the organization whose shares we're talking about.
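As a toy sketch of the mechanism being described (purely illustrative; real impact-certificate and impact-market proposals differ in many details, and the class and numbers below are made up):

```python
# A toy impact-certificate mechanism: sell shares in a project's impact up front,
# then let a later impact assessment move the share price.

class ImpactCertificate:
    """Shares in the philanthropic impact of a project."""

    def __init__(self, project, total_shares, price_per_share):
        self.project = project
        self.total_shares = total_shares
        self.price_per_share = price_per_share
        self.holders = {}        # buyer -> number of shares
        self.shares_sold = 0

    def buy(self, buyer, n_shares):
        """A funder buys shares, providing up-front money to the project."""
        if self.shares_sold + n_shares > self.total_shares:
            raise ValueError("not enough shares left")
        self.holders[buyer] = self.holders.get(buyer, 0) + n_shares
        self.shares_sold += n_shares
        return n_shares * self.price_per_share  # money raised by the project

    def reprice(self, assessed_impact):
        """Later, an impact assessment moves the certificate price, rewarding
        early funders of projects that turned out to matter."""
        self.price_per_share *= (1.0 + assessed_impact)


cert = ImpactCertificate("alignment-research-lab", total_shares=1_000_000,
                         price_per_share=1.0)
raised = cert.buy("philanthropist_a", 250_000)
cert.reprice(assessed_impact=0.8)   # project judged to have had real impact
print(f"Raised ${raised:,.0f}; shares now worth ${cert.price_per_share:.2f} each")
```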
00:52:23
Speaker
Yeah, and then you can do more. This is just an example of a simple system, again showing just how far humans are from smarter societies that would have even better systems, even better credit allocation and common-goods funding for research and so on.
00:52:44
Speaker
Let's take one objection to impact grants, which is to question whether we can measure impact appropriately and objectively enough for these grants to work, so that there's information out there that we can objectively trade on. Do you think that's possible, or likely to happen? In a smart society, of course that's possible. But we're not in a smart society. This is an example of a technology that only works in a smart society. It doesn't work in ours. I don't think impact markets work in our society.
00:53:13
Speaker
They just don't. It's not that they fundamentally can't work; they just don't work for practical, contingent reasons that could be overcome in the future. But the way things currently are, no, they just don't work. So what do you do if you actually want to be pragmatic?
00:53:29
Speaker
You've got to fund it somehow. For listeners who don't know, I run an AI alignment research company, Conjecture, and we are a for-profit company, and that's not a coincidence. The reasoning is quite simply that this is currently, I think, the best form for raising large amounts of money and having an ongoing supply of money to fund research and do these kinds of things, if you don't have some crazy billionaire backing or something.
00:53:55
Speaker
Even there, there are problems like diversification and such. The truth is that if you look at how our society allocates resources, money, power, et cetera, currently, and again, this is a contingent truth, not an ideological statement, it's just how reality is currently set up and it could change in five years,
00:54:16
Speaker
the vehicle through which you are currently most likely to go from zero dollars to a billion dollars in a short period of time is a startup. This is a contingent truth about how our markets are currently set up. Fifty years ago that was not true, and maybe five years from now it won't be true. But currently,
00:54:37
Speaker
VC-funded startups that build software-based products that are useful and scale very, very quickly to very, very many users are the most effective way to gain very large amounts of resources.
00:54:51
Speaker
Unless you happen to have some other weird scenario, but those aren't scalable for the most part. Also, markets are a way more robust and way more diversified source of funding than, for example, having one billionaire as your patron. That might be great, and if some billionaire wants to come by and hand us a billion dollars, happy to talk, but,
00:55:19
Speaker
Practically, as we saw, for example, with the FTX scenario, there are blow-up risks, let's say. Yeah, and so perhaps the main objection to the for-profit model is that the incentives won't be properly aligned to do the actually societally beneficial thing. You will be pushed into doing the profitable thing as opposed to the good thing.
00:55:48
Speaker
Yeah, but that's not a property of for-profits. That's a property of how we as a society assign credit. If we had impact markets, if we had benevolent billionaire patrons or whatever, then we would be assigning credit differently as a society. Money is fundamentally a credit assignment mechanism. It is a reinforcement mechanism. It is a mechanism that gives subparts of a computational graph the ability to do more computation, to take more actions.
00:56:15
Speaker
We reward people, ideally, by... well, credit assignment is a fundamentally hard problem. Credit assignment is extremely, extremely hard, and there's a massive rabbit hole there. And the fundamental
00:56:33
Speaker
insight, from mercantilism first and later from capitalism, is how we do this credit assignment.
Capitalism and Alternatives in AI Credit
00:56:39
Speaker
There were other economic systems that did credit assignment differently. The fundamental progress of capitalism is the idea of how we assign credit to people's capital or labor in certain ways. You can agree or disagree about whether this is the right way to do it, but it has been very efficient; it's the way things currently work, and it's the most efficient system we currently have.
00:57:03
Speaker
Now, capitalism is, in many ways, natural. You give people money for things you want; trade is a very natural, fundamental thing to build a credit assignment system on top of. Not perfect. For example, capitalism has large problems pricing externalities.
00:57:18
Speaker
And the commons; this is a failure mode of capitalism. So I would expect that a very advanced alien society would not be capitalist. They would definitely not be socialist either, but they would be some third thing: they would have some kind of prediction-market and commons-trading based systems, some Robin Hanson-designed economy or whatever.
00:57:40
Speaker
So the fact that these incentives exist is contingent. A not-stupid society would have different mechanisms; they would have different incentives.
00:57:55
Speaker
There's a great essay, I forget who wrote it, along the lines of: incentives aren't the problem, you are. If you have bad incentives and you act on them, well, you're bad. You did the bad thing. Sure, there's some amount of exculpation here, in the sense that you can say, well, I didn't have full freedom here, there were incentives, blah, blah, blah. But ultimately, you took the action, dude.
00:58:20
Speaker
It's true that there are systems that are so corrupt, so 1984, so oppressive, that you're just fucked. In that case, yeah, you're fucked, obviously. What do you want me to say? If you're controlled by some crushing market force or authoritarian regime or something, and every time you try to resist you get shot or you starve, then yeah, you just die, you just fail. Obviously. But it's
00:58:51
Speaker
surprising that we have as much freedom as we do. People are allowed to spend their resources in weird ways. People are allowed to be eccentric to a large degree, not infinitely, and to very different degrees depending on the person. The fact that Elon Musk is allowed to exist, I mean, just look at the guy.
00:59:15
Speaker
You must be a pretty high-IQ society to allow something like this to go on. I'm being a bit snarky, but he's so weird, he's so erratic, and he does so many crazy and potentially dangerous things, yet he has so much power and he's still allowed to have this power.
00:59:34
Speaker
In an authoritarian regime, this shit wouldn't fly. If he were Chinese, that shit ain't going to fly; they're not going to allow something like this. So in a sense, I'm saying that the best we have in a stupid society is just capitalist freedom. It's not perfect. It's very bad, actually. It's quite stupid. But it is the best we have.
01:00:00
Speaker
What's the alternative? One alternative is, okay, you do a non-profit, you have no money, you die, game over. The other alternative is you have a patron. First of all, that assumes you can find a patron. Second, well, the patron probably got his money through capitalism, so he's spending his weirdness, his capitalism points, on you by proxy.
01:00:21
Speaker
And now you're also tied to him. So now you have the blow-up risk of your billionaire buddy, and in a sense, the incentives the billionaire exerts on you are also extremely powerful. Maybe Dustin Moskovitz or whoever is a great guy, but there are a lot of them who are not.
01:00:41
Speaker
What do you think of companies tying themselves to the mast by having windfall clauses, where, for example, if they successfully develop AGI, they have some clause stating that they will distribute the profits from this venture in a, say, fairer way after they have returned money to investors?
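As a rough illustration of the kind of rule such a clause might encode, here is a minimal sketch with entirely hypothetical numbers and terms (investors repaid first up to a capped multiple, the remainder pledged to broad distribution); real windfall clauses differ in their details.

```python
# Toy sketch of a windfall-clause payout (hypothetical terms, illustration only).
# Investors are repaid first, up to a capped multiple of what they put in;
# whatever windfall remains is pledged to broad distribution.

def windfall_split(profit: float, invested: float, investor_cap_multiple: float = 10.0) -> dict:
    investor_cap = invested * investor_cap_multiple
    to_investors = min(profit, investor_cap)
    to_broad_distribution = max(profit - investor_cap, 0.0)
    return {"to_investors": to_investors, "to_broad_distribution": to_broad_distribution}


if __name__ == "__main__":
    # e.g. $1B invested and a $1T windfall, with investor returns capped at 10x
    print(windfall_split(profit=1e12, invested=1e9))
```

The arithmetic is the easy part; as the answer below points out, the binding question is who could enforce such a clause.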
Windfall Clauses and Profit Distribution
01:01:05
Speaker
Definitely a cute marketing stunt. But not substantially important.
01:01:11
Speaker
Look, let me put it very simply. I run a company, I'm the CEO, and sure, there are things you can do. You can have a board, have shareholder meetings, all these kinds of things. But look, if my co-founders and I decided, hey, we're going to fuck up this company, nothing could stop us.
01:01:36
Speaker
If someone finds the Philosopher's Stone and he has a handgun, what are you going to do? You can complain, oh no, wait, we have a contract, you said you would give me the Philosopher's Stone, but if the person with the Philosopher's Stone and the handgun is sufficiently determined not to honor it, you're just screwed. So sure, you can sign these contracts, and maybe that makes people feel better, but...
01:02:03
Speaker
It's very funny, because if you asked this question of someone from the Middle Ages or something, they would laugh in your face. They'd be like, what? The king signed a contract with you? So what? Who's going to stop the king? So I think the thing about power is that it depends on how the situation goes, of course. Now, to turn the cynicism dial down just a little bit from here:
01:02:31
Speaker
I do think the people doing this are, to a large degree, actually being quite genuine. Not all of them; for some people, this is really just safety-washing. But some of them are being quite genuine, really trying to make things work, and I respect them for that. I think it's really nice. So I think the biggest value of these things is more in signaling that they tried, that they're trying. But of course, any genuine signal of honesty will quickly be co-opted by anyone who's not honest.
01:02:59
Speaker
But it might be an interesting first step. It might be a milestone or a symbol where we say, this is what we intend to do. And perhaps the actual attempt to do that thing comes later, after you've signaled that this is what you want to do.
01:03:19
Speaker
I mean, sure, these are all coordination and signaling mechanisms, so seen from that perspective, yeah, of course they're valuable. I'm being very cynical right now, partly because I'm in podcast mode. But signals matter. Reputations matter. Honor matters. These things do matter. But we should not delude ourselves here: people lie.
01:03:48
Speaker
I'm sorry if this is shocking to any listeners, but people lie a lot, all the time, and they change their minds. A lot of people promise very big things when good times come, but once war comes, suddenly you see how people really are. So I'm interested in these mechanisms from the perspective of
01:04:09
Speaker
what they say about the people. I know some of these organizations, and I know some of the thought process that went into these things, and I'm like, wow, you really tried. That's heartwarming. I actually feel better about you as a person; I trust you more now. There are other people where I'm like, oh yeah, this is just a marketing stunt. I do not trust you more.
01:04:28
Speaker
So the interesting thing for me is what it says about the people, or what it signals about the people, and also what it doesn't signal about them. I'm not optimistic about the legal mechanisms. I would love it if they worked. The problem is that legal mechanisms like these require enforcement power.
01:04:50
Speaker
And if you have AGI, yeah, who's going to enforce that, exactly? Mm-hmm, point taken. All right, perfect. Let's end it here and then perhaps move into the semi- or pseudo-lightning round, as I call it.