Introduction to Hosts and Guests
00:00:00
Speaker
Welcome to the Future of Life Institute podcast. My name is Gus Docker. This is a special episode of the podcast featuring Nathan Labenz interviewing Jaan Tallinn. Jaan is a co-founder of Skype and a co-founder of the Future of Life Institute. Nathan is the founder of Waymark, which is an AI video creation company, and he's also the co-host of the Cognitive Revolution podcast.
00:00:25
Speaker
I really enjoy listening to that podcast because of Nathan's careful approach to discussing both the enormous upside and the potential risks of AI.
00:00:35
Speaker
If you're new to that podcast, I recommend starting with episode 15 on GPT-4 and economic transformation and episode 31 on the moats of top AI companies. And without further ado, here is Nathan Labenz interviewing Jaan Tallinn.
Jaan Tallinn's Career and Impact
00:00:53
Speaker
Our guest today is Jaan Tallinn.
00:00:56
Speaker
Jaan is a technologist, entrepreneur, and investor whose unique life journey has intersected with some of the most important social and technological events of our collective lifetime. Born in 1972 in then-Soviet Estonia, Jaan was 17 years old when the Berlin Wall fell, and he quickly became a video game entrepreneur.
00:01:17
Speaker
Years later, he created Kazaa, the famous P2P file sharing platform that, at its peak, accounted for half of all internet traffic. From there, he went on to co-found Skype, which eventually sold to eBay in 2005 for $2.5 billion, and for years remained the most successful internet company founded outside of the United States.
00:01:43
Speaker
Circa 2009, Jaan came across Eliezer Yudkowsky's AI risk writing, which he found extremely persuasive and which inspired him to dedicate his time, resources, and personal credibility to existential risk mitigation with a particular focus on AI.
AI Risk Mitigation Efforts
00:02:00
Speaker
Jaan has since invested in nearly 180 startups, including dozens of AI application layer companies and some half dozen startup labs that focus on fundamental AI research. Those include DeepMind, Anthropic, and most recently, Conjecture. He's done all this in an effort to support the teams that he believes are most likely to lead us to AI safety and to have a seat at the table at organizations that he worries might take on too much risk.
00:02:29
Speaker
He's also founded several philanthropic nonprofits, including the Future of Life Institute, which recently published the open letter calling for a six-month pause on the development of AI systems more powerful than GPT-4. With so much happening in AI right now, I decided to touch on Jaan's personal story and to discuss Eliezer's baseline AI safety worldview only briefly in the first part of today's conversation.
00:02:55
Speaker
Instead, we focused on the current state of AI development and safety, including Jaan's expectations for possible economic transformation, what catastrophic failure modes worry him most in the near term, how likely he believes next-generation systems like GPT-5 are to literally end the world, how big of a bullet we dodged with the training of GPT-4,
00:03:19
Speaker
whether in some sense we are lucky that language models are softer and slower than alternative AI paradigms, which organizations really matter for immediate-term pause purposes, to what extent those organizations are currently coordinating or slowing down already, how AI race dynamics are likely to evolve over the next couple of years, what Jaan and his team hoped to accomplish by calling for a six-month pause, and finally, how it's gone and how he's feeling about it all now.
00:03:49
Speaker
If nothing else, I hope this conversation makes it clear that the pausers are not merely Luddites who have never built and don't understand technology. On the contrary, Jaan's personal achievements, world-class investment portfolio, and evident optimism for an AI-enabled future, should we manage to build one safely, show that at least some of our most sophisticated and accomplished thinkers take existential risks from AI extremely seriously.
00:04:19
Speaker
With that, I hope you enjoy this conversation with Jaan Tallinn. Jaan Tallinn, welcome to the Cognitive Revolution. Thanks for having me. Really excited to have you. You have been a, I think, quiet but major player in the development of AI over the last 10 or so years now. And I want to give people just a very quick kind of overview of who you are and the role you've played. And then kind of jump to the future, which is the present.
Influence through AI Investments
00:04:47
Speaker
and talk about all the things that have happened in the last few months, as well as the call that you recently participated in putting out as part of the Future of Life Institute to call for this six-month pause in the development of large-scale models. So, a lot to cover. The world is moving faster than ever, it seems. But maybe just give us a little bit of an intro to yourself as
00:05:13
Speaker
an investor in AI companies. You can tell a little bit if you want about the story of how you came to be in position to invest in AI companies, but really super interested in how you have managed to become an investor in so many leading companies and the philosophy that supports that.
00:05:29
Speaker
So skipping over the period of becoming an entrepreneur, running my own games company, then getting into development of peer-to-peer technology that culminated with Skype. And then at the end of my Skype career, stumbled upon Eliezer Yudkowsky's writings and going like, holy hell.
00:05:50
Speaker
What is the world that I've been born into? And having a meeting with Eliezer almost exactly 14 years ago.
00:06:00
Speaker
where I tried to poke at his arguments, didn't find any holes, and then I thought, okay, how can I help? Sent them some money, but I think more importantly started taking those arguments, turning around, and presenting those arguments to people who would want to have some brand behind the person who is making the arguments.
00:06:27
Speaker
That's how basically the Cambridge Centre for the Study of Existential Risk got started, where I convinced my co-founder Huw Price there that these topics are important. Max Tegmark, I think he already was very primed to this argument, but that's how the Future of Life Institute got started.
00:06:46
Speaker
And the other strategy that I deployed was, okay, I already was a bit of an investor and I thought that perhaps I could use my brand to
00:06:59
Speaker
sort of keep a foot in the door in various companies that are developing potentially dangerous things. So I did invest in a bunch of AI companies just to... I mean, I always had this dilemma of not wanting to directly accelerate them. So I tried to not be a majority investor or anything.
00:07:26
Speaker
but just enough to have a voice. With DeepMind, I actually had to walk up to Demis at a conference. That's how we started talking and eventually became friends. I still catch up with him every second time I'm in London or so.
00:07:44
Speaker
But once I was known as an investor in DeepMind, and eventually a board member, getting the ear of other AI companies became easier. So I just worked my way up, so to speak.
00:08:00
Speaker
Yeah, I think a lot of VCs would be extremely envious of your deal flow. So I want to get back to that a little bit more in a second. But let's just go to the Eliezer moment for a second. You said this was 14 years ago. So this takes us back to circa 2009. At the time, the deep learning revolution hasn't even really started yet.
00:08:24
Speaker
It's a, you know, at that point, a kind of highly, well, this may not, you may object to this, but I would say for me, I read it as a highly speculative, yet very compelling thought about what might happen. And, you know, the arguments were, there was a lot of detail to be filled in where it was like, well, we have this, you know, insane amount of compute and like, we're probably gonna figure out how to use it. And then that probably goes very bad for us.
00:08:49
Speaker
So how did you understand it? What do you think is kind of the strongest version of that original argument? And then what have been the biggest changes to that worldview in the intervening time?
AI's Potential to Surpass Humans
00:09:00
Speaker
Yeah, I mean, there are many ways to frame things, frame the problem. Sometimes I've been kind of asking people two questions: A, can you program, and B, do you have children? And then I get like four different kinds of framings or approaches with which I can explain the situation with AI.
00:09:19
Speaker
One simple argument is that there is a reason why chimpanzees are not determining the future and haven't been determining the future for a long time, if ever. Humans are, but perhaps not for long because we are working furiously to get rid of that advantage that we have as
00:09:44
Speaker
like, the apex species on this planet. Once you realize that AI will
00:09:56
Speaker
likely not stop at human level. There is this unfortunate narrative, especially that's very widespread in Asia, where a lot of people think that we are going to make AI smarter and smarter up to the point where it becomes conscious and then it's just like us.
00:10:17
Speaker
Then it's just like other people and we need to integrate them, give it voting rights and whatnot. Whereas I think this is just a completely illusory tale. It will probably not be conscious. It will just be very competent, and competence and consciousness might be related somehow, but probably not. So we will have just
00:10:42
Speaker
control over the future yanked from our hands. So that, I think, is for me a compelling enough story. Yeah. So the linchpin there is:
00:10:53
Speaker
We're the boss of the world because we're the smartest thing around. And if we change that, there's a pretty good chance that we may not be the boss of the world anymore. And not only that, but we really, at this point, as things are starting to come online, we don't have a great understanding of what the new boss would look like or what it might want, or even how to conceptualize things like want in the context of its internal workings.
00:11:21
Speaker
Anything you would object to in my very brief extension, and then how has your mindset shifted also from the largely theoretical 2009 Eliezer arguments versus today where we're in this world of
00:11:40
Speaker
Large language models, obviously, but also, you know, increasingly multimodal large language models and, you know, agent-style systems like Gato, you know, that can do all sorts of things. How has the actual development of the technology changed how you think about it?
00:11:55
Speaker
Yeah, so many things to say about that. First of all, I think I just agree with the way you phrased things. Sometimes I've been saying that, look, we are seeing the tail end, possibly the last years, of something like a hundred-thousand-year period during which humans were the boss on this planet.
00:12:16
Speaker
And it could be even more extreme. It's unclear if evolution will continue, if self-replicators will continue, once you have an AI that is just completely taking the solar system apart down to the atomic level and rebuilding it, and the rest of the universe. So it might even be the tail end of a four-billion-year period. So how my thinking has changed, yeah, it's
00:12:43
Speaker
There has been this abstract argument that if we just continue on this trend, we're accelerating towards a cliff. I think the current situation is that we seem to be starting to see the
00:12:59
Speaker
shape of the cliff through the fog. It's possible that it is still a mirage and false alarm and things will level out and we need some new paradigms. But the current situation seems more likely than not that this is it. When it comes to
00:13:22
Speaker
the general trend, I think it has been very unfortunate in AI research, with some silver linings. The unfortunate trend has been that we have gone
00:13:34
Speaker
from more transparent, more understandable paradigms to less and less understandable paradigms. We went from things like expert systems. They're like, by definition, they were super understandable. People were just interviewing experts and trying to hand-code the rules by which experts were making decisions into a machine. That was like the 80s. It was a really big thing in the 80s.
00:13:58
Speaker
Then we went to supervised learning where people were just labeling data in different domains, trying to distinguish numbers. This is where deep learning started to shine first. Now we are in unsupervised learning. We don't even care much about what data we throw. We just throw a lot of data.
00:14:21
Speaker
at AI and ask it to just figure out what kind of universe it is in, what kind of heuristics it should apply, what kind of skills it needs to learn in order to predict the next token. I call it the summon-and-tame paradigm, which is like, you just use these multi-hundred-million-dollar
00:14:44
Speaker
experiments to summon an uncontrollable mind. Then you look at what it looks like and try to tame it. This kind of works if the mind is not very powerful, but it might not work for very long. Let's go back and just touch on the investment side for a second because I think this will help people understand the point of view that you have. It started with a series of blog posts in 2009, but now you're really quite the AI insider.
00:15:14
Speaker
You did a recent interview where you kind of ran through your investment portfolio in more detail, but I thought it was interesting how you split it into kind of two categories, one being like the fundamental AI research type company that you've invested in. I believe there's a half a dozen of those, and then there's kind of the application layer companies, and it sounds like there are
00:15:36
Speaker
dozens, maybe 50 plus of those. It seems like the big research companies would be the ones that would give you more insight into what's going on and what matters most right now, but maybe that's wrong. Could you just give a quick run through of some of the highlights of the portfolio and we can get a little sense from that of all the different angles that you have on AI
Economic Transformations by AI
00:16:01
Speaker
Yeah, I'm actually not the best person to talk about my investments because I have mostly delegated that away to a team of a few people. I still make the final decisions, but my focus really is philanthropy. When it comes to investments,
00:16:19
Speaker
Yeah, the fundamental AI research companies, I have specifically invested in those not to make money, but to have some kind of influence over what's happening inside those companies. They are in some ways adjacent to my philanthropy.
00:16:37
Speaker
And when it comes to applied AI, I think the prospects of applied AI companies are much worse now than they used to be before this large LLM paradigm. But of course, the LLM paradigm is very new. So there was no way to know that five, 10 years ago. But currently, I
00:17:01
Speaker
I think we have had this discussion, you and I, that the big problem with trying to build an applied AI company using the LLM paradigm is that you have to be ready for the rug being pulled out from under your next six months to a year of work by the next generation of LLMs, which is like a new crop of capabilities that have been bred. In a way,
00:17:29
Speaker
the more domain-specific the AI competence is, the more value there is in building application layers around that competence. Whereas if you just get these increasingly generally competent minds, it's much harder to build applications in a stable way.
00:17:52
Speaker
Yeah, one of the interesting things about doing this show and talking to all the people that we have is, not to spoil one of our closing questions, but we often ask what AI products people are using today that they recommend to the audience. And I have been really amazed by how few different answers we've heard.
00:18:15
Speaker
probably two thirds of people have said, well, basically just use ChatGPT. That's it. We get a couple other mentions, but it has led me to believe that the application layer faces some very serious challenges. It reminds me of other hyperscaling platforms that we've seen over the last couple of decades where you build around the edges of them, but the monopolist power is just so big. I do want to ask a little bit more about
00:18:46
Speaker
competing trends between centralization and decentralization, because I don't think it's obvious at all that it plays out as it did for Google and as it did for Facebook this time around.
00:18:55
Speaker
Let's just cover the flagship, maybe that's the wrong word, but the fundamental research company investments. DeepMind was the first. I know that you're also an investor in Anthropic and have supported Ought. I don't know if that's an investment or if that's just a donation. Conjecture is on that list.
00:19:17
Speaker
Whom am I missing on the list? And I'm also really interested in the conversations that you've had with founders. As somebody, you know, just given your statement, I'm sure you said the same to them: I'm not really doing this to make money, I'm doing it because I want to have your ear in case something important comes up. Like, how do people react to that? Do they say, yeah, that's great, I want you to be in that position to have my ear? Or are people sort of, uh, I don't know what to make of you? Yeah, I mean, maybe you're only investing in aligned people.
00:19:42
Speaker
Yeah, I found it really, in general, my pitch as an investor to deep tech companies is that, look, I'm investing my own money. I don't have a boss. I have a sizable philanthropic operation. If I can do good by walking away from profits, I can do that in a way that VCs are at least
00:20:06
Speaker
For them, it's harder to do that because they manage other people's money. For them, it's in some ways, LPs are their bosses. A, I will be on the side of founders if they feel uneasy. I'm not going to push you to take this defense contract or whatnot. This usually goes down pretty well with founders because it's true.
00:20:35
Speaker
Who am I missing on the list? We got DeepMind, Anthropic, Conjecture, Ought. Who else would you put in that fundamentals bucket? Ought, I'm actually not an investor. I have sent some philanthropic money their way though. I mean, Vicarious was like a long-time investment around the same time as DeepMind, then like a few other AGI groups that are not as well known, like curious.ai, for example.
00:21:04
Speaker
There is this improbable.ai, if I remember correctly, in the UK. Yeah, it's just like I have like 180 investments or something like that. So I don't quickly recall all the names from that. But yeah, Conjecture. I think very highly of Conjecture. In fact, whenever I go to London, I try to hang out in their office because they are
00:21:29
Speaker
They seem to be a group that has the highest respect for AI in a sense that this could be really dangerous and the danger is the important part here to focus on rather than whatever exciting commercial contracts we can squeeze out of it or something.
00:21:45
Speaker
So let's talk about that kind of emerging paradigm of danger. I mean, this has obviously been all over the discourse lately with the pause letter and Eliezer's Time piece. And I think broadly speaking,
00:21:59
Speaker
The public is extremely confused because on the one hand we have Eliezer and then on the other extreme, we still have people routinely saying like, this is all just hype and it'll never amount to anything, which seems crazy to me at this point, like almost like self-evidently, this is a big deal.
00:22:21
Speaker
But that is still out there. And for folks like my parents who don't rush to try ChatGPT, they're just kind of hearing all these different messages from the media. And it's all just very confusing. So let's start with maybe the neutral or ideally even positive side. People are throwing around AGI all over the place.
00:22:44
Speaker
a lot of disagreement, or you know, probably mostly implicit disagreement on what does that even mean? Maybe we could just start with like, what do you think AI is going to do for us in daily life? Well, we'll then extend to like the dangers that it can pose. But what is your kind of expectation for how AI is going to impact our lives over the next few years?
00:23:06
Speaker
I think it's really dependent on how capable the planet is in constraining the large scale experiments. Because if it turns out that we can't constrain them and slow them down, then we're just going to die. That's my
00:23:24
Speaker
fairly confident prediction. If we can pause, yeah, then a lot of interesting questions come up, because the current GPT-level crop of AIs will keep improving even if you don't do new generational experiments, breeding experiments, and even those could be super disruptive. For example, I
00:23:51
Speaker
I wouldn't want to be an art student in the year of like 2023 because like
00:23:59
Speaker
It's possible that the skills you're learning can somehow be pivoted into something that there will be societal demand for, but the answer could also be no, there won't be any demand for your skills. I personally see that extending to a great many domains. We just did a little episode on the possibilities for economic transformation. One of the things I'm trying to help people understand is I feel like right now we are in this
00:24:29
Speaker
kind of perfect little happy zone. You could call it like the Goldilocks time after the, I don't know if you know the Goldilocks story, but this feels like the level of AI power that is just right, perhaps.
00:24:44
Speaker
in that 90th percentile on the American bar exam, that's a really strong showing and that's base model GPT-4 capability. When you imagine what that can start to power when it is fine-tuned, when it is integrated with other systems, when it's
00:25:05
Speaker
able to take advantage of its ability, which we've seen demonstrated, to use tools. And that's not yet broadly deployed, but it certainly has been, I think, compellingly demonstrated. Then you add onto that an even bigger context window that very few have seen. And then on top of that, you've got the multimodal stuff like these
00:25:24
Speaker
The latest models will certainly be able to browse around the internet and understand websites and navigate and take actions online. It feels like that is enough on the positive side to create
00:25:42
Speaker
Transformation, really, economic transformation is kind of my baseline scenario at this point. And we're just at the beginning of the engineering phase of that, the deployment phase, the social figuring out of how it's all going to integrate. And it feels like that could be really amazing. And yet, at the same time, it seems like still pretty safe to say that it's
00:26:07
Speaker
limited enough in power that it won't become an out of control problem at this level.
00:26:13
Speaker
So I think that is one of the things that frustrates me most: when people who focus on AI risk also dismiss the power. Because I'm like, you're undermining your own message there. If you dismiss what it can do, then nobody's going to worry about what you are worried about, you know, that it might do. So let's be very clear on just how capable the systems are. So I like your comment about Conjecture having the highest respect for AI. I think that's something I try to cultivate in myself as well.
00:26:44
Speaker
Do you see that any differently from me? Does it feel to you like what we have is enough for economic transformation? Where do you think we are in that?
00:26:54
Speaker
I have a lot of confusion about how the economy works in the first place because I know that there are jobs whose main purpose is to make the boss feel more important. I don't think these jobs are very vulnerable to AI disruption because the boss would be less important if that underling were replaced by AI.
00:27:19
Speaker
But I don't know how typical that kind of job is in the human economy. And also, people like Eliezer have pointed out that they just don't expect any changes from AI before we all die,
00:27:33
Speaker
because the rules and regulations in the economy have constrained everything to the degree where you just can't have innovation that is going to leave a significant mark on GDP, or big changes in construction or something like that. Perhaps that's wrong, but I have significant uncertainty here.
00:28:01
Speaker
I definitely wouldn't be confident that we're going to get massive economic disruption from the current crop of AIs, but it's very plausible that we would, yes. Yeah. I think what you said about just how much time there is for the
00:28:17
Speaker
transformation to play out definitely makes sense to me. We're on, I've started counting time since the official release of GPT-4. So we're at four weeks and one day into the GPT-4 era as of today. And I do think it's really worth just kind of reminding ourselves and grounding ourselves in the fact that no previous system that the public had any access to
00:28:47
Speaker
could really do the sorts of high value tasks that GPT-4 can just do. And so we're literally, there's been a lot of growing awareness, there's been interesting use cases, there's been like copywriting assistants that have made a lot of money, but there was not an AI on the market until a month ago that had any plausible chance of like giving you quality legal advice or quality medical advice.
00:29:15
Speaker
And now that is there. And again, we're just so early in starting to figure out how to use it. So it does seem like that takes a little while, kind of unavoidably. And I just want to remind the listening audience more so than you, that window has just opened. And we have no idea what's about to start coming through it economically, let alone in terms of
00:29:41
Speaker
alien AI overlords. So turning then to the kind of things that you worry about, I think this model of AI strength kind of proceeding through, you know, what appears to be a smooth loss curve, but
00:29:57
Speaker
What actually seems to be happening under the hood is like all these little thresholds of unlocking different discrete capabilities kind of being passed one by one and all of that kind of aggregating. Yes, I love that paper. All that kind of aggregating to a smooth curve, but actually being like all these little discrete bits. I think that's a really helpful frame.
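As a rough illustration of that frame, here is a toy simulation (my own sketch, not from any particular paper): each of many tasks unlocks abruptly at its own compute threshold, yet the aggregate metric still looks smooth.

```python
# Toy illustration: many discrete per-task capability unlocks, each a step
# function of compute, average out to a smooth-looking aggregate curve.
import random

random.seed(0)
thresholds = [random.uniform(0, 100) for _ in range(1000)]  # per-task unlock points

for compute in range(0, 101, 10):
    # each task contributes loss 1.0 until its threshold is crossed, then 0.0
    aggregate_loss = sum(1.0 for t in thresholds if compute < t) / len(thresholds)
    print(f"compute={compute:3d}  aggregate_loss={aggregate_loss:.2f}")
# Individually every task jumps; averaged over a thousand of them, the printed
# curve declines smoothly from ~1.0 toward 0.0.
```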
00:30:19
Speaker
But I wanna ask you like, what are the things that you kind of most worry about? If you could try to make this somewhat vivid for people, what are the big thresholds that you're like, man, I don't know when, but an AI crosses that threshold and we're in real trouble. Like what are those and how does that play out in your mind?
Human Oversight in AI Development
00:30:37
Speaker
It's possible that there are many such thresholds that we should be worried about. One neutral frame to describe what AI is, is that it's an automated decision-making machine that is A, non-human, and B, it is getting increasingly competent by day.
00:31:00
Speaker
As every leader knows, whenever you're delegating something, you're also giving up some control over the outcome. With that frame, there could be many domains where in order to remain in charge of what happens next, we should not delegate it to nonhumans.
00:31:19
Speaker
So we should have, as they call it, a human in the loop. But the most obvious one that I can think of, where we are already rushing to delegate things away, is AI development. So once you have AIs, LLMs, that are able to develop AI better than any human researchers can, then basically we have the most capable systems on this planet
00:31:45
Speaker
appearing without any human help and possibly without any human consultation, and then, basically, good luck, humanity. Yeah, that's interesting. I thought you were maybe going to say the deception threshold, which is one I hear thrown around most.
00:32:08
Speaker
I mean, it's funny, it's striking for one thing that like, that's kind of OpenAI's explicit plan. You know, their fairly high-level, I would say, plan for AI alignment involves ultimately having AIs kind of supervise themselves and, you know, refine the dataset and hopefully bootstrap into something good.
00:32:27
Speaker
That has never really reassured me that much either. And Anthropic is also doing something. Yeah, constitutional AI specifically. Although this is kind of AIs constraining AIs rather than AIs developing next generations of AIs. I think it's important to distinguish between those two frames.
00:32:48
Speaker
One big threshold is AI designing, training the next generation of AIs. Pretty, hopefully intuitive to see for people how that becomes potentially a runaway problem that we don't have great control over. The deception threshold, you know, kind of outer inner alignment mismatch seems like one that a lot of people worry about just as much. Any personal thoughts on that one that you want to share?
00:33:17
Speaker
Yeah, I think one thing that the alignment community has learned over the last decade is that the shape of the alignment problem has become much clearer. For example, indeed, this inner outer alignment dichotomy is something that, at least myself, I had no idea about, just this idea that
00:33:43
Speaker
deep learning paradigm and machine learning paradigm in general is training AIs by picking essentially random minds out from behavioral classes. You're not selecting AIs based on what they want. You're selecting AIs based on how they behave.
00:34:06
Speaker
and there could be many, many motivational structures behind giving certain particular behavior. The most scary one is basically, yeah, realizing that it is being trained and then just acting out the goal that you're training it for in order to be selected and eventually escape the box.
00:34:33
Speaker
Yeah, I think that one is hard to get around to just from the simple observation that we're not super reliable, you know, anybody who spent a significant amount of time trying to validate language model output, even just for like a, you know, relatively run of the mill application, like I've done this at Waymark, right, we're making
00:34:55
Speaker
marketing video content for small businesses. Really, all the stuff we create with language models, the main thing is write a script for a short commercial for a small business. It's a pretty narrow domain of space that we need to evaluate, and yet it remains a real challenge to figure out, is this model better than this one? We do a fine tune, how does it compare to the last fine tune? You're getting all these different outputs and
00:35:25
Speaker
It's just tough. The distributions are overlapping. The rate at which the new model is preferred to the previous one is often fairly low. I've seen published results as low as 11 to 9 ratio, where one is preferred to the other. Even GPT-4 to 3.5 is just
00:35:44
Speaker
70-30 in terms of preference. Like, still a full third of the time people prefer 3.5 in the head-to-head comparison, which kind of blows my mind given how qualitatively better GPT-4 seems. So that's just the general problem of validation. But then you add into that mix that we have all these, you know, heuristics and biases that are exploitable, we have these kind of, you know, cognitive gaps that have kind of
00:36:13
Speaker
lingered in our own systems. And evolution never had a real reason to eliminate all of them, or it just hasn't got around to it yet. And so we're exploitable. Everybody knows that in our daily life. We know that people at a minimum will tell us little white lies to make us feel good, or just to get through a situation a little bit easier. Do you see any promising route to avoiding that sort of
00:36:42
Speaker
exploitable evaluator problem? Short answer is no, but it's very much like open research question. On a theoretical level, indeed, you would want to somehow hitch a ride on the increasing capabilities of AI.
00:37:01
Speaker
when it comes to somehow making it more reliable or more constrained, more predictable in general. I hesitate to say more aligned because my model of Eliezer was like, no, no, no, you don't point AI towards alignment. That's just a silly thing to do.
00:37:25
Speaker
AI is going to get more capable. Can we somehow get something out of it that is scalable rather than kind of ending in a predictably bad place? It might be worth just spending a little bit more time too on, again, just kind of how these things might play out. I think Eliezer has spoken very
00:37:50
Speaker
interestingly, compellingly about what happens when you go outside of your training distribution. And for humans, he just points out that basically everything in nature is optimized for reproduction, inclusive genetic fitness.
00:38:08
Speaker
And yet the behavior that we observe in ourselves in the modern environment does not appear to be about maximizing our reproduction at all.
Unpredictability in AI and Evolution
00:38:20
Speaker
And in fact, we didn't even know that that's what we had been optimized for until relatively
00:38:26
Speaker
recently. So we're out here kind of doing whatever we're doing. It took like a few random geniuses to figure out how we had actually kind of been created by nature and that has had relatively little impact on what anyone has actually done in their day to day lives. So would you add anything to that story or observation? Yeah, I mean, just to be more precise, I think we are selected.
00:38:55
Speaker
for kind of inclusive genetic fitness ability to reproduce. And again, because of the same problem that machine learning faces that we can only select based on behavior or based on results, that selection kind of effectively pulls in like a random instantiation
00:39:21
Speaker
of capabilities and motivations that just happen to give you this particular behavior without having any fine-grained control over what these motivations and capabilities actually are. So yes, evolution ended up pulling us
00:39:41
Speaker
selecting us in this ancestral environment where we had developed a bunch of heuristics that were very useful for reproduction in that ancestral environment, but much less so in the modern environment, without ever
00:39:57
Speaker
actually ingraining in us any fundamental understanding of what we're being selected for. The very same process might just get replayed as we are selecting AIs based on behavior and without any insight into their inner workings.
00:40:23
Speaker
So we could spend hours unpacking all this. I know you have done that many times. So we will bracket that for the moment. We've got all these different failure modes. We've got potentially runaway AI training its own successors in a way that is not clear to us. We've got the deception problem. We've got the fact that
00:40:46
Speaker
We have no reassurance or no reason to believe really at all that the goals that we have for AI will be like represented internally. And so, you know, with
00:41:01
Speaker
a sudden jump in kind of the domain in which the AI can operate, it can be totally outside of training distribution and who knows how it might act, just like who would have expected how humans might have acted from the ancestral environment. So all these things are pretty big conceptual problems. We don't have good answers to them at the moment. What do you think that kind of, how does that boil down to
00:41:29
Speaker
a simple worldview for you. What are the odds that you see right now of serious catastrophe happening in, say, the next two, five, 10 years? Maybe we could segment that into, given the trajectory that we're on, versus how we might be able to shift that if, for example, we took a pause.
00:41:53
Speaker
My current estimate for life-ending disaster is basically 1-50% per generation, per 10xing of compute that's being thrown at these experiments.
00:42:10
Speaker
Currently, they are 10x-ing things in a six-to-18-month window, so you can calculate from there. At some point, we're going to run out of compute because there are only so many 10x-ings you can do. We probably can't do thousands of those, but still,
00:42:34
Speaker
let's say something like a geometric mean of 1% and 50% is 7%. So it's a 7% risk to everything. If we continue doing those, we probably can still do something like five or six of those, and at that point we are more likely dead than not. Worth taking a second to just let that sink in. Would you have put that same estimate on GPT-4? Like, do you think we just survived a 7%
00:43:05
Speaker
x-risk event with the training of GPT-4? That is a great question. With hindsight, I'm super anchored now. I really want to say no. But again, 7% is this point estimate. Really, my uncertain range is from 1% to 50%.
00:43:28
Speaker
An interesting question is, would I have put less than 1% on GPT-4 destroying everything? Probably not. I think it's unreasonable, at least given the things that I knew
00:43:48
Speaker
with GPT-3 and everything else, and the things that I didn't know, not having had a very close look at what was happening in GPT-4 training. Then yeah, I think it would have been unreasonable for me to be confident in less than 1% doom from GPT-4.
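To make the arithmetic in this exchange concrete, here is a small sketch; only the 1%, 50%, and per-10x framing come from the conversation, and the cumulative-survival extrapolation is my own illustration.

```python
# The per-scale-up point estimate is the geometric mean of the stated 1%-50% range.
import math

low, high = 0.01, 0.50
point = math.sqrt(low * high)          # ~0.071, i.e. the "7%" figure
print(f"geometric mean: {point:.1%}")

# Illustrative cumulative survival if further 10x scale-ups happen at that estimate.
for n in (2, 6, 10):
    print(f"P(survive {n} scale-ups at ~7% each) = {(1 - point) ** n:.0%}")
# At the 7% point estimate it takes roughly ten scale-ups for cumulative risk to
# pass 50%; at the upper end of the range (50% per scale-up) it takes only one or two.
```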
00:44:10
Speaker
Honestly, I can't really argue with you there. When I look at the, you know, when I got my first look at it, it had already finished pre-training and initial reinforcement learning. And this was, you know, the six months ago when they kind of finished the first version before any of the safety work. And obviously there was the, you know, the whole red teaming effort and everything else. It definitely hit me pretty hard that like, wow, this is a significant leap. And now you look at all the papers that have kind of come out characterizing it.
00:44:41
Speaker
in the wake of the official release, the thing that I kind of keep coming back to is we have these smooth curves, but then you have on individual behaviors, you have these sudden jumps. So the one they published in the technical report, which isn't such a big deal obviously, but maybe indicative of things that could happen in the future on more problematic dimensions is the hindsight bias.
00:45:06
Speaker
where it had previously been observed, I think by Anthropic, that bigger models suffered more from hindsight bias. And so it was an example of an inverse scaling law where the behavior is getting worse with bigger models.
00:45:20
Speaker
And then all of a sudden, with GPT-4, that problem is totally fixed. And there is no hindsight bias. And it basically just scores 100% perfection on those hindsight bias problems, which, by the way, are basically scenarios where you had a good bet available to you, you took the good bet, and you lost in an unlikely way. And so the question then is, should you have taken the bet?
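As a toy illustration of the kind of item described here (the numbers are mine, not from the benchmark): a positive-expected-value bet that happens to lose.

```python
# Toy example of the scenario described above (illustrative numbers only):
# a bet with positive expected value that happens to lose in an unlikely way.
p_win, win_amount = 0.95, 10     # hypothetical: 95% chance to win $10
p_lose, lose_amount = 0.05, 100  # 5% chance to lose $100

expected_value = p_win * win_amount - p_lose * lose_amount
print(expected_value)  # 4.5 > 0: taking the bet was the right call at decision time

# The trap being measured: after the unlucky 5% outcome actually happens, an
# outcome-based judge says "shouldn't have bet"; an expected-value judge still
# says "should have". Larger models had been getting this wrong more often,
# until the jump described next.
```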
00:45:46
Speaker
You know, people might say, with hindsight bias, well, I lost, so I shouldn't have done it, when in reality you actually had all good reasons to do it. So 3.5 was actually getting this wrong more often than 3, and more often than some smaller models. But then again, boom, somehow some unlock has happened in the course of training. And it probably never registered on the loss curve, which mostly looks smooth. But all
00:46:17
Speaker
of a sudden this behavior now is totally strong, I would say probably safe to say superhuman, in the sense that obviously we created these measures because some of us struggle with hindsight bias. And so yeah, you wonder, okay, if we do see those kinds of sudden jumps in capability in the context of GPT-4, like
00:46:40
Speaker
you know, another 10, 50, 100X compute scale up. You know, it's predictable that it will bring more of them, but it's very unclear like what exactly they would be. So
00:46:57
Speaker
one to 50% across these big scale up training runs. How do you think that plays out across different groups that might be running those processes? I don't know if you may or may not wanna go to kind of specific names, but like obviously there's a few leading groups that can plausibly scale up another one or two orders of magnitude right now.
00:47:23
Speaker
Do you think that's equally reckless for any of them to do, or do you think some have a better handle on how to do that responsibly than others? I mean, there are always some differences in various dimensions. So yeah, I mean, just hanging out at Anthropic feels materially different than hanging out at DeepMind. I hang out at both places, and a little bit at OpenAI.
00:47:50
Speaker
There's certainly much more of a safety culture at Anthropic. Does that justify risking everything, killing everyone? It's like, I don't think so. In my view, these are second-order effects: how safety-conscious your group is, compared to the fact that you're taking just massive risks with everyone's lives right now.
00:48:18
Speaker
So how do you think about, you mentioned two companies that I have serious questions about. I guess let's go DeepMind first. I've been kind of waiting for a Gato 2 to drop, and it seems like, you know, as I check my imaginary watch, it seems like that is probably due right around now, unless there's
00:48:42
Speaker
some sort of pause, or somebody's kind of thought better of doing a Gato 2, or, you know, maybe it just didn't work for some reason, but that seems unlikely because it seems like almost everything is, you know, quote-unquote working these days.
00:48:56
Speaker
Do you have a sense for what is going on at DeepMind? Demis published a Time article. It feels like a long time ago and much more reserved and moderated tone for a Time article than Eliezer's more recent one, but still pretty striking to see founder or CEO of DeepMind saying, we need to think about slowing down. Are they slowing down?
00:49:19
Speaker
I don't know. I don't have that much visibility into DeepMind. I have heard about them deliberately being more cautious about publishing things, which is an empirical thing that I haven't verified, is that actually true, but it feels that they are more careful now when it comes to publishing.
00:49:42
Speaker
In some ways, we are kind of in a lucky world in that all the big three labs are safety conscious, at least to the level of not dismissing the risks in the way that, for example, Yann LeCun or Andrew Ng are just completely dismissing the risks. It's not obvious that the world should be this way. For example, I've been
00:50:09
Speaker
praising Sam Altman for saying the things that he says about the risks, and he's been very explicit about the massive dangers that humanity is facing from AI. Another question is, to what degree does this safety
00:50:31
Speaker
consciousness actually constrain the actions of these companies, which have their own incentives as non-human optimization engines, and are necessarily somewhat
00:50:48
Speaker
hard to lead. The leaders of AI companies have a bunch of conflicting requirements that they want to satisfy, especially at DeepMind, where one big constraint is that they're not a company. They're a subsidiary of Google.
00:51:05
Speaker
In some ways, I'm sympathetic to them trying to navigate what must be a complex set of constraints. I think you're right. I found myself saying this too. It is easy to imagine people that are a lot more cavalier running the frontier projects. I'm thankful that there does seem to be a profound
00:51:35
Speaker
awareness and real seriousness of approach across the biggest companies. In some ways, I also feel like we might be in a lucky scenario in that language models are taking off, and yet they're very soft-edged AI. They also run slow.
00:52:00
Speaker
And I contrast that to what, you know, I think maybe Eliezer sort of had in mind 14 years ago or what DeepMind was, you know, seemingly closest to. If I had to say, you know, five years ago who was closest to AGI, I would have said DeepMind with all of their, you know, game-playing, you know, learning agents, all that kind of stuff.
AI Models: Language vs. Game-playing
00:52:21
Speaker
Those, notably,
00:52:24
Speaker
like AlphaZero and all that, right, achieved dramatically superhuman performance in obviously narrow domains. They also run really fast, and they're trained, if anything, in an even more
00:52:39
Speaker
alien way, where AlphaZero just plays itself, right, in all these games, and kind of learns from that, and doesn't even need to see the database of human games. And therefore, when it shows up with superhuman skill, it's also kind of an alien superhuman skill, and you get these dramatically surprising moves that, you know, no human would ever have made. In contrast, I feel like language models, you know, everything has pros and cons, right? They certainly have insane surface area.
00:53:10
Speaker
But their kind of softness and slowness does seem like it might be a real advantage relative to a more kind of hardened, faster agent type of model. How do you think about that? Do you think we are lucky with LLMs or am I just naive in my optimism?
00:53:29
Speaker
Yeah, I think like the big trend has been negative in terms of like going towards more and more black boxy and kind of uncontrollable training regimes. Like going from like expert systems to supervised to unsupervised learning. On the other hand,
00:53:44
Speaker
Yeah, there are definitely a few things that we sort of got lucky with. I would say the prime one is the fact that you actually do need a lot of compute to do the pre-training of large language models, which means that there are only a small number of organizations on the planet who can do that. And those training runs are potentially very
00:54:10
Speaker
conspicuous. Only half joking, I will say that the planet is now breeding alien minds in a way that aliens can see, because very plausibly you can see those energy expenditures from space.
00:54:26
Speaker
So that's one lucky thing about LLMs. And the other thing, yeah, I agree that the speed at which, or the slowness rather, at which they process things is an advantage. But this is a temporary advantage, I'm pretty sure, because human brains themselves offer a proof of concept that no, you don't have to be that slow. This is just pure inefficiency.
00:54:53
Speaker
The other thing is, once you have some feedback process where the LLMs will start developing AIs, those AIs might no longer be LLMs. What do you make of this current moment? This is something that has really just popped up and gone widespread.
00:55:12
Speaker
In just the last two weeks, but there's all these kinds of projects to create, you know, some, one of them is called BabyAGI. Another is called Auto-GPT. And essentially they're taking a language model, putting it in a loop and kind of giving it, you know,
00:55:29
Speaker
the ability to have a goal, delegate to itself, go through these thinking, reasoning, planning steps, then start to use tools. And again, getting around the hard limit of a context window through some sort of self-delegation. I'm struck by that as potentially the next convergence between those two paradigms in some ways. And it also seems to open up the potential for
00:55:59
Speaker
a kind of self-play reinforcement learning. These agents are not very good right now. And so if you go on Twitter, you'll see people being like, this is so amazing. Look what this thing can do. And then you'll see other people being like, it fails way too much. These are not useful. It's going to be a long time before they are useful. But I kind of think those people are wrong in the sense that
00:56:18
Speaker
This is the first language model paradigm that feels like relatively easy to evaluate in a fairly open domain because you can kind of know like did the thing, you know, book you the flight or whatever, right? Or did it just get hung up on some API error that it never solved? And it seems like they're going to learn pretty quick from this like massive little agent ization and, you know, kind of exploring a paradigm that's just been
00:56:48
Speaker
set up. How worrying of a development is that for you? There are several frames to look at this thing and these frames will give you almost a very opposing judgment about the situation. For example, one very positive frame to look at this is that it's great that society is poking
00:57:11
Speaker
rattling and poking these current models to see what are the extremes that you can push them to, because they are not very competent. By having those experiments with ChaosGPT and whatnot, we as a civilization will actually learn how bad things could be if things were scaled up. If you take ChaosGPT and put GPT-6 behind it,
00:57:39
Speaker
I claim you might not be safe at all anymore. On the other hand, you can take the frame that, when, I don't know, people like Yann LeCun, et cetera, have been saying there's nothing to worry about from AI because it's not going to be agentic, and even if it's going to be agentic, it would be just stupid to install some self-preservation or just
00:58:03
Speaker
bad goals. We're not going to do that. No, we absolutely are. A fraction of humanity has a death wish. That is a clear empirical demonstration that if something really bad can be done with AI, it will be done. Yeah, it's a big world and it's pretty easy. That's the other thing that's amazing.
00:58:26
Speaker
I think the first commit of the BabyAGI project, which I believe has been the number one trending project on GitHub over the last couple of weeks, alongside a couple of other very similar projects, was 105 lines of code.
00:58:42
Speaker
And that's all it takes. A couple clever prompts and kind of a loop. And you've got yourself a little agent. And it might not do much yet. But given the model, the barrier to creating some sort of semi-embodied, autonomous version of that is proving to be extremely low.
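To make the "language model in a loop" pattern concrete, here is a minimal sketch of that kind of agent (BabyAGI / Auto-GPT style); everything here is illustrative, and `call_llm` is a stand-in for whatever completion API one actually uses.

```python
# Minimal sketch of a self-delegating LLM agent loop: a goal, a task queue,
# and repeated model calls that both do work and spawn follow-up tasks.
from collections import deque

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (e.g. a chat-completion endpoint)."""
    raise NotImplementedError

def run_agent(goal: str, max_steps: int = 10) -> list[str]:
    tasks = deque([f"Make a plan to achieve: {goal}"])  # self-delegated task list
    results: list[str] = []                             # prior results, re-fed as context

    for _ in range(max_steps):
        if not tasks:
            break
        task = tasks.popleft()
        # Ask the model to execute the task given the goal and recent results.
        output = call_llm(
            f"Goal: {goal}\nRecent results: {results[-3:]}\nCurrent task: {task}\n"
            "Do the task, then list any follow-up tasks, one per line, starting with '-'."
        )
        results.append(output)
        # Treat lines starting with '-' as new sub-tasks the agent assigns itself;
        # feeding back only recent results is how such loops work around the
        # fixed context window mentioned in the conversation.
        for line in output.splitlines():
            if line.strip().startswith("-"):
                tasks.append(line.strip().lstrip("- ").strip())
    return results
```

The real projects layer things like vector-store memory and tool use on top of this skeleton, but the core is essentially this kind of loop.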
00:59:04
Speaker
So yeah, I don't think we're gonna, we will not be able to rely on the good discretion of users in the longterm. Certainly probably not more than for a few days with the release of any major new system. You mentioned a minute ago, you said three kind of leading groups. And I wanted to ask you how you think about like who is at the frontier and who is maybe going to,
00:59:30
Speaker
be at the frontier over the next few years. I assume the three you had in mind, you didn't specifically say OpenAI, but obviously they're in that group. DeepMind was the other that we were discussing, and then I am guessing you're thinking Anthropic would be the third. I think DeepMind and Google, they could be interchangeable. I hear that they have even publicly mentioned that they are somehow joining forces when it comes to this LLM race.
00:59:55
Speaker
So nobody else you feel like is close enough at this point. Like if it's a coordination problem of who actually, you know, who are you calling on to pause? I mean, you're calling on everyone to pause, but it sounds like it's really those three organizations that you're calling on for a pause.
01:00:10
Speaker
I think they are the so-called first tier when it comes to doing the most dangerous experiments. But of course, then you have the second tier. I think Eliezer has this related law, Moore's Law of Mad Science, that I forget exactly how it was framed. Every two years, the IQ needed to destroy the world drops by one or two points.
International AI Race Pressures
01:00:40
Speaker
As the hardware companies, mostly Nvidia at this point, throw more and more capable and cheap computing cycles at the market, the world-destroying capability will be in a larger number of hands.
01:01:00
Speaker
Interesting to think about how that evolves over the next few years. Do you think, if you imagine that, let's say we do enact a pause and then, meanwhile, Nvidia keeps shipping and people keep kind of doing fundamental research, which notably the letter
01:01:18
Speaker
explicitly goes out of its way to say, like we're not saying all AI research should stop or that, you know, you can't build your small models or fine tune things for your use cases and so on. So if we imagine a world where there is kind of a pause on these high-end experiments, but hardware continues to ship and generally speaking, like the field is not shut down. Do you have a guess for how many folks would be in the kind of
01:01:46
Speaker
You know, would be able to do a GPT five type project if they chose to in say five years.
01:01:54
Speaker
Five years is a super long time. Yeah. I'm with you on that, by the way. I don't even try to guess things five years out, so I shouldn't ask. Yeah. Two years. I mean, yeah, probably a dozen is something that I kind of just pull out of thin air. If I think about it, then, I mean, it's probably less than 100, more than 10, perhaps closer to 10 than to 100, is my answer for two years. And so some of those we can kind of fill in
01:02:21
Speaker
pretty obviously, right? Like, Meta seems like it would be a very natural candidate. Microsoft, I mean, they have the OpenAI partnership, but they certainly also have their own big research division. Presumably Apple, you know, has the resources to get into that game. Maybe even a Tesla, I mean, they're focused on other things at the moment, seemingly, but they also have the bot, you know, the Tesla Bot is going to need some sort of
01:02:48
Speaker
you know, fairly general intelligence to help it walk around and talk to people and pick stuff up. Any other kind of specific actors that you think would be likely entrants there? And then how do you think about kind of the international scene? Like, is there anything coming out of Europe? And obviously, then everybody starts to think about China, too. Yeah, I was going to say that, like, the obvious, like in a two-year perspective, yeah, you should kind of start also
01:03:15
Speaker
looking at non-US actors in Europe, where compute is going to be much more available, and then also China. I think there's one sort of
01:03:30
Speaker
obvious counter-argument that we're getting when it comes to calling for the pause, which is: what about China? My answer there is that currently China does not seem to be in the race, at least not as intensely
01:03:49
Speaker
as the leading US labs are in a race between themselves. Second, almost culturally, the Chinese seem to be much less keen on pulling a Bing and just unleashing an uncontrollable mind on their territory.
01:04:10
Speaker
I mean, only half-joking, I would say that, like, in China, if you do that as a tech CEO, that might get you disappeared. But yeah, in the longer run of two or more years, it will become more and more important to get some kind of international agreements, like we already have in nuclear, for example, in place also for compute.
01:04:35
Speaker
On the China front, I totally agree. I don't know why we would assume, given the posture that the Chinese government has taken toward technology over the last few years, that this is the technology that they're just going to throw caution to the wind on. They've functionally shut down their entire video game industry.
01:04:54
Speaker
and limited it to, as I understand it, just a few hours on a couple of weekend nights per week; that's all that video game companies, I think, can even operate in China now. We should fact-check me on that, but that's what I understand to be the case. Online learning, I understand, has also been mostly shut down. That one is fascinating too, because who could object to online learning, but my sense is that they intervened there
01:05:22
Speaker
on a state level because they feel like there is an unhealthy market dynamic developing where people are working too hard on these whatever standardized test measures and putting way too much resources in this and it's gone past the point of societal benefit.
01:05:40
Speaker
Yeah, I'm actually a chairman of an online language teaching company, an Estonian company, Lingvist. Oh yeah, I know it, of course. One thing that they learned in the Japanese market is that there's a massive
01:05:57
Speaker
English teaching market in Japan, but people don't care about learning the language. They just want to pass the tests in order to get better employment options. As a language teaching company in Japan, your job is not to teach the language. Your job is to get people good test scores, which is a very different job. I suspect that something like that was happening in China.
01:06:23
Speaker
Personally, I'm not ready to say by any means that I want to fully subscribe to Xi Jinping thought or live under the technology regime there. But it does seem like we're jumping to a conclusion way too quickly when we say, well, if we don't do it, they will, and that'll be worse. But leaving that aside for a second, just coming back to ourselves.
01:06:48
Speaker
The other big thing that has come out this last week or so is an apparent, and I take it probably legitimate, leaked fundraising document from Anthropic that says they are planning to raise five or so billion dollars and
01:07:10
Speaker
kind of see the next two, three years as super critical, planning to do next gen models. They're kind of immediately moving into a GPT-5 type scaling regime, it sounds like. The model itself supposedly is going to cost a billion dollars to train. And then they say, again, this is all according to reporting. I haven't seen the deck.
01:07:33
Speaker
But those that fall behind in the 2025-2026 time frame may never be able to catch up. So when I heard that, I was like, man, that does not sound like a company that's about to pause. It does sound like a company that's kind of in the race now. How are you viewing that news? I don't know if you have any inside view, and I obviously wouldn't ask you to share anything that you shouldn't. But
01:08:04
Speaker
What should the public make of that update from supposedly the most safety-centric leading lab that there is? Yeah. To the degree that this thing was accurate, which again I can't comment on, because I'm an investor in Anthropic and a board observer as well, to that degree it is
01:08:31
Speaker
strong evidence that there is a massive race happening between the US companies. That is going to get us killed. Can we stop that race, please? What exactly is the thought process? Knowing that this group came from OpenAI, the high-level description that I've heard is that they felt it was becoming too commercial and wanted to be more focused on a safety-first type of approach. That's been two years or whatever.
01:09:00
Speaker
Now this. The only thing I can come up with is that people must be thinking, we'll do a better job than they will, so we should do it before they do, because if we don't, then they'll do it, or they'll do it worse. And it seems like maybe everybody's thinking that; I kind of model OpenAI to some degree that way as well. Is that how you think about the decision makers, or what do you think they're thinking?
01:09:27
Speaker
So I think one very informative public piece of information is the Future of Life Institute podcast with Dario and Daniela Amodei, which was during COVID, about a year ago, in early 2022.
01:09:43
Speaker
Basically, they explain what Anthropic's approach is, and as far as I know it is still true. What they said in that podcast is basically: train the frontier models and then do alignment in an empirical fashion while having access to frontier models. The claim is that, exactly because of these emergent capabilities,
01:10:10
Speaker
there is only so much you can do using non-state-of-the-art models, because, in some ways,
01:10:22
Speaker
as the models get more competent, they also become easier to interact with. The fact that we have language models in the first place, in some ways, I think you pointed it out, can also serve as a good interface when it comes to alignment. That is kind of the
01:10:44
Speaker
Anthropic thesis: to have these latest models and then basically use those to do state-of-the-art alignment that is empirically tied to the actual objects that we have. Now, of course, the big question there is how many generations you can do that for, because the pre-training is a largely uncontrolled, unsupervised process. How many
01:11:14
Speaker
generations can we do the pre-training safely, not to mention things like leaks of the resulting weights to elsewhere? I think it's a genuine dilemma. In some ways, I think the Anthropic framing and perspective makes a lot of sense, because indeed you can get more useful alignment work done with the latest
01:11:44
Speaker
crop of large language models. On the other hand, each training run imposes some risk, in my estimate a 1% to 50% risk of complete annihilation of the planet. How do you navigate that trade-off? It's not obvious, and I don't think there's enough thinking going into that trade-off. Again, it feels like there's some sort of game theory element here where it's like,
01:12:10
Speaker
it seems like they're doing it because they believe someone else is doing it, still, on some level, right? If they roughly share your view and they're like, well, the only way we can do the alignment work is if we have access to the latest models, then a good counter-argument would be: if it were true that nobody else is going to create these if you don't, then maybe you should just sit tight too, and then we can all study what we have. We don't need another frontier just yet. So it still seems like it is fair to say there that
01:12:40
Speaker
On some level, they feel like their hand is forced, like they can't not do it because they're either in the game or they're out of the game, but the game will continue regardless. It seems to be the model that's kind of implicit in that decision making.
01:12:52
Speaker
Yeah, there are a few models that are consistent with the evidence, but that definitely is one model: because of these race pressures, they're going to feel that if they don't have access to the latest generation of models, their prospects of actually doing alignment are significantly hampered. So that's the positive way of framing things.
01:13:16
Speaker
The letter even, I think, explicitly mentions that one goal of this letter is to call for a timeout in this race, which indeed has a non-trivial game-theoretic element.
01:13:29
Speaker
I think your most recent fundamental AI research investment is in Conjecture. If I understand correctly, and I may be wrong on this, I don't get the sense that they are planning to try to train a GPT-5 in the short term. How are you thinking about their contribution, their strategy? It seems like they take a different approach, where they don't feel like they have to create the frontier models in order to do something useful.
01:13:54
Speaker
Yeah, I am much more positive about Conjecture's approach in terms of the safety-capabilities trade-off. They still train language models, but they do not train the latest language models. Their goal is to compose less capable language models in a way that makes for a more predictable structure, so you don't risk the world during the pre-training phase,
01:14:22
Speaker
and have, in some ways, a kind of more old-school approach to AI rather than this summon-and-tame approach. Yeah, interesting. We just did an episode with Andreas and Jungwon from Ought, and they have a very similar kind of outlook too.
01:14:42
Speaker
composition of models, the traceability of all the logic, atomization of the different decisions and operations, to try to build some sense of designed-in control into the system from the beginning. It sounds like Conjecture is on a similar line of thinking.
01:15:04
Speaker
Yep. And I think that if we manage to get the pause in this game-theoretic race, assuming it is a game-theoretic race (there is also a frame that says, no, these are just apocalyptic cults trying to end the world, which is the least charitable frame), but if we could get a pause, then I do think that there's this
01:15:31
Speaker
almost automatic pressure to get more competence out of the minds that we already have trained. Part of it is just to have a better understanding, a better composition of the capabilities that we already have.
01:15:46
Speaker
Because one important bit, as a lot of your listeners probably know, is that you have this training phase that is much, much more expensive than actually the inference phase. So once you have spent a lot of compute on training, once you finish the training, you have a lot of ability to run many, many instances of the minds that you just trained.
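For a rough sense of scale, here is a back-of-envelope sketch using the common approximations of about 6 x parameters x tokens FLOPs for training and about 2 x parameters FLOPs per generated token for inference; the model size and token counts below are hypothetical, chosen only to illustrate the asymmetry.

```python
# Back-of-envelope comparison of one-time training compute vs. per-token
# inference compute. Uses the common approximations:
#   training FLOPs  ~ 6 * params * training_tokens
#   inference FLOPs ~ 2 * params per generated token
# All numbers are hypothetical, purely for illustration.

params = 100e9          # a hypothetical 100B-parameter model
train_tokens = 2e12     # a hypothetical 2T-token pre-training corpus

training_flops = 6 * params * train_tokens       # spent once, up front
inference_flops_per_token = 2 * params           # spent per generated token

# How many generated tokens the training budget would cover at inference time:
equivalent_tokens = training_flops / inference_flops_per_token

print(f"Training (one-time):       {training_flops:.1e} FLOPs")
print(f"Inference (per token):     {inference_flops_per_token:.1e} FLOPs")
print(f"Training budget ~= serving {equivalent_tokens:.1e} generated tokens")
```

On these illustrative numbers, the compute that trains the model once would cover trillions of generated tokens, which is why a finished model can be copied and run as many instances at comparatively low marginal cost.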
01:16:11
Speaker
So let's talk about the letter. Obviously, we've covered a decade of your thinking and investment on this subject. And now we get to the point where, okay, GPT-4 is released. It is closing in on human expert performance in a great many domains. It does seem to me like it's quite
01:16:34
Speaker
unclear what the next generation of that would bring. Obviously, you guys are thinking something very similar there. So how did this letter project come about? How did you settle on a six-month pause? What was the process like of trying to bring a broad coalition together? And was this something that you actually thought might happen, or is it intended to be a conversation starter? How do you think about this project?
01:17:04
Speaker
Good questions. I remember we had the FLI catch-up call on
01:17:12
Speaker
on March 21st, so, I mean, less than a month ago. And we had already observed that there are significant voices in the public. I mean, Ezra Klein's article in the New York Times had come out, where he kind of
01:17:38
Speaker
explicitly compared the current AI race to summoning minds. And I think Harari's article also was published around that time, where he was concerned about LLMs plugging into the operating system of civilization, which is about language and which operates on the language level.
01:18:04
Speaker
So, and then there were numerous private discussions that were very concerned about the current race, including with people in the labs themselves. So we thought, okay, perhaps one valuable thing that FLI could do is to try to create some kind of common knowledge that, yes, a bunch of people are worried, and now basically
01:18:31
Speaker
create a situation where those people know that other people are also worried, and the other people know that the other people know, et cetera. And we thought, okay, we have some experience with open letters, so perhaps we should try to draft one up. And of course, our previous open letters had something like 1,000 or fewer signatures. So one thing that we were just completely
01:18:57
Speaker
blown away by was the reaction. It was almost immediate: in the first few days, we got tens of thousands of signatures, and we had technical problems because of that. When it came to drafting the letter, there were multiple considerations. One consideration was just speed. Clearly, if we had had several weeks to work on it,
01:19:26
Speaker
then it would have been much better. But it was done very much in the spirit of, let's not let the perfect be the enemy of the good, and let's draft up something that feels
01:19:42
Speaker
okay to put out. Indeed, the six-month number was one thing that was put in and then taken back out across different versions of the draft, based on the following
01:19:58
Speaker
argument. One question we get is: why six months? What can you do with a six-month pause? What would a six-month pause buy us? One important thing that people don't necessarily realize a six-month pause would buy is confidence that we can pause.
01:20:23
Speaker
And in that sense, it's better to have a proposal that calls for a six-month pause fail than a proposal that calls for an indefinite pause fail. Because in the indefinite-pause situation, people go, oh yeah, if it had been six months, of course we could have done it, but because it's indefinite, nobody would pause indefinitely, right? So that was the final reason we thought, okay, let's put in six months and see what happens.
01:20:51
Speaker
Yeah, I thought Zvi had some great analysis of this early on. Somebody who also works with a speed premium, which I appreciate. He had some, I thought, ultimately pretty simple but still very wise analysis, which is just: if you feel like you're going to need coordination in the future, it makes sense to start building it now.
Calls for AI Development Pause
01:21:14
Speaker
And if you can only get a little bit going at first, then you kind of take what you can get and you hope to build on that foundation.
01:21:21
Speaker
The other really big consideration that fed into the open letter is that there is a lot of reasonable discussion that happens in the labs, and even between the racing labs, about the need to coordinate, the need to pause, the need to be careful, and people have been public about it, et cetera. But there's always one missing component. The component is: when.
01:21:49
Speaker
I think Scott Alexander wrote a good article about OpenAI's "Planning for AGI and beyond" statement, where he also pointed out that, yeah, they're saying a lot of nice things, which is great. I mean, honestly, it's good. But they don't say when. And this should make us a little bit suspicious, at least. So therefore, one of the rationales for the letter is:
01:22:15
Speaker
How about now? So, I want to go through some additional questions that I've seen floating around in the discourse, but what is your sense of how the reaction has been? Obviously there's been a ton of signatures. Even one notable CEO of a company that I would say is close to a leading lab, Emad from Stability, signed onto it. I thought that was fascinating.
01:22:42
Speaker
As far as I know, it's been a pretty quiet reaction from the three main groups, the ones who are going to either pause or not pause on something beyond GPT-4. So what do you make of the reaction from them? There have been no public statements as far as I know. And has there been any kind of private or confidential reassurance? I mean, has there been any reaction from the leading labs?
01:23:13
Speaker
Yeah, I think Sam Altman said publicly that he appreciates the spirit of the letter or something, but then there was of course a "but." I don't even remember what the "but" exactly was.
01:23:29
Speaker
But yeah, DeepMind definitely hasn't said anything, nor has Anthropic, although just today I read Jack Clark's newsletter, Import AI, where he mentions the letter and also says why he didn't mention it last week. So there are starting to be some reactions, but
01:23:54
Speaker
I also want to be careful here in the sense that I don't want to create some self-fulfilling prophecies. I would say that the possibilities are definitely very much open at this stage for the letter somehow catalyzing an actual pause, but it's double-digit uncertainties both ways. Cool. That's actually a more positive response, albeit minimal,
01:24:23
Speaker
than I had expected or even understood. I haven't seen that Sam Altman tweet. I'll have to dig into that. Yeah, I don't think it was a tweet. I hope I didn't just see it in my dreams or something, but I think it was actually a comment in some news article about the letter.
01:24:41
Speaker
Humans, too, suffer from hallucination at times, so we'll fact-check ourselves. Indeed we do. So okay, you guys put this out there, and now people start to say all kinds of stuff, right? The objection that I think suggests people didn't finish reading the letter, but which is definitely worth giving you a chance to respond to, is: all the letter asks for is a pause, it doesn't ask for anything else, so it's stupid, because what are we supposed to do?
01:25:07
Speaker
Again, you can read the letter for yourself; there are definitely some things that it calls for, but obviously it's a big-tent, committee sort of document. Zeroing in on your own priorities: what do you think are the most important, tangible, concrete things that we could do over, say, a six-month time frame, such that maybe even you would be comfortable ending the pause?
01:25:36
Speaker
I guess, top priorities, and if those top priorities happened, would that be enough in your mind to then end the pause on, say, training GPT-5? So, one thing that will sort of happen automatically is that we will get more experience with the crazy situation that we're now in, which is, I think as Yoshua Bengio put it, that now we have
01:26:00
Speaker
AIs out there on the internet that can pass the Turing test. That is a novel situation. I think we will be much smarter six months from now than we are now, because six months is a long chunk of time when it comes to living with aliens on your planet. I don't know what we're going to learn, but hopefully we will just be smarter in the autumn than we are in the spring.
01:26:30
Speaker
But when it comes to more concrete things, I mean, Neel Nanda, a researcher in the UK who has worked with DeepMind as well as Anthropic, has this blog post called 200 Concrete Open Problems in Mechanistic Interpretability. There's so much work that can be done with even the previous generation of models,
01:26:56
Speaker
not to say anything about the latest generation. There's so much alignment work in just opening up those black boxes and trying to understand what makes them tick.
01:27:10
Speaker
How can we get any guarantees about what the next generation is going to do? So again, with research and with lived experience, I hope we will be in a much better place, but realistically I will just say that we're going to be in a better place six months from now. That doesn't sound to me like you would expect, just from general improvement, to get to a point where you would then say,
01:27:40
Speaker
okay, let's end the pause and do the next generation. Is there anything? I mean, correct me if I'm wrong on that, but maybe framing the question slightly differently: if we abstracted away from the six-month timeframe, are there concrete structures of some sort that we could put in place that would, on any timeline, give you
01:28:05
Speaker
enough confidence or reassurance that you could say, okay, it seems like we're now in a decent enough place that you would personally be comfortable going ahead with a next-generation training run?
01:28:15
Speaker
Still, the big question is: how much risk are we okay with taking? I'm not saying this risk should be zero, because there is always this background level of extinction risk. We could be hit by an asteroid. There's a certain probability that this call will not end because the planet will be hit by something, probably not an asteroid, because we would see that coming, but a comet, which is harder to see, I understand.
01:28:39
Speaker
So we shouldn't expect to get the risk to zero, but perhaps, if there is this pause and the associated realization that
Ensuring AI Safety and Regulations
01:28:50
Speaker
these experiments are considered reckless by society, hopefully that will create some kind of incentive gradient for the companies themselves to figure out how to run them in a more responsible and more legible manner.
01:29:04
Speaker
So I'm interested in this project at ARC, the Alignment Research Center in Berkeley led by Paul Christiano, called Evals: evaluating models on what things they could in principle do, and what things they're still kind of incompetent at.
01:29:27
Speaker
You can have multiple or different opinions about this, but there's a generalization of it: is it possible to replace the current blind approach to training runs with something where, at every iteration, you do some tests that would give you some kind of guarantees about the alien that you're summoning?
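To make that concrete, here is a minimal, hypothetical sketch of what an eval-gated training run could look like: training proceeds in bounded stages, and scaling only continues while dangerous-capability evaluations stay under agreed thresholds. The eval names, thresholds, and the train_stage / run_eval helpers are illustrative assumptions, not ARC's actual evaluation suite.

```python
# Illustrative sketch of gating each training stage on capability evals.
# The thresholds, eval names, and helper functions are hypothetical.

DANGER_THRESHOLDS = {
    "autonomous_replication": 0.01,   # max tolerated score on each eval
    "deception": 0.05,
    "cyber_offense": 0.02,
}

def run_eval(model, eval_name: str) -> float:
    """Placeholder: return the model's score on a dangerous-capability eval."""
    raise NotImplementedError

def train_stage(model, compute_budget: float):
    """Placeholder: continue pre-training for one bounded chunk of compute."""
    raise NotImplementedError

def staged_training(model, stage_budgets: list[float]):
    for i, budget in enumerate(stage_budgets):
        train_stage(model, budget)  # one bounded step of scaling
        scores = {name: run_eval(model, name) for name in DANGER_THRESHOLDS}
        failures = {n: s for n, s in scores.items() if s > DANGER_THRESHOLDS[n]}
        if failures:
            # Halt before capabilities cross the agreed line, rather than
            # discovering them only after the full run has finished.
            print(f"Halting after stage {i}: thresholds exceeded: {failures}")
            return model, False
    return model, True
```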
01:29:53
Speaker
Yeah, there are a few other things that give me a little bit of hope that they might be developed on a technical level during that time. There was an interesting paper that came out, I think this was pretty small scale, I believe it was at Anthropic, where they experimented with doing the pre-training on a human preference dataset from the beginning, as opposed to just kind of random,
01:30:21
Speaker
decent quality text off the internet. And it seemed that the upshot of that was that you never had quite as alien of an alien in the first place, you know, on measures of like
01:30:35
Speaker
harmfulness and helpfulness, for example. It stayed much closer throughout the training process to that final post-RLHF or RLAIF state that we know now, and never dipped as far into
01:30:54
Speaker
the sort of very strange, alien territory of general pre-training. So something like that, changing the dataset from the get-go, seems interesting.
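As a rough illustration of that kind of idea, here is a minimal, hypothetical sketch of conditional pre-training: each document in the corpus is tagged with a control token based on a preference or harmlessness score, so the model sees the distinction from the start and can be steered toward the preferred distribution at inference time. The token names, threshold, and score_document helper are assumptions for illustration, not the setup of the paper being described.

```python
# Illustrative sketch of tagging pre-training documents with preference-based
# control tokens (one variant of "pre-training with human preferences").
# Token names, threshold, and the scoring function are hypothetical.

GOOD, BAD = "<|good|>", "<|bad|>"

def score_document(text: str) -> float:
    """Placeholder: a preference / harmlessness score in [0, 1], e.g. from a reward model."""
    raise NotImplementedError

def tag_corpus(documents: list[str], threshold: float = 0.7) -> list[str]:
    tagged = []
    for text in documents:
        token = GOOD if score_document(text) >= threshold else BAD
        tagged.append(token + text)   # prepend the control token to each document
    return tagged

def make_prompt(user_prompt: str) -> str:
    # At inference time, condition on the "good" token so generation comes
    # from the preferred part of the learned distribution.
    return GOOD + user_prompt
```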
01:31:10
Speaker
Another recent thing that jumped out at me: somebody, I think just in the last week or two, published a result where they showed they were able to increase the size of the model progressively throughout training. It was presented mostly as an efficiency thing, which goes to show that all these things have pros and cons. You know, everything is dual use.
01:31:30
Speaker
But the net savings on the training FLOPs was something like 50 percent, which is obviously significant enough for people to take notice, even for purely commercial reasons. But then it seemed to me like, boy, there's something really interesting there, where you're creating kind of this
01:31:45
Speaker
seed, this kernel, truly a little baby version of the model that is actually just a lot smaller in terms of its parameters. And then you're able to layer on more and more parameters as you go through the whole training process. It seems like maybe there would be a way to zoom in on that small thing and get something working right in the small,
01:32:09
Speaker
presumably much more interpretable, version first, before growing the model itself and just having one big tangled knot of parameters. Now, I guarantee that it will not guarantee safety in the long run, but it might just be enough to reduce the probability of destroying everything in the next generation, so we can actually do one more step in a way that is much more responsible than the current default.
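Purely as an illustration of that growth idea (a toy sketch, not the specific result being described), here is a minimal setup in which new residual blocks are appended to a small model partway through training, so the early, smaller model could in principle be inspected before it is grown. The dimensions, growth schedule, and dummy data are hypothetical.

```python
# Toy sketch of progressively growing a model during training.
# Dimensions, schedule, and data are hypothetical, for illustration only.
import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.LayerNorm(dim), nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim)
        )
    def forward(self, x):
        return x + self.net(x)  # residual: a fresh block starts near the identity

class GrowingModel(nn.Module):
    def __init__(self, dim: int, n_blocks: int, n_classes: int):
        super().__init__()
        self.dim = dim
        self.blocks = nn.ModuleList(Block(dim) for _ in range(n_blocks))
        self.head = nn.Linear(dim, n_classes)
    def forward(self, x):
        for block in self.blocks:
            x = block(x)
        return self.head(x)
    def grow(self, n_new: int):
        new_blocks = [Block(self.dim) for _ in range(n_new)]
        self.blocks.extend(new_blocks)
        return new_blocks

dim, n_classes = 64, 10
model = GrowingModel(dim, n_blocks=2, n_classes=n_classes)   # start small
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(300):
    x = torch.randn(32, dim)                         # dummy data
    y = torch.randint(0, n_classes, (32,))
    loss = loss_fn(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step in (100, 200):                           # hypothetical growth schedule
        added = model.grow(2)                        # append blocks, keep training
        opt.add_param_group({"params": [p for b in added for p in b.parameters()]})
```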
01:32:40
Speaker
So one thing that you have not gone into too much here is regulation, government intervention. I mean, the letter does call for that: if there can be no voluntary pause, then governments should step in and insist on one. But maybe you just haven't gotten to it yet; I'm not hearing anything from you that says government can come in and set up a regime that's going to do much for us. Do you have any hope in a government regulatory approach?
01:33:05
Speaker
Oh yeah, I do. Glad that you asked. This also reminds me that Google, in the form of Sundar Pichai, did react to the letter. There was a podcast, Hard Fork, New York Times podcast, where he said that the discussion started by the letter is great, but he thought that government intervention is necessary here, that it's not enough to rely on the labs self-regulating.
01:33:34
Speaker
I think one silver lining that I already mentioned is that this pre-training is super expensive and super visible. Therefore, the metaphor that I've been using is nuclear control. In the nuclear domain,
01:33:57
Speaker
you also have these two phases. One is the hard step of enriching uranium, which is super energy-intensive as well as much more visible than the second step, which is harder to control: the proliferation of nuclear-grade material. Here we have a similar situation, where the pre-training actually happens in just a few places and is visible to governments,
01:34:25
Speaker
whereas the proliferation is already much harder to control. My model is, and as far as I know it resonates with thinking in the labs themselves, that if you want to have some constraints
01:34:43
Speaker
on the AI trajectory, then intervening on compute, having some kind of compute governance, is probably a great place to start. In fact, the CHIPS Act and the export controls on China are already things that are happening, and in some ways they make the problem much easier, although far from solving it. I think that makes sense. I mean,
01:35:13
Speaker
Certainly at the moment, the compute requirements are high. I do wonder, what do you make of things like the diffusion of just language model know-how, proliferation? I think we might have a guest coming up on the podcast that is building a
01:35:33
Speaker
decentralized GPU cluster potentially with some sort of blockchain governance where my M1 or M2 MacBook Air can contribute to a cluster. How long do you think we have before that barrier of just like mega compute resources goes away? Possibly because of these virtual clusters or possibly because of further algorithmic breakthroughs or just
01:36:01
Speaker
Model leaks could be another thing. How long do you think that holds? I mean, I have to refer back to Eliezer's Moore's Law of Mad Science: every year or couple of years, destroying the world becomes easier. So there's that. But also, if we do not get a pause, I do think the world would more likely be destroyed by one of the big labs,
01:36:27
Speaker
simply because it's much easier to train frontier models in a big data center than in a distributed manner. That said, if the pause does happen, then yes, we have to worry about things like proliferation, hackers
01:36:49
Speaker
stealing the weights and then doing experiments like ChaosGPT, but with a competent AI, stuff like that. I think one of the, I don't know if it's an argument or whatever, but maybe one of the things that's hardest to argue with, in my experience, for those objecting to the letter or to the concept of a pause, is the sense that
Optimism and Risks of AI's Future
01:37:15
Speaker
someone might say: you're not really giving me anything to hope for here, right? You're just telling me that every generation it gets easier for the world to be destroyed. We've talked about buying ourselves some time here and there, but we haven't really heard much of a "here is the path we can take to safety." So why bother pausing if we don't even have a sense of where we're going?
01:37:40
Speaker
Do you have any hopes for concrete paths to safety that you would try to inspire that kind of person with? Yeah. I mean: sorry, humanity, you do have cancer.
01:37:53
Speaker
You might be cured of it, but currently it doesn't look good. I'm not going to lie: the prognosis is bad, but it's not hopeless. And also, the massive silver lining is that if we do manage to survive the cancer, the future is going to be amazing. So in some ways, the expected value of the future
01:38:20
Speaker
is not bad. It's just that the odds of survival are bad. But if we survive, the life of the world, of the universe, could potentially be unfathomably better than it is now. So in a sense, we are holding a lottery ticket, and it is in some ways in our control to improve the odds. And so that's what I'm doing.
01:38:49
Speaker
Well, that's probably about as good a bottom line as we could hope for for this conversation. So I want to thank you for spending the time with us. I do have a couple of real quick-hitter, just-for-fun questions that I usually end on, if you have an extra second for those. We touched on this earlier.
01:39:08
Speaker
Any applications, aside from the obvious usage of the core language models, that you are personally just finding delightful or useful, that you would recommend people check out? No, I've just been too bandwidth-
01:39:25
Speaker
limited to tinker with the language models myself. I've done a little bit, but I'm trying to find coders at this point to delegate a bunch of my projects to. Some of them might involve language models. Fair enough. You're in the majority on that answer. Most people are just using a few things.
The Future of Neuralink
01:39:49
Speaker
Then, second: let's imagine a world where we're here in a couple of years and Neuralink has been deployed to a million people. In this scenario, you are well, so you don't need it for
01:40:07
Speaker
restoring any functionality, but if you were to get a Neuralink implant in your head, it would give you the ability to essentially transmit your thoughts to devices. So you would have effectively thought to text or thought to UI control. Would that be enough for you to be interested in getting a Neuralink implant?
01:40:28
Speaker
It depends so much on the details: how reversible is the procedure, what are the risks, and what is the demonstrated upside. Will I become a better dancer as a result? Yeah, on that one, in their show-and-tell they did show an animal where they were creating motor control through the Neuralink. But yeah, I think it was a long way from improving your dancing skills.
01:40:56
Speaker
So I hope you are dancing for many years to come. Jan Talon, thank you so much for spending this time with us. We appreciate you being part of the Cognitive Revolution. Thank you very much.