
Will AI Companies Respect Creators' Rights? (with Ed Newton-Rex)

Future of Life Institute Podcast

Ed Newton-Rex joins me to discuss the issue of AI models trained on copyrighted data, and how we might develop fairer approaches that respect human creators. We talk about AI-generated music, Ed’s decision to resign from Stability AI, the industry’s attitude towards rights, authenticity in AI-generated art, and what the future holds for creators, society, and living standards in an increasingly AI-driven world.  

Learn more about Ed's work here: https://ed.newtonrex.com  

Timestamps:  

00:00:00 Preview and intro  

00:04:18 AI-generated music  

00:12:15 Resigning from Stability AI  

00:16:20 AI industry attitudes towards rights 

00:26:22 Fairly Trained  

00:37:16 Special kinds of training data  

00:50:42 The longer-term future of AI  

00:56:09 Will AI improve living standards?  

01:03:10 AI versions of artists  

01:13:28 Authenticity and art  

01:18:45 Competitive pressures in AI 

01:24:06 Priorities going forward

Transcript

AI and Copyright Concerns

00:00:00
Speaker
It said: we think that training on people's copyrighted work without a license is fair use. And that just goes against everything I stand for. No one trained on copyrighted work without a license for a very long time.
00:00:14
Speaker
If you trained a commercial model, everyone knew it was illegal. What this whole copyright fight has shown me, maybe more than anything, is that a lot of the people at the forefront of building this stuff honestly seem willing to trample on people's rights in the pursuit of personal gain and profit. Trying to shift the Overton window, trying to move the needle towards any outcome that is fairer for creators than the current circumstances, I think is really important.

Introducing Ed Newton-Rex

00:00:45
Speaker
Welcome to the Future of Life Institute Podcast. My name is Gus Docker, and I'm here with Ed Newton-Rex. Ed, welcome to the podcast. Hey, great to be here. Fantastic. Could you tell us a little bit about your background?
00:00:58
Speaker
I'm a composer, a classical composer, and I've worked in AI for a long time. When I left university, I started an AI startup, what we would now call a generative AI startup, but this was in 2010.
00:01:14
Speaker
So we didn't call it that back then; we used to call these things creative AI startups. It was a music generation startup. This was long before the invention of the transformer.
00:01:29
Speaker
I started off by hand-coding rules to compose music, and eventually we replaced that with recurrent neural networks. We built it out, but it took eight or nine years.
00:01:43
Speaker
I'd say we were probably about 12 years too early to the

AI Music Technology Evolution

00:01:47
Speaker
generative AI trend. We ended up selling the company to ByteDance, the owner of TikTok. I went there and took on a product role working on the For You feed, which was very interesting.
00:02:04
Speaker
A totally new kind of thing. I ended up going via Snapchat to a company called Stability AI, a big AI company in the UK, running the audio generation team there. So basically doing what we'd done at Jukedeck, 12 years later.
00:02:23
Speaker
How had the tech improved in those 12 years? How was it different working at Stability? It improved leaps and bounds. When we were doing this in 2010, really, we were making things up as we went along.
00:02:38
Speaker
You know, generative music had been around for a while. I don't think there had been any startups in the space, but people had been working on it academically since the 1950s. You basically had things like rule-based systems, classical AI, Markov chains, hidden Markov models, these kinds of things.
00:02:57
Speaker
And the tech was really rudimentary. We were basically composing music note by note in a symbolic fashion. We created the notes and the chords, and we then used an automated production system we built to turn that into audio. It was basically the back end of a digital audio workstation that we built.
00:03:19
Speaker
So the actual AI element was symbolic: it was creating notes and chords on a page. And that's a totally different approach from the cutting-edge models today. When I joined Stability, the models were generating raw audio samples, which means a lot more variety can come out of these outputs. They're much, much more powerful systems now.
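To make the symbolic approach concrete, here is a minimal sketch of note-by-note generation with a first-order Markov chain, one of the classical techniques mentioned above. The note set and transition probabilities are invented for illustration; this is not Jukedeck's actual system.

```python
import random

# First-order Markov melody generation: the model emits symbolic note names,
# not audio. Transition probabilities here are invented for illustration.
TRANSITIONS = {
    "C": {"D": 0.4, "E": 0.3, "G": 0.3},
    "D": {"E": 0.5, "C": 0.3, "F": 0.2},
    "E": {"F": 0.4, "G": 0.4, "C": 0.2},
    "F": {"G": 0.6, "E": 0.4},
    "G": {"C": 0.5, "A": 0.3, "E": 0.2},
    "A": {"G": 0.7, "F": 0.3},
}

def generate_melody(start="C", length=16):
    """Walk the transition table to produce a note sequence."""
    melody = [start]
    for _ in range(length - 1):
        options = TRANSITIONS[melody[-1]]
        notes, weights = zip(*options.items())
        melody.append(random.choices(notes, weights=weights)[0])
    return melody

print(" ".join(generate_melody()))
```

A system of the kind described would then hand a note sequence like this to a production back end to render as audio.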
00:03:48
Speaker
And that's just in the nature of these models getting bigger, and new innovations like the transformer and diffusion coming along.
00:04:00
Speaker
So yeah, it's a totally different world. But what's interesting is that the product visions aren't actually that dissimilar. What people are doing now with AI music is really the same kind of product we were trying to design back in 2010. It's just that the tech has got a lot better.
00:04:18
Speaker
And for listeners who haven't heard AI-generated music, I can say that it's gotten incredibly good. I'm not a musical person, but I have been fooled several times by listening to an AI-generated song and thinking that it was actually produced by a team of humans.
00:04:37
Speaker
So it's gotten really advanced. Yeah. I mean, AI music has come on a long way.
00:04:46
Speaker
I think it's reasonable to say that, in many instances, it's now pretty indistinguishable from human-composed music. I think lots of people wouldn't know the difference hearing a couple of new songs if they didn't know the artists involved. It's not necessarily true in all styles of music, but it's broadly true. Interestingly, it's still not true in classical music,
00:05:09
Speaker
which makes sense, right? These models are optimized to create music that's popular now, to create pop music. But pop music, including with vocals and lyrics, you can generate really convincing stuff now, which is already out there in the market competing with human musicians, and which I think is a big problem.

Impact on Musicians and Industry

00:05:33
Speaker
So yeah, it's pretty indistinguishable. But classical music, it turns out, is harder. The intricacies of something like a huge work by J.S. Bach from hundreds of years ago, that's not yet doable.
00:05:54
Speaker
So maybe, I don't know, classical music is somehow safe. But for most of the market right now, AI music is absolutely here.
00:06:05
Speaker
Why is it that we still have human musicians, then? If AI-generated music is indistinguishable from human-produced music, why hasn't this wave rolled over the music market yet?
00:06:21
Speaker
Well, it's still very early, right? Even though there are people like me who've been in this field for 15 years now, the generative AI wave really capturing the public consciousness started in 2022, and AI music entering the public consciousness really started right at the end of 2023, into 2024.
00:06:44
Speaker
So we're very early, and a lot of people aren't noticing. The impacts are there, but they're invisible to a lot of people at the moment. For instance, already there are reports coming in from around the world of AI music being used in huge quantities to replace human-composed music in stores and retail outlets.
00:07:13
Speaker
And these are foreign countries, places that people thinking about this maybe aren't visiting that much. And it's happening already, a huge amount. So I think we're in the very early stages.
00:07:27
Speaker
We don't yet have the reporting, which I think we will have, that will actually show the extent of what's already going on in mid-2025. That's one reason. There's definitely another reason, which is that human musicians will survive, right? A category of human musician will be fine.
00:07:56
Speaker
Everyone's going to be affected to a degree. But to put it very simply, Taylor Swift will emerge from the AI age relatively unscathed.
00:08:09
Speaker
But the problem is that most people are not Taylor Swift. And so there's this argument from big AI advocates. And I'm not someone who is against AI per se at all.
00:08:22
Speaker
But I'm also not what I call an AI booster, someone who just relentlessly tries to elevate any and every AI advance. And there are a lot of those people at the moment, right?
00:08:35
Speaker
I think an argument that a lot of these people put forward is: Taylor Swift's going to be fine, musicians are going to be fine, there are still going to be pop musicians. And of course, people are still going to want to go and see live music; they're still going to want to connect with these musicians. But the issue...
00:08:53
Speaker
is already with the long tail of how musicians make money. And that affects the huge majority of musicians, the ones who are not household names.
00:09:04
Speaker
It's a massive problem for them already. It does also affect the household names to an extent, in the long tail of how they make money. And it's these hidden areas of the music industry, and the creative industries more generally, which are a massive, massive part of those industries, where you're already seeing the rug being pulled out from under people's feet.
00:09:26
Speaker
And I think that is the issue. Why is that? What are the reasons that Taylor Swift is going to be fine? We love human connection in the art we consume.
00:09:39
Speaker
That is particularly true in music. It's more true, I think, in music than it is in literature, where we immerse ourselves in the story, and often, for better or worse, a lot of people don't necessarily think about who the author is very much while they're consuming it. But when we consume music, when we go to a Taylor Swift gig, right?
00:09:59
Speaker
And I confess I haven't been to a Taylor Swift gig, but I'm going to see Oasis, who are reforming; I'm going to see them in London in August. It's the human connection.
00:10:10
Speaker
No one would be remotely interested in robots playing an Oasis concert, right? You'll get a few AI artists, I think, who get big; as a sort of circus sideshow, it'll be kind of interesting.
00:10:29
Speaker
But I don't think there's any risk of AI fake musicians taking over the charts per se. What is going to happen, and what is already happening, is that a lot of the musicians who contribute to the songs going into the charts will find themselves outcompeted.
00:10:51
Speaker
I think a great example is songwriters. I've done some songwriting sessions; I mostly write classical music, but I've done some songwriting sessions.
00:11:05
Speaker
You all get in a room together, maybe a few songwriters. You might be with the artist; you might not be. You're kind of improvising, you're writing songs. Ultimately, people will write hundreds and hundreds of songs just for one album. So many song ideas and full songs get rejected.
00:11:22
Speaker
And already, artists are starting to turn to AI song generators. Not all artists by any means, but I've heard stories of artists doing it, because it's easy and cheap for them to just go and get song ideas.
00:11:38
Speaker
And so you get to a stage where there's going to be no audit trail there. You won't know it's happening; it's almost certainly already happening. What you have is songwriters not being fully put out of work, but their work gradually being eroded away.
00:11:58
Speaker
And again, we'll keep filling Wembley, we'll keep filling these stadiums; people want to see pop stars. But most musicians are not pop stars.
00:12:12
Speaker
That's the fundamental issue. Yeah.

Copyright and Fair Use Challenges

00:12:17
Speaker
We should talk about why you decided to resign from Stability AI. So, I was at Stability in 2022 and 2023.
00:12:28
Speaker
I was really excited about building out the audio team and releasing Stable Audio, which was an AI music generator that we released in, I think, September 2023.
00:12:43
Speaker
It went down very well. We licensed all of our training data. I've built a whole bunch of AI music systems over the years, and key to all of them has been: if you're using people's work to train these models, you pay them, you figure out a deal that works for them, you ask their permission.
00:13:07
Speaker
And that's what we did for Stable Audio, which I was really proud of. I think at the time it was one of the first big, what I call contemporary, generative AI models trained on fully licensed data.
00:13:26
Speaker
Unfortunately, the wider company, and frankly the wider industry, showed no signs of following that lead. I like to say that I didn't really resign from Stability so much as I resigned from the wider industry, because Stability were not the only company taking this approach. But it was them I was working for. It was in October 2023, I think.
00:13:58
Speaker
I woke up and read an article, I think in The Verge, about all of these AI companies responding to the US Copyright Office. The US Copyright Office had just put out this request for comments on AI and copyright. Interestingly, just a few days ago, the final stage of their report finally came out, and we should talk about that.
00:14:22
Speaker
I think it's a great report. But this was when they were gathering evidence and asking for submissions. All of these tech companies made public submissions, and there was a list of them. And I saw in this article: hey, Stability is listed. And I thought, OK, I'll have a look at that. I was on the leadership team at Stability.
00:14:40
Speaker
And I read it. And basically, on the first page, it said something to the effect of: we think that training on people's copyrighted work without a license is fair use. Basically, they think there is this exception that covers it.
00:14:58
Speaker
And that just goes against everything I stand for, everything I had stood for in the audio team, everything I stand for in general. And so that was the trigger, this public statement.
00:15:13
Speaker
I mean, Stability had been training their image models for a while, and I knew the attitudes of the rest of the company. But honestly, I had hoped to change that by building a model that went down very well. Our audio model was named one of Time's Best Inventions of 2023.
00:15:30
Speaker
It was a really good, I think industry-leading, music generation model at the time. And it was licensed. I hoped that showing you could do that would change things. And I still truly believe this: I think most of the reason you don't see leading models trained on licensed data is because people just can't be bothered.
00:15:49
Speaker
They're leaning on the fair use defense. The US Copyright Office basically said this in their report. They said, look, licensing is hampered because so many people are relying on this fair use defense. Obviously, if you've got a whole industry that's copying a few big players
00:16:08
Speaker
who rely on this fair use defense, who refuse to go and license all their training data, obviously licensed models are going to suffer as a result. And that's, I think, what's happened.
00:16:21
Speaker
Would you say the general industry attitude is just: hey, we can train on copyrighted material, this is covered under fair use, we don't have to license anything?
00:16:33
Speaker
Yeah, that's absolutely the standard industry attitude right now. And it's really interesting, because I've actually been in the industry longer than probably almost anyone.
00:16:46
Speaker
So I saw it develop throughout the 2010s and into the early 2020s. And no one trained on copyrighted work without a license for a very long time.
00:17:00
Speaker
If you trained a commercial model, everyone knew it was illegal. It was standard common knowledge. It's interesting if you look at Google: Google had this fantastic team, launched a little bit after we publicly launched our AI music company. We launched in 2014, and I think in 2015 Google launched this team called Magenta.
00:17:23
Speaker
Really cool projects, run by some really smart people. It was basically looking at creative AI, as we called it back then, and they trained these generative AI models. Honestly, when they launched, we were super worried. We were like, oh my God, we've got competition.
00:17:38
Speaker
And we needn't have worried at all, because we were all a decade too early. But they launched these models, they wrote papers and blog posts about them, and they call out in these papers and blog posts: this is our data, here's where we've got it, we've gone and commissioned training data. They built this AI drummer; they went and commissioned all these pieces of training data. And that's what we did as well. We commissioned training data.
00:18:08
Speaker
Obviously, you compare that to Google's approach to generative AI now, which is very different. So I think what happened is, by 2022, and this is my impression, you'd had all these research models. People were researching, and there's always a better argument for using copyrighted work unlicensed if you're not doing anything commercial. For research, especially in academia, I think there's a good argument.
00:18:39
Speaker
So people did this research, and they found these things worked really well. And then two or three companies that are now the most famous companies in the world threw caution to the wind and thought: let's just release this, let's see what happens. And you know the story from there: the industry took off.
00:18:56
Speaker
Everyone copied them. Immediately, there was this gold rush. Everyone saw them relying on fair use. Everyone assumed: well, they've raised billions of dollars, they're among the most valuable companies in the world, they're showing what can be done. If they can get away with it, surely we can too.
00:19:15
Speaker
And it's rapidly become the standard approach. And it has massive issues for people. Through Fairly Trained, and through my work in general over the last few years, I know a lot of
00:19:30
Speaker
people who run AI companies that are trying to take a fairer approach, companies that are licensing all their training data and really working with creators. Every company says: we're all about creators, we want to democratize creativity, we want to treat them well. With most of these companies, it's garbage. But there are a few who are actually licensing their training data, and that's what creators want, right? They want to be asked permission before their work is used.
00:20:00
Speaker
But they're having a really tough time. And one of the reasons is that when they go to raise capital, investors are looking at their pitch deck and stroking their beards.
00:20:14
Speaker
And they're saying: hang on, your expenses are going to be higher than the people who are taking their training data for free, so you're not going to win, so we're not going to invest in you.

Open Source Models and Transparency Issues

00:20:26
Speaker
And so you have this horrible cycle, where it's not just the AI companies who are basically, in my view, stealing all this work and training on it; you've got a whole industry around it, and that whole industry is desperate for fair use to prevail in these lawsuits, for the AI companies to prevail, because if they don't, they're worried that the whole thing falls apart.
00:20:53
Speaker
So we've got into a really bad position, unfortunately. And it didn't have to go this route, which is what's so annoying. And we should say: if your company is trying to produce a licensed model, you're also competing with open source models that are trained on all of the data available on the internet, and some data that's not even publicly available.
00:21:19
Speaker
This is not unique to music. This is also images, this is text; we're talking about books and articles, videos, movies, everything. All of the top companies have collected all of the data available, and they're now training on it.
00:21:37
Speaker
And they're now producing synthetic data from that data. So they're doing everything they can to gather as much data as possible, basically. Yeah, I agree. I think the open source thing is interesting, right? Because, for a start, lots of these models obviously aren't open.
00:21:55
Speaker
They're not open in the traditional sense, because they're not revealing their training data. And they're not revealing their training data because they know they'll be immediately sued into oblivion if they do. But open weights models, I guess we can call them,
00:22:10
Speaker
I think they're interesting as well. In general, open source obviously has benefits; open source has led to a lot of great innovation.
00:22:22
Speaker
In this context, though, there's this almost religious adherence to the idea that open must be good.
00:22:32
Speaker
And what you have with a lot of open source models is companies or organizations going out there training and releasing an open source model, and arguing, at least maybe externally, certainly internally,
00:22:49
Speaker
that because it's open sourced, there's less reason to license; maybe they're not directly commercializing one of these open models. But I think this is incredibly misleading, for a couple of reasons.
00:23:02
Speaker
One, the companies that are really invested in building open source are often doing it commercially. They may not be charging for the models, but they're absolutely doing it commercially. They're massive trillion-dollar companies, and they're doing it so they can attract the best engineers in the world, who want to work on open source.
00:23:24
Speaker
And so they build out the ecosystem around their products, their models. It's 100% a commercial thing. It's not just some philanthropic exercise,
00:23:36
Speaker
which I think is important. And secondly, truly open models can of course be used for anything. On a truly open model, there is no downstream limitation on how the model can be used.
00:23:50
Speaker
And that throws a big spanner in the works for fair use defenses, because as the Copyright Office made clear just this last weekend, when they put out their report on AI training,
00:24:10
Speaker
how a model is ultimately used comes to bear on the question of whether it is a fair use of the data that trained it.
00:24:22
Speaker
You can't just train a model and say: look, we're not creating music, we're training a model; other people are creating the music with the model. That doesn't fly. It's obviously about what the model is used for.
00:24:38
Speaker
And with an open model, you can't put restrictions in. If you try to build in guardrails so it won't output the copyrighted work you trained on, those guardrails will just be removed.
00:24:52
Speaker
And a lot of this comes down to potential as well. With fair use, it's not just about what is actually being done; it's about what you are potentially facilitating. So I think a lot of the fair use arguments have a lot of trouble with open models.
00:25:09
Speaker
In general, I'm really wary of open models as regards creators, part of the reason being that open models are irreversible. A closed model you can turn off. I strongly believe that, as the US Copyright Office said, some AI training is probably fair use and some isn't.
00:25:35
Speaker
So I think we should expect that some rights holders' lawsuits will be successful and some won't. Fair use is determined on a case-by-case basis; you'd expect that different cases would go different ways.
00:25:48
Speaker
So therefore, you should expect that some AI companies are going to have to turn off their models; they're going to have to retract them, right? Now, a closed company can do that.
00:25:59
Speaker
A closed AI company can turn its model off, and then it's not accessible anymore. If you've got an open model out there, there's nothing you can do to get that back. You can make use of it illegal, but it's going to be very hard to police.
00:26:11
Speaker
So yeah, while I am a big advocate of open source in some areas, I think there are real issues with open source and copyright, basically.
00:26:23
Speaker
You've launched this organization called Fairly Trained, which is trying to set a new standard for the industry. Could you tell us what you're trying to do here?
00:26:35
Speaker
Yeah, Fairly Trained really came out of conversations I had immediately after leaving Stability, when it ended up blowing up a bit in the news. I think that's partly because, while a lot of creators who I really applaud had been flying the flag for creators' rights and trying to shine a light on this issue,
00:27:03
Speaker
most people in the AI world had been pretty silent on it. And so here was someone from the AI world saying: actually, no, this is not legit. We should not be stealing people's work to make money off it. This is terrible.
00:27:15
Speaker
And I think because of that, it got a bit of news coverage. One of the things I found was that journalists were saying to me: oh, that's interesting, I thought AI could only be built by stealing people's work.
00:27:29
Speaker
I was kind of shocked, but there's no reason they would have known otherwise. The handful of models that were doing things legitimately
00:27:43
Speaker
were not that well known. As I say, some of them had been struggling to raise money, not all of them, but some. It's hard to take the right path.
00:27:57
Speaker
And so I thought: well, we should do something about this. We should highlight the fact that there are these companies; we should try to help them. We should also try to help people understand that this is a viable option. You don't have to use plagiarism machines.
00:28:15
Speaker
You can go and use models that are built fairly. So that's where it came from. And the idea that we landed on was just a very simple certification for AI models that aren't trained on copyrighted work without a license.
00:28:33
Speaker
We have a certification process that these companies go through. I think we've certified 19 to date, across a range of modalities: there's music, there's voice, and we've actually got one large language model.
00:28:47
Speaker
There are other modalities. And really, that's the purpose. There are some companies who have said: look, we as companies are only going to use AI models that are fairly trained, or that at least meet this bar.
00:29:03
Speaker
And I mean, Fairly Trained is a nonprofit. I don't pay myself. I'm not in it to make a big success of Fairly Trained; I'm in it to try to elevate these companies.
00:29:16
Speaker
So I don't mind if people take our certification badge as gospel, or if they do the diligence themselves and make sure these companies hit the same bar. I don't mind at all.
00:29:28
Speaker
Some companies use our certification mark; some just do their own diligence but try to hit our criteria. And these companies are basically saying: look, if we're going to use AI as a company, we're only going to use fairly trained models.
00:29:42
Speaker
And I think that's really good. But at the same time, we're not going to affect the public's feelings on this. The public are always going to use the best and easiest model available to them.
00:29:57
Speaker
And I don't fault people for that at all. Before there were legal ways of streaming music, a whole bunch of people used Napster and the like, right? If it's easy, you're just going to go and use it until there are really viable alternatives. So no, I don't think we're going to affect the public consciousness that much.
00:30:26
Speaker
But I think that's okay. What we provide is something that people and companies who care about this can turn to, and we try to gradually change views that way.
00:30:38
Speaker
But it's also something that legislators around the world can look to and point to as an example that this is possible. One of the things that honestly frustrated me the most, two years ago, a year and a half ago, was this idea, generally put forward by AI companies, that it was impossible not to do what they're doing. That they have to.
00:31:05
Speaker
And I hope that what we're doing at Fairly Trained shows that that's not the case.

Synthetic Data and Copyright Laundering

00:31:10
Speaker
How is it possible to certify the data that goes into training a model? What are you doing at a technical level?
00:31:18
Speaker
Well, like many certification schemes, I guess you could describe it as a self-certification scheme. It's not that these companies are doing it themselves, but it's based on information they provide to us.
00:31:33
Speaker
There are ways you could technically scan datasets, but it would still be based on trust, because you'd have to trust that the company was giving you all their data.
00:31:45
Speaker
There is, at the moment, no way of taking a model and just reverse-engineering a list of all the data. So we can't do that. That's off the table.
00:31:58
Speaker
So we have to have some trust-based mechanism. What we do is we have a process where companies submit a bunch of information in response to questions that we pose them. They submit lists of their training data.
00:32:14
Speaker
And then we go and check that. Sometimes it's very easy, because there are companies we've certified that aren't, for instance, large language models and don't use a ton of different data.
00:32:26
Speaker
And sometimes it's difficult, and there's a ton we have to go through. We have to go and look at all the sources, and we just drill down and drill down until we have a high level of confidence that this data is clean, essentially.
00:32:41
Speaker
And so that's how we do it. There are other parts of the certification, like having good internal processes to make sure these standards are met going forward, that sort of thing. But the crux of it is: what is your training data?
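To make that concrete, here is a minimal sketch of what a disclosure-based review could look like. The data model, field names, and accepted categories are hypothetical, not Fairly Trained's actual tooling or criteria.

```python
from dataclasses import dataclass

# Hypothetical record for one declared training-data source.
@dataclass
class DataSource:
    name: str
    license_type: str  # e.g. "direct_license", "commissioned", "public_domain"
    evidence: str      # pointer to the contract or provenance document

# Illustrative set of license types a reviewer might accept.
ACCEPTED = {"direct_license", "commissioned", "public_domain", "open_license"}

def flag_sources(sources):
    """Return declared sources that need further drilling down."""
    return [
        s.name for s in sources
        if s.license_type not in ACCEPTED or not s.evidence
    ]

declared = [
    DataSource("label_catalog_2023", "direct_license", "contracts/label.pdf"),
    DataSource("web_scrape_misc", "unknown", ""),
]
print(flag_sources(declared))  # ['web_scrape_misc']
```

The point is that the review works over declared lists and evidence, which is why, as described above, it ultimately rests on trust plus checking.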
00:32:55
Speaker
Do you think a certification process could ever scale to some of the big players, OpenAI, Google DeepMind, Anthropic, and so on? Yeah, I think so.
00:33:07
Speaker
Look, fundamentally, the biggest thing stopping that is the lack of transparency.
00:33:19
Speaker
Ultimately, if these companies would just reveal their training data publicly, then of course you could check it.
00:33:31
Speaker
It might take time, depending on the kind of data they've used and where they've got it, but you could automate parts of it relatively easily. There is no issue with the checking side. The issue is the lack of disclosure.
00:33:47
Speaker
And there's a fight going on in various places around this right now. I mean, in the UK,
00:33:57
Speaker
the House of Lords has twice proposed to the government a really simple addition to a piece of legislation that just basically says: AI companies must disclose the training data that they use. That's basically it.
00:34:15
Speaker
Which to me seems like common sense. AI companies argue that it's kind of their secret sauce. But it's not. Maybe what you do to the training data, there are some trade secrets there, right? How do you augment the training data? How do you filter it?
00:34:37
Speaker
What do you ultimately choose to use? But the sources, where you've got it, that's not a secret. For a start, everyone is just getting as much as they can, right? There's no secret to that.
00:34:50
Speaker
Secondly, it's trivial. If you run an AI music company, it's absolutely trivial to come up with a list of all the people you could license music from, all the big companies you'd go and license music from.
00:35:10
Speaker
I know, because I've done it; you can do it in an afternoon. There is no secret to where you get data. So I don't buy the argument at all that your training data is a trade secret. Obviously these companies are saying that because they know that if they reveal their training data, they get sued.
00:35:32
Speaker
That's what would happen. And I think a bunch of people would win those lawsuits. That's why they don't want to reveal their training data. But that's all it would take. And so in the UK,
00:35:45
Speaker
the House of Lords has proposed this a couple of times, and the government in the UK has rejected it, using arguments that are basically based on procedure.
00:35:58
Speaker
But really, the reason they're rejecting it is clear. It's because they are very close to the big tech companies. They want the big tech companies to keep opening up 100-person offices in London and whatever,
00:36:14
Speaker
boosting the job count a little bit. I wouldn't go so far as to say they've been bought by tech companies. But they clearly place US tech companies' interests over their own creative industries and their own country's creators. And so they're rejecting amendments to bills that would literally just make AI companies fess up to what they're training on.
00:36:41
Speaker
That's all it would do. And you've got to bear in mind that training on copyrighted work without a license in the UK is straight-up illegal; there's no fair use debate.
00:36:56
Speaker
There's not even debate around this. You just can't do it in the UK at the moment, which is good law and shows the strength of our copyright system.
00:37:12
Speaker
So yeah, there are big fights going on around this kind of stuff. I agree that if the companies showed what they had been training on, it would be revealed that they had been training on copyrighted data.
00:37:24
Speaker
But I also think there might be special cases in which the companies have paid scientists, say, or researchers a lot of money to produce very valuable training data that's not publicly available. Or they might have generated a bunch of synthetic data that's also difficult to produce. Would that fall into the category of more of a trade secret? Would that be the secret sauce they can't reveal?
00:37:54
Speaker
No, I think there's a difference between actually revealing your training data, as in sharing the actual data in an S3 bucket and letting people go through the actual words you're training on or whatever it is, and revealing lists of training data.
00:38:10
Speaker
And I think that's a critical difference. We do this with Fairly Trained, right? We don't ask to see all of the data that you've commissioned.
00:38:21
Speaker
What we ask is that we need to know you've commissioned that data: here is our data, we've commissioned it. You don't have to show us all the words that are in that data. That's totally fair enough; I can see why that would be secret. There's no reason people need to know that.
00:38:37
Speaker
But ultimately, that's not the argument being had. It's: should we have transparency over lists of training data? This is the key.
00:38:50
Speaker
Copyright holders need enough information to let them know whether their work is in the training set.
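A hypothetical sketch of that bar: if companies published a machine-readable manifest of sources and works, a rights holder could search it directly. The manifest format here is invented purely for illustration.

```python
# Invented manifest format: one entry per source, with enough identifying
# information (work titles or identifiers) for a rights holder to search.
manifest = [
    {"source": "licensed_label_catalog", "works": ["Song A", "Song B"]},
    {"source": "public_domain_scores", "works": ["Symphony No. 5"]},
]

def work_is_included(manifest, title):
    """The test described here: can a rights holder find their own work?"""
    return any(title in entry["works"] for entry in manifest)

print(work_is_included(manifest, "Song A"))  # True
```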
00:39:01
Speaker
That's the test. And at the moment, they don't remotely have that, right? There is just no transparency at all, and they have to go and red team the models to try to find out.
00:39:13
Speaker
And it's hard work, and obviously that just stops them exercising their rights. So yeah, I totally agree with you. And the same potentially with synthetic data. Synthetic data is interesting, right? Because synthetic data itself, I think, can be a way of laundering copyright.
00:39:32
Speaker
If you use a model that's trained on a ton of copyrighted books, and then you create a load of synthetic data, and then you train a new model on that synthetic data,
00:39:46
Speaker
in my mind, you are infringing copyright just as much, and you're doing as much harm to the authors as you would be if you cut out the middle step of the synthetic data training.
00:39:58
Speaker
So at Fairly Trained, when we evaluate synthetic data, you have to meet the same criteria across the whole chain of models that was used to create that data.
00:40:11
Speaker
You can't just wash your hands of it and say: well, we only use synthetic data. We ask: where does that synthetic data come from? So I think that should be included. And actually, that's a major issue with a lot of the transparency regulation that has been proposed.
00:40:30
Speaker
I think in general, legislators have missed the synthetic data problem. It's almost always missed. They will say: provide us with a list of the copyrighted works that you've trained on.
00:40:40
Speaker
And I don't think that's nearly enough. We need a list of the training data, because if a bunch of that is synthetic data, attached to it should be an explanation of the models it came from,
00:40:51
Speaker
and similarly, the training data that went into those models. And if you didn't train that model yourself, which you might not have, you at least need to disclose what the model is. Again, the ultimate test should be: can a third party, looking at this list you provide, reliably go and check it themselves and find out if their work is anywhere in that training stack?
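That recursive requirement is straightforward to state in code. A sketch with invented structures: a dataset passes only if it is disclosed directly, or if every dataset behind the model that generated it passes the same test.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Model:
    name: str
    training_data: list[Dataset] = field(default_factory=list)

@dataclass
class Dataset:
    name: str
    disclosed: bool = False                # on a published training-data list
    synthetic_source: Model | None = None  # model that generated it, if any

def traceable(dataset: Dataset) -> bool:
    """Can a third party trace this dataset through the whole training stack?"""
    if dataset.synthetic_source is None:
        return dataset.disclosed
    # Synthetic data is only as traceable as everything behind its generator.
    return all(traceable(d) for d in dataset.synthetic_source.training_data)

books = Dataset("copyrighted_books", disclosed=False)
generator = Model("text_model_v1", training_data=[books])
synthetic = Dataset("synthetic_corpus", synthetic_source=generator)
print(traceable(synthetic))  # False: the laundering step doesn't help
```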
00:41:16
Speaker
That should be the test, and anything short of that I think is not good enough. It's a very simple bar to try to hit. When companies provide a list of the data they've trained on, wouldn't it be fairly easy to simply exclude data that they don't want anyone to know they've trained on? You can't prove a negative; you can only look at the information they've provided on the data they've trained on.
00:41:46
Speaker
They might be running another training run using a whole bunch of copyrighted data. How would you deal with that issue? Yeah, I think you deal with that in two ways, ultimately at the society level. If we actually get transparency legislation, then as part of that, you ought to have audits. I think audits are key
00:42:09
Speaker
to that kind of legislation. And then this is where red teaming can come in as well. Because ultimately, if you say you haven't trained on Harry Potter, and people can get Harry Potter out of your system, then you're obviously lying.
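One common form of that red teaming is a regurgitation probe: prompt the model with the opening of a work it supposedly never saw, and measure verbatim overlap with the true continuation. A toy sketch; `model_generate` is a hypothetical stand-in for whatever API is under test.

```python
def regurgitation_score(model_generate, prefix, true_continuation):
    """Fraction of the true continuation reproduced verbatim from the start."""
    output_words = model_generate(prefix).split()
    expected_words = true_continuation.split()
    matched = 0
    for got, expected in zip(output_words, expected_words):
        if got != expected:
            break
        matched += 1
    return matched / max(len(expected_words), 1)

# A score near 1.0 on text the developer denies training on is strong
# evidence the denial is false (or that the text leaked in via synthetic data).
```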
00:42:27
Speaker
Right? So I think a combination of audits and red teaming can solve that issue pretty well. On the point of synthetic data: do you think it might be too late to fight this fight, if we have open source models that can generate fairly high-quality text and images and potentially music, and that can then be used to train other models? In some sense, the cat might be out of the bag, because, as you mentioned earlier, you can't remove these open source models from the world.
00:42:59
Speaker
Yeah, I don't think so. That's obviously an opinion that AI companies love, right? It's too late, don't bother regulating this.
00:43:11
Speaker
I don't think so, for a couple of reasons. One: yes, there are open source models out there, but a bunch of them, in my view, will almost certainly be found to be breaking the law in how they're built.
00:43:29
Speaker
You can't take them back, but you can forbid people from using them. You can make it illegal to disseminate them. You can make it illegal to host them. You can make it illegal to use them.
00:43:40
Speaker
It's not going to stop all use, but honestly, it's going to stop a lot of it. You solve a lot of the problem by just saying: no, you can't host that. Most people don't want to break the law.
00:43:54
Speaker
Some people do, but most people don't. So I think that's one reason it's not too late. Another is this current issue of model collapse.
00:44:05
Speaker
Now, I'm actually not really very bought into the idea of model collapse. There's a kind of hope among what I call the anti-AI crowd, which I don't really consider myself part of, though my views on copyright align with a lot of theirs, and so I know quite a lot of them.
00:44:25
Speaker
There's this hope that you could never train a model on purely synthetic data, because it would lead to model collapse: basically, it's just not good enough, and it leads to the model doing really badly. And there are some signs that that's maybe true at the moment, or has been true recently. But I don't see any reason why it would be true in the long term, why it would hold as a general rule.
00:44:48
Speaker
I think it's similar to, frankly, 10 or 12 years ago, when no one believed you when you said AI would one day be able to create art and write music and text in a way that is as convincing as people.
00:45:05
Speaker
No one believed it; they thought you were crazy. They thought there was no way this could happen. And they were wrong. I think model collapse is another thing like that. To me, it seems like a very temporary limitation.
00:45:18
Speaker
So I do think we will get to a stage where you can train very highly performant models purely on synthetic data.
00:45:28
Speaker
I think that's likely at some point; there are already signs it's possible. So I don't think we can rely on model collapse as a get-out-of-jail-free card. And it's another reason why I think rapid regulation is important, rapidly holding companies to account. I don't think it's too late, but I do think time is of the essence.
00:45:49
Speaker
I absolutely think that time is of the essence. Yeah, on the model collapse point: we shouldn't rely on the intrinsic features of a technology, crossing our fingers and hoping it all works out because the models will be limited. I agree with you that we are seeing early signs, in reasoning models for example, that synthetic data can lead to quite impressive results. So I wouldn't hold out hope, if you have the perspective that model collapse will prevent the models from ever violating copyright in bad ways.
00:46:25
Speaker
Yeah, it's funny. While I have spent the last year and a half really advocating for AI development to take a pause and just say, hang on, are we all building our models based on theft?
00:46:44
Speaker
At the same time, I don't know if I'd call myself a futurist, but I tend to think that technology is going to advance a lot, that it's going to bring a lot of benefits, and that it'll bring massive risks as well.
00:47:00
Speaker
One of my general opinions is that with AI, ultimately, we know humans are intelligent. Intelligence has already come about once. To me, there's no reason that everything we can do won't one day be possible, physically possible, in machines. It seems pretty self-evident that it's not impossible, because it's already been achieved once.
00:47:31
Speaker
And that's my starting point. It's why back in 2010 I was convinced that AI would be able to write music before long. I was a bit off in my timelines; I thought it would happen a little sooner. I also thought music would come before art, and I was wrong about that: art slightly beat music, for, I think, interesting reasons.
00:47:50
Speaker
But, you know, ultimately,
00:47:55
Speaker
the tech is going to be able to do this stuff. And we have this problem where we get really hung up on the issues of today. They're going to be solved, probably from corners we don't expect; it might not be OpenAI or Anthropic who solves these things, it might be some new startup, right?
00:48:16
Speaker
But they're going to be solved, and they'll be solved in the next few years. And then what? You can't rest on your laurels and think: well, it's fine, because machines can't do this. That's always a bad argument, and always a dangerous path to go down, because almost always they will be able to do that thing, probably within a few years' time.

AI's Potential and Economic Impact

00:48:41
Speaker
It's very important here to notice the pace of change as well. If you say in 2020 that models can't do this simple task, and then you just wait a couple of years, well, then maybe they can.
00:48:53
Speaker
And I expect the same thing to happen over and over again, basically. I agree. It feels like we're still in a cycle of that happening. I remember meetings at Stability; it still feels like yesterday that I was at Stability.
00:49:06
Speaker
I remember meetings in 2022 at Stability where we were still talking about: when is the year that AI music is going to break out? I was like, it's going to be 2023. By then it was very close; it was kind of easy to predict.
00:49:21
Speaker
I got my prediction wrong back in 2010; I thought it would be done quicker than it was. But even back in 2022, music had been far, far from solved, right? It just wasn't really working.
00:49:38
Speaker
And I think this is true across modalities. It's happened in video; we're currently in the video cycle, I guess. I think it's happening pretty quickly in robotics, where, in my mind, you're seeing robotics advancing very rapidly in many ways, thanks to training models in simulations and then transposing that over to the real world.
00:50:10
Speaker
I wouldn't be surprised to see it happen in other technologies, like brain-computer interfaces and things like that. And I just think it seems to be almost part of the innate human condition that we extend out from where we are now, and we find it, I think, often very hard to picture and believe in rapid change as possible.
00:50:34
Speaker
We take today's limitations, and we imagine that some version of them will still be there in a decade's time. And I think that's a mistake. This is quite interesting, because a key question here is: how does the debate around copyright and fair use fit into these larger, grander questions about the future role of people and the future of humanity?
00:51:00
Speaker
What's the connection there? Well, I think there's a couple. I think there's a big, big question around work. And this is something that, I mean, I am someone who does worry about the downsides of AI.
00:51:18
Speaker
I'm excited about some of the upsides; I'm anxious about some of the downsides. I personally think that one of the biggest risks in the near term from,
00:51:30
Speaker
let's call it general intelligence or superintelligence or hyper-intelligent AI, basically AI systems that are supremely capable, is the potential mass displacement of labor. And I think the creative sector is kind of the canary in the coal mine here. Generative AI in this form has only been around a couple of years, and we are already seeing data showing that creatives are being outcompeted. There's data on this already from Upwork. There's a bunch of papers showing that freelance writing tasks and freelance graphic design tasks fell in the wake of some of these models being released, and never recovered.
00:52:18
Speaker
I know people whose income has just totally fallen, and they know, they've been told, that it's because the people who previously employed them are now using AI. So I think creatives are kind of the canary in the coal mine here.
00:52:32
Speaker
I worry when I see, just a month ago, a company announce its intentions; this company is called Mechanize.
00:52:43
Speaker
You may have seen it; there was a big kerfuffle around it about a month ago. It's got investment from Jeff Dean of Google, from Nat Friedman, Dwarkesh Patel, all of these big names in the world of tech or tech-adjacent fields, right? And I interviewed one of the founders of the company recently. There we go. I confess I haven't heard that; I will listen to it. But the mission is: automate all work.
00:53:10
Speaker
So this is not some fringe movement. If you look at the investors behind it, this is not some fringe idea in Silicon Valley. There is a real body of thought that says, yeah,
00:53:24
Speaker
given the combination of general intelligence, possibly superintelligence, and general-purpose robotics, there is a real possibility that we can automate, if not all work, and let's be honest, it's probably not all work, right? It's not going to be politicians, it's not going to be priests, it's not going to be sportspeople. But there is a real possibility that we can automate a huge amount of that work. And God, we're going to try, because this is the kind of Marc Andreessen "software is eating the world" idea.
00:53:57
Speaker
It's ultimately about making money, about trying to replace different sectors. And when I say Silicon Valley, I don't really mean the place itself. I happen to be here.
00:54:16
Speaker
What I mean is this kind of philosophy, this idea in the tech industry that says, hang on, maybe this is our chance. All of these holdout industries where we haven't really been able to make inroads, maybe this is our chance.
00:54:30
Speaker
And actually, maybe this is how we get into them. Maybe this is how we take them over. So that, I think, is a big thing, and it's something that I'm worried about. Now, to me, there are arguments about whether we'll get there with the current paradigm of large language models, of reasoning models, of robotics, of where we are now.
00:54:57
Speaker
And I think that's a sensible debate to have. I'm not sure whether we will; I wouldn't say we necessarily will. But I'm concerned about two things, irrespective of that. One is that there is such a huge amount of investment being poured into automation that I think it's very likely we get new innovations coming along.
00:55:16
Speaker
So even if large language models don't end up being the route to AGI, something else might well be in the near term, right? I don't think we should rule it out. I think we should consider it a significant possibility.
00:55:31
Speaker
And the second thing is just the desire to do it at all, the fact that it is people's aim to go and automate all work. And I understand the position that says, well, fully automated luxury communism or whatever, right? Let's all just relax the whole time. Let's basically have early retirement.
00:55:51
Speaker
But I think that's a very naive view, given how political establishments work, and given how obviously, totally unprepared the world is, the US is, any major economy is, for the rapid displacement of labor.
00:56:08
Speaker
So yeah, that's a big concern for me. Yeah, but perhaps a glib question here is to ask: why is this a bad thing? If you think of a peasant 300 years ago, maybe there would be some worries about automating agriculture to a large extent. But historically, this has turned out great. We've been able to massively increase productivity and living standards and so on.
00:56:34
Speaker
Why isn't this just the next turn of the wheel of that trend? In a sense, why should we be pessimistic here? Well, I think one reason, and I'm not saying we should only be pessimistic, by the way. I think we need to be very alert to this possibility and we need to be acting accordingly. That doesn't mean we have to be pessimistic about it.
00:56:55
Speaker
I think one difference, which I think is pretty obvious, is the nature of the technology we're now building, which is much more general, where the intention is to be general purpose.
00:57:07
Speaker
And this makes it, I think, very different to many of the revolutions of the past. Again, simply the fact that you have some of the biggest names in the tech industry investing in a company whose mission is to automate all work shows that there is a different idea at play here than there has been previously, right? So I think we should take that seriously.
00:57:32
Speaker
So then I think the question is: okay, if you assume you hit that, why is that an issue? Wouldn't it be great to all relax? And I think here, one of my concerns, and this comes back to the copyright question, is actually this.
00:57:47
Speaker
What this whole copyright fight has shown me, maybe more than anything, is that a lot of the people at the forefront of building this stuff honestly seem willing to trample on people's rights in the pursuit of personal gain and profit.
00:58:05
Speaker
Right? That's basically what's happening here, in my mind. They see an opportunity for vast wealth, vast riches, and they look at copyright and think, well, that's getting in the way, right?
00:58:16
Speaker
The people whose livelihoods depend on copyright, the people who they're putting out of work, are not really an issue for them. And that worries me. Because if at this stage the people building this technology aren't going to respect people's rights, aren't going to take it seriously when a whole chorus from an entire industry turns around to them and says, what are you doing?
00:58:46
Speaker
Why would it be any different later? Why would they have any more respect for people then? They're not paying copyright holders, so why will they ultimately pay anyone? How is this money going to be distributed to other people if it's not being distributed at the moment? And so I don't see the political will there to,
00:59:08
Speaker
take seriously this idea that there might need to be that kind of mass redistribution, and demonstrably the AI companies themselves don't have that will. So that's why, again, I'm not saying we should think we're all doomed, or that we should just be purely pessimistic about this. But when highly funded, highly motivated, very smart people say we're going to try to automate all labor, I think we should take them seriously, basically.
00:59:40
Speaker
What do you think the future of culture looks like? What would be the long-term effects of having these technologies that can basically create replicas or copies of different styles? When the price of generating text, imagery, videos, audio drops massively, what happens to culture?
01:00:06
Speaker
I mean, I think one of the most immediate effects is that a lot more people find it a lot harder to get into the creative industries, because a lot of the long-tail jobs,
01:00:24
Speaker
that would have supported them through their early years will go. That might be writing ad jingles if you're a musician. It might be writing production music. It might be doing some sort of copywriting if you're a writer.
01:00:38
Speaker
And all these jobs are already on a downward trajectory, largely thanks to AI. I think that's a major issue, because there is going to be a big blow to the creative industries, and that will have knock-on effects.
01:00:51
Speaker
So I think that's going to affect culture. I think we will also see, and there's a question as to how popular this will be, a further rise of remix culture.
01:01:05
Speaker
At the moment, you are not allowed, for the most part, to replicate people's voices or to take a copyrighted song and rework it into something else without permission, and it can be very hard to get that permission.
01:01:22
Speaker
But I'm sure media companies, rights holders, and creators are open to licensing their voices, their likenesses, their music, their works under the right conditions. And as soon as licensing gets done at scale, you as a consumer will have the ability to remix, right? And I think the barrier between being a consumer and a creator will start to, not fully disappear, but at least weaken.
01:01:57
Speaker
So you'll be able to say, if you want: that new Taylor Swift song is cool, but can I hear Noel Gallagher singing it? That'll be a possibility, of course it will. You'll be able to have the AI model do it. As long as you get the licensing and the permissions in place, I think that's all going to be possible.
01:02:13
Speaker
I mean, I think there's an open question as to how popular that kind of thing will be. And I think a lot of people in tech assume that it's the future.
01:02:24
Speaker
And while I think it will be a part of the future, I actually have quite high faith in what we can call concrete works. That's recorded music, or a book in its final form, or whatever it is.
01:02:38
Speaker
I have quite high faith that those will for a long time remain the norm in creative culture. Now, some of them will be generated in the first place by AI.
01:02:49
Speaker
But I think that ultimately we are very attached to the idea of these concrete works that we can all share, that we can all talk about, that can be part of the public consciousness, in a way that I think hyper-personalized content won't be as exciting to people.

Cultural Impact of Generative AI

01:03:08
Speaker
So I think that is a big reason it will stick around. You don't think that consumers would be interested in, for example, talking to an AI Taylor Swift, having a video call with her?
01:03:22
Speaker
She's playing music for you. You say to her, I want to hear a different song, I want to hear more of this, less of that. Or you could imagine a book that expands in the sections that you're interested in.
01:03:33
Speaker
So it kind of changes as you're reading it. But of course, then you don't have, as you mentioned, the cultural common knowledge of what's in a work.
01:03:44
Speaker
And would that be the barrier to creating this more interactive form of entertainment? Yeah, I think there are two things at play there, right? There's the ability to interact with an avatar of a creator.
01:04:01
Speaker
And there is the ability to have personalized content, in the media form that creator is known for, written for you. And I suspect that the former will be exciting and the latter will be less so. I don't know whether you'll be speaking to your Taylor Swift avatar or whatever, but in general I'm a big believer in the future of essentially the voice and language interface, in just being able to talk to your computer, to digital devices, to avatars. I think that's clearly the future, that's clearly the way things are going.
01:04:40
Speaker
So absolutely, I can see music fans wanting to have some kind of virtual conversation with their favorite artists. But ultimately, my bet would be that when they then say, can you now play this, it will be one of the artist's songs.
01:05:00
Speaker
If you look at some of the AI music generators, one of the ways they advertise themselves is: write a song about anything.
01:05:13
Speaker
Your trip to the coffee shop, or your mom's birthday, or whatever it is. And in general, my impression so far is that it's an absolute moment of magic the first time you hear it. You cannot believe it's possible.
01:05:32
Speaker
And then you never use it again. My bet is the retention figures for that kind of usage are atrocious, because people just don't really want that use case. There are use cases for AI music, 100%.
01:05:48
Speaker
Right. But personalized music in that way, that kind of write-me-a-song-about-X, I don't see becoming a very big part of music culture. I think that the song as an entity, as a kind of thing that is set in stone over time, will remain a really, really important part of music culture. And I think that extends to all the arts, with their own forms set in stone.
01:06:18
Speaker
Do you think AI could change the winner-take-all scenarios we see in music, where you have basically the most famous and influential musicians getting most of the plays and most of the views and so on?
01:06:35
Speaker
But if you had the ability to create your own music, maybe you would see more of a broad market? I don't think so, frankly, because there's already a broad market. And this is one of the things about generative AI: it lets you create music from scratch. But when we say it democratizes creativity, that's only true to an extent. Creativity is already pretty democratized, right? It's not perfect by any means; it helps if you have rich parents and go to a school that lets you study it. There are all these things that really, really help.
01:07:19
Speaker
But ultimately, a lot of people can write music, can learn how to write music, can produce music. And a lot of people do. Before generative AI came along, the amount of music out there being released every day was absolutely astonishing.
01:07:38
Speaker
So there is no shortage of music. There's no shortage of options. And what you still have, despite this huge abundance of music, is a few people rising to the top. And why is that?
01:07:53
Speaker
I think, one, it's because, to an extent, it's inevitable in the kind of culture we have. People get popular. And this is connected to two: a lot of that doesn't come from how this stuff is made. It comes from recommendation systems.
01:08:10
Speaker
A bunch of Spotify researchers wrote a paper in 2020 where they looked at one month, July 2019, of listener data from around a hundred million Spotify users.
01:08:27
Speaker
And they found, and this is predictable, right, but they showed it with the data, that when people listen to recommended songs, recommended playlists, that kind of thing,
01:08:39
Speaker
the diversity of the music they listen to is far, far smaller than when they take what the researchers call user-directed action. When they're just going and searching for music themselves, they find a load of cool stuff. It's really diverse.
01:08:51
Speaker
When you go down these recommendation paths, everything becomes homogenized, and you end up listening to the same thing over and over again. And I think that's already a trend we're on, right? Recommendation systems through YouTube and TikTok and Spotify, this is already a path we're on. So I don't think that letting more people generate music from scratch, we can debate whether to call it create, really does anything to affect those winner-take-all dynamics.
01:09:23
Speaker
Do you think there's a general effect of culture becoming more homogenized over time, where perhaps in the future you will see most cultural products having this kind of AI style, influenced by, say, how culture works in Silicon Valley and the values that are put into the models from there?
01:09:49
Speaker
I don't know. I mean, the flip side of recommendation systems like TikTok's, and I worked on the TikTok recommendation algorithm, is that different people end up having very different feeds. Now, that can turn into filter bubbles, which isn't necessarily great, and you can go down some bad paths.
01:10:13
Speaker
But ultimately, these recommendation algorithms, especially when you've got short-form content and you're using them a lot, which most people do, very quickly learn the kind of thing you like. And they show different things to you to try to work out what that is.
01:10:27
Speaker
And so that's why you have different pockets of TikTok, right? That's why you have all of these different styles emerging. So I think recommendation systems can actually, if they're constructed right, take you down these good paths of discovery.
01:10:44
Speaker
And so in that sense, I think culture doesn't become too homogenized. Now, maybe those will always be fringe, niche communities, but one of the exciting things about where we are, I think, is that a lot of it comes down to the user. If you as a consumer want to go and find some interesting stuff,
01:11:06
Speaker
to listen to, to watch, to read, it's never been easier. That's what's exciting. So I think we're in this interesting time where you've got these two extremes. On the one hand, if you just can't be bothered as a consumer, everything will be homogenized.
01:11:22
Speaker
And you'll just hear the lowest-common-denominator stuff. A lot of that will end up being AI slop. It'll be awful. But if you can be bothered, you can go and find really great stuff, and the tools are there at your disposal to do that.
01:11:35
Speaker
And I think the extension of all of this is, look, the backlash to generative AI is just massive, and I think people in tech still underestimate this. They underestimate the huge strength of feeling against generative AI. And this partly comes from the fact that artists' work is being stolen to build it.
01:11:59
Speaker
It partly comes from the fact that these companies are outcompeting artists. And it partly comes from the fact that people consider generative AI to be dumbing down their professions and art in general. It's a whole host of reasons.
01:12:13
Speaker
But it's really strong. People in tech like to point to the introduction of the camera or the synthesizer and say, look, people rejected recorded music when it came out. They thought it would be the death of music, and it wasn't, and people got over it.
01:12:28
Speaker
And they're looking at AI and saying the same thing. I think that's a mistake, because the strength of feeling is much, much greater with AI. And I think what that's going to lead to is what I think of as a new kind of humanist movement in the arts, which I suspect will entail a rejection not just of AI, but, as a result of that, probably of some other technologies as well. And I wouldn't be surprised to see in music, for instance,
01:13:02
Speaker
a kind of humanist movement emerging that is maybe more acoustic in nature, that favors less production, that favors live music, acoustic instruments, things that a machine
01:13:16
Speaker
couldn't do: being there in person with someone, these kinds of things. I think that could be a strong artistic movement, and I'd be surprised if it doesn't strengthen over the coming years.
01:13:28
Speaker
What people are searching for is perhaps a sense of authenticity in what they're consuming: that what they're enjoying is something coming directly from another person, and not something overly produced and perhaps AI-enhanced.
01:13:45
Speaker
I think so. I mean, look, at the moment you wouldn't necessarily get that impression from the press, or if you're on social media, just because the AI
01:13:59
Speaker
chorus is so loud and so hard to avoid. There is just so much money flowing around the AI ecosystem right now that it benefits people to become basically AI influencers, who will constantly share content, who will say, look, this is changing the world.
01:14:18
Speaker
Hollywood is dead. Everything will be generative in two years' time. And mostly people do this because they'll get more followers that way, they'll make money that way. It's all a money-making game.
01:14:31
Speaker
But ultimately, there's a huge rejection of many of AI companies' practices from musicians, for instance. Look at what we've been doing in the UK: we organized a protest album called Is This What We Want?
01:14:53
Speaker
And 1,000 British musicians co-created and co-sponsored this silent album in protest at the government's plans to give their music away to AI companies for free.
01:15:07
Speaker
And it included absolutely huge artists, right? Kate Bush, Max Richter, all of these people. And the same is happening across all the arts. You're getting some of the biggest creatives in the world coming out pretty strongly, not necessarily against all generative AI, but against some of the very common practices that AI companies are utilizing.
01:15:33
Speaker
And I think inevitably, what that turns into is a movement towards the authentic, towards the natural, and ultimately towards the human.
01:15:45
Speaker
And I suspect that will get bigger and bigger. What would be the principles of this new type of humanism? What is it that such a movement would be trying to promote and trying to reject?
01:16:00
Speaker
I think fundamentally, at its core, any kind of new humanist movement would essentially be about putting humans first. And that's obviously very high-level and vague.
01:16:14
Speaker
Probably initially, it'll be more about a rejection of a few specific practices than it will be a specific creed or set of guidelines. And I should say, I'm not sure this movement exists yet; I just see it as a broad direction of travel.
01:16:39
Speaker
I think there are things that would clearly not align with a new humanist movement in the arts, let's say. And one of those is obviously
01:16:51
Speaker
taking people's work and training models on it without their permission, when they're all en masse telling you they consider that theft. That's not humanist. I think building models that are designed to outcompete humans in the creative spaces, for instance, would also not fall into this category.
01:17:18
Speaker
I think it would remain pretty vague and pretty all-encompassing. But I think there will still be certain aspects of the current technological paradigm that will not be accepted by this group. And as I say, I think what you'll see is real-time interaction between people, a rejection of some of the most modern technologies, and a reversion to traditional practices.
01:17:51
Speaker
And look, the best songs ever written were recorded on modern technology. I don't think any new humanism would totally reject modern technology. But ultimately, you take Yesterday by the Beatles, you can just play it on a guitar.
01:18:13
Speaker
You can just sit in a room with a guitar and play it to one other person. And I think that is humanism, right? It's that versus a song that was composed on GPUs by a model trained on the Beatles' work without permission, which someone then shoves onto Spotify in order to make money and take it away from the royalty pool of other musicians. That is the difference between these two movements, I think.
01:18:46
Speaker
One worry with such a new kind of humanism would be that this set of values would face competitive pressures from other groups, from other companies, from other countries and so on.
01:19:02
Speaker
And so perhaps a humanistic approach is not the most efficient approach, and therefore it will fade away, just because, as you mentioned earlier, consumers will grab what is best and what is easiest.
01:19:17
Speaker
And there will be demand for whatever is most efficient at producing what consumers want. Yeah, I think in general that's true. But I actually think in creativity it's not necessarily true.
01:19:30
Speaker
Creativity isn't all about efficiency. The most widely loved songs and films in the world were not made in a manner where efficiency was the bar.
01:19:46
Speaker
That's not what people are going for. They're going for creating something great.
01:19:50
Speaker
And ultimately, you look at the AI race, and you look at countries worrying about whether China will get AGI before us, whoever us is, and what that means.
01:20:10
Speaker
These politicians who are worried about that, I don't think are really thinking about creativity, right? They're not thinking about the creative industries in all of this. Politicians don't mind that much whether the next AI music generator is built in their country.
01:20:31
Speaker
For instance, the UK government at the moment is just all-out favoring AI companies. It's astonishing. And this is a party that is called Labour, right? It's meant to stand for working people.
01:20:48
Speaker
And it's just basically in the pocket of big tech companies. The reason they're in the pocket of big tech companies, I think, is not because they are desperate for the next AI music company or the next AI image company to be built in the UK. Frankly, I think they'd probably rather it wasn't. They do value human creators, and they certainly don't want human creators' work to be stolen in the way these companies are stealing it.
01:21:11
Speaker
I think their primary concern is AGI, and that's a very different thing. They don't want to be left behind, and that, I think, is driving a lot of politics. So when you speak about creativity, I don't think efficiency really comes into it, or is what is driving decisions, or ultimately driving consumers' decisions either.
01:21:35
Speaker
You're never going to listen to a piece of music because it was made efficiently. You'll have an AI influencer being like, this is wild: 10 songs I made in under 10 seconds.
01:21:45
Speaker
You'd never tell they weren't human. Cool. You'll get like 50 retweets.

AI, National Security, and Cultural Influence

01:21:50
Speaker
But no one's ever going to listen to that song again. No one's going to care. And I think that's good. Yeah, I agree that when politicians worry about competition with China and so on, they're thinking about national security, they're thinking about autonomous weapons, they're thinking about AGI and superintelligence, and perhaps not as much about generative AI.
01:22:12
Speaker
But depending on the scenario we're in, if we're in a bit of a longer-timeline scenario, it might be the case that having control over culture is a kind of soft power in the world, as it has been, I think, during the 20th century, for example.
01:22:32
Speaker
Do you think countries will seek to influence how culture is produced through AI, with the goal of projecting soft power?
01:22:45
Speaker
I don't know. I don't immediately think so. And again, this is actually why I think it's so crazy that people like the UK government are so strongly considering basically upending copyright law to favor AI companies and penalize creators.
01:23:04
Speaker
Because what you're going to do by that is basically increase the amount of slop out there. You're going to eat into real creators' royalty streams.
01:23:18
Speaker
You're going to eat into the royalty pools. You're going to make it harder and harder to build a career as an actual creator. That's basically what you're going to do.
01:23:30
Speaker
You're going to undermine what are very strong industries at the moment. I think soft power through the creative arts will largely continue to come from where it does at the moment, which is from having supremely talented people backed by great industries. And I think that's what countries like the US and the UK have at the moment.
01:23:55
Speaker
And I just think that anything you do to undermine that is probably a very bad idea. So that's kind of where I come out on it. But I don't know.

Protecting Creative Industries

01:24:06
Speaker
As a last question here, what are your priorities, say for the next few years, with Fairly Trained? What do you find the most important to do? Honestly, right now, I think the creative industries and the people who make them up, the creators, the rights holders, this is a group I care a lot about.
01:24:28
Speaker
I'm a member of it myself, and my entire career has been working with this in mind. And I think what they face right now is an existential threat to their industries, to people's ability to make money from being creative, and therefore to the art that we all consume.
01:24:51
Speaker
I think this is the biggest threat these industries have faced, certainly in living memory, and probably going back a long way before then.
01:25:02
Speaker
Trying to shift the Overton window, trying to move the needle towards any kind of outcome that is fairer than the current circumstances for creators, I think is really important.
01:25:14
Speaker
And there's a bunch of strategic questions about how you do that. We're trying to do it with Fairly Trained by showing the kinds of models that are possible without theft.
01:25:26
Speaker
There are lots of other ways you can do this as well. There are great companies building marketplaces of training data. There are people building public domain datasets, people doing all this stuff to make it easier to train models that aren't based on theft, right?
01:25:44
Speaker
There are people working on trying to disentangle, or shine a light on, the black box of these models, such that you can start to understand, more or less, which training data has gone into a particular output. There are people doing all these kinds of areas of research, and I think all of this is really important.
01:26:07
Speaker
But in general, if AI companies win, God forbid, all of the lawsuits in this space, or it becomes settled law that you're just allowed to take people's work to build AI models that will compete with them,
01:26:28
Speaker
I think that is a terrible world for creators. And there are lots of people, myself included, basically just 100% focused on how we try to gently guide the path in a slightly different direction.
01:26:45
Speaker
I don't think we're going to stop AI development, and I don't think we necessarily should stop all AI development. There are people who say ban AI from the creative industries. I think that's unrealistic.
01:26:57
Speaker
And I also don't think that's the right approach. AI is a very broad field. But I certainly think we don't want to end up where the big tech lobbyists at the moment want us to end up.
01:27:10
Speaker
Yeah. Ed, thanks for chatting with me. Good to chat. Cheers.