
Ed Newton-Rex joins me to discuss the issue of AI models trained on copyrighted data, and how we might develop fairer approaches that respect human creators. We talk about AI-generated music, Ed's decision to resign from Stability AI, and the industry's attitude toward fair use and licensing.
A
It said, we think that training on people's copyrighted work without a license is fair use. And that just goes against everything I stand for. For a very long time, no one trained a commercial model on copyrighted work without a license; everyone knew it was illegal. What this whole copyright fight has shown me, maybe more than anything, is that a lot of the people at the forefront of building this stuff honestly seem willing to trample on people's rights in the pursuit of personal gain and profit. Trying to shift the Overton window, trying to move the needle towards any kind of outcome that is fairer than the current circumstances for creators, I think is really important.
B
Welcome to the Future of Life Institute podcast. My name is Gus Docker and I'm here with Ed Newton-Rex. Ed, welcome to the podcast.
A
Hey, great to be here.
B
Fantastic. Could you tell us a little bit about your background?
A
I am a composer, a classical composer, and I've worked in AI for a long time. When I left university I started an AI startup, what we'd now call a generative AI startup. But this was in 2010, so we didn't call it that back then; we used to call these things creative AI startups. It was a music generation startup, long before the invention of the transformer. I started off by hand-coding rules to compose music, and eventually we replaced that with recurrent neural networks and built it out, but it took eight or nine years. I'd say we were probably about 12 years too early to the generative AI trend. We ended up selling the company to ByteDance, the owner of TikTok. I went there and took on a product role working on the For You feed, which was very interesting, a totally new kind of thing. I ended up going via Snapchat as well and landing at a company called Stability AI, a big AI company in the UK, running the audio generation team there. So basically doing what we'd done at Jukedeck, but 12 years later.
B
How had the tech improved in those 12 years? So how was it different working at Stability?
A
It improved leaps and bounds. When we were doing this in 2010, we were really making things up as we went along. Generative music had been around for a while. I don't think there had been any startups in the space, but people had been working on it academically since the 1950s. What you had were things like rule-based systems, classical AI, Markov chains, hidden Markov models, those kinds of things, and the tech was really rudimentary. We were composing music note by note in a symbolic fashion: the AI created the notes and the chords, and then an automated production system we built, basically the back end of a digital audio workstation, turned that into audio. So the actual AI element was symbolic; it was creating notes and chords on a page. And that's a totally different approach from the cutting-edge models today, like the ones at Stability when I joined, which generate raw audio samples. That means you can get a lot more variety in the outputs. They're much, much more powerful systems now, and that's just in the nature of these models getting bigger, and of new innovations like the transformer and diffusion coming along. So it's a totally different world. But what's interesting is that the product visions aren't actually that dissimilar. What people are doing now with AI music is really the same kind of product we were trying to design back in 2010. It's just that the tech has got a lot better.
B
And for listeners who haven't heard AI-generated music, I can say that it's gotten incredibly good. I'm not a musical person, but I have been fooled several times, listening to an AI-generated song and thinking that it was actually produced by a team of humans. So it's gotten really advanced.
A
Yeah, AI music has come a long way. I think it's reasonable to say that in many instances it's pretty indistinguishable from human-composed music. Lots of people wouldn't know the difference hearing a couple of new songs if they didn't know the artists involved. It's not necessarily true in all styles of music, but it's pretty true. Interestingly, it's still not true in classical music, which makes sense: these models are optimized to create music that's popular now, to create pop music. But pop music, including with vocals and lyrics, you can generate really convincing stuff now, which is already out there in the market competing with human musicians, and I think that's a big problem. So yeah, it's pretty indistinguishable, but classical music, it turns out, is harder. The intricacies of something like a huge work by J.S. Bach, from hundreds of years ago, are not yet doable. So maybe classical music somehow is safe. But for most of the market right now, AI music is absolutely here.
B
Why is it that we still have human musicians then? If it feels like AI generated music is indistinguishable from human produced music, why hasn't this kind of wave rolled over the music market yet?
A
Well, it's still very early, right? Even though there are people like me who've been in this field for 15 years now, the generative AI wave really capturing the public consciousness started in 2022, and AI music entering the public consciousness really started probably right at the end of 2023 and into 2024. So we are very early. The impacts are there, but they're invisible to a lot of people at the moment. For instance, already there are reports coming in from around the world of AI music being used in huge quantities to replace human-composed music in stores and retail outlets. These are often countries that the people thinking about this maybe aren't visiting that much, and it's happening already, a huge amount. So I think we're in the very early stages, and we don't yet have the reporting that I think we will have, reporting that will actually show the extent of what is already going on in mid-2025. That's one reason. There's definitely another reason, which is that a category of human musician will be fine. Everyone's going to be affected to a degree, but to put it very simply, Taylor Swift will emerge from the AI age relatively unscathed. The problem is most people are not Taylor Swift. And so there's this argument from big AI advocates. I'm not someone who is against AI per se at all, but I'm also not what I call an AI booster, someone who just relentlessly tries to elevate any and every AI advance, and there are a lot of those people at the moment. An argument that a lot of these people put forward is that Taylor Swift is going to be fine, and musicians are going to be fine.
There are still going to be pop musicians, and of course there are. People are still going to want to go and see live music; they're still going to want to connect with these musicians. But the issue is that already a lot of the long tail of how musicians make money is being eroded, and that affects the huge majority of musicians, who are not household names. It's a massive problem for them already. It also affects the household names to an extent, in the long tail of how they make money. And it's these hidden areas of the music industry, and the creative industries more generally, which are a massive, massive part of those industries, where you're already seeing the rug being pulled out from under people's feet. And I think that is the issue.
B
What are the reasons that Taylor Swift is going to be fine?
A
We love human connection in the art we consume. That is particularly true in music. It's more true, I think, in music than it is in literature, where we immerse ourselves in the story and a lot of people, for better or worse, don't necessarily think much about who the author is while they're consuming it. When we consume music, when we go to a Taylor Swift gig, it's the human connection. I confess I haven't been to a Taylor Swift gig, but I'm going to see Oasis, who are reforming; I'm going to see them in London in August. No one would be remotely interested in robots playing an Oasis concert. You'll get a few AI artists, I think, who get big, as a sort of circus sideshow. It'll be kind of interesting, but I don't think there's any risk of fake AI musicians taking over the charts per se. But again, what is going to happen, and what is already happening, is that a lot of the musicians who contribute to the songs going into the charts will find themselves outcompeted. I think a great example is songwriters. I've done some songwriting sessions; I mostly write classical music, but I've done some. You all get in a room together, maybe a few songwriters; you might be with the artist, you might not be. You're kind of improvising, you're writing songs. Ultimately, people will write hundreds and hundreds of songs just for one album, so many song ideas and full songs get rejected. And already artists are starting to turn to AI song generators. Not all artists by any means, but I've heard stories of it, because it's just easy and cheap for them to go and get song ideas. And so you get to a stage where there's going to be no audit trail. You won't know it's happening.
It's almost certainly already happening. What you have is songwriters not being fully put out of work, but their work gradually being eroded away. And I think that's where, again, we'll keep filling Wembley, we'll keep filling these stadiums. People want to see pop stars, but most musicians are not pop stars. That's the fundamental issue.
B
Yeah. We should talk about why you decided to resign from Stability AI.
A
So, I was at Stability in 2022 and 2023. I was really excited about building out the audio team and releasing Stable Audio, an AI music generator that we released in, I think, September 2023. It went down very well, and we licensed all of our training data. I've built a whole bunch of AI music systems over the years, but key to all of them has been: if you're using people's work to train these models, you pay them, you figure out a deal that works for them, you ask their permission. And that's what we did for Stable Audio, which I was really proud of. At the time, it was one of the first big, what I'd call contemporary, generative AI models trained on fully licensed data. Unfortunately, the wider company, and frankly the wider industry, showed no signs of following that lead. I like to say that I didn't really resign from Stability so much as I resigned from the wider industry, because Stability were not the only company taking this approach; it was just them I was working for. It was in October, I think, 2023, and I woke up and read an article, I think in The Verge, about all of these AI companies responding to the US Copyright Office, which had just put out a request for comments on AI and copyright. Interestingly, just a few days ago the final part of their report came out, and we should talk about that; I think it's a great report. But this was when they were gathering evidence and asking for submissions, and all of these tech companies made public submissions, and there was a list of them. And I saw in this article, hey, Stability is listed. And I thought, okay, I'll have a look at that.
I was on the leadership team at Stability, and I read it, and basically, on the first page it said something to the effect of: we think that training on people's copyrighted work without a license is fair use. Basically, they think there is an exception that covers it. And that just goes against everything I stand for, everything I had stood for in the audio team, everything I stand for in general. So that was the trigger, this public statement. Stability had been training their image models for a while, and I knew the attitudes of the rest of the company, but honestly I had hoped that by building a licensed model that went down very well, I could show that you could do that. Our audio model was named one of Time's best inventions of 2023; it was, I think, an industry-leading music generation model at the time, and it was licensed. And I still truly believe this: I think most of the reason you don't see leading models trained on licensed data is that people just can't be bothered. They're leaning on the fair use defense. The US Copyright Office basically said this in their report: licensing is hampered because so many people are relying on this fair use defense. Obviously, if you've got a whole industry copying a few big players who rely on this fair use defense, who refuse to license their training data, then licensed models are going to suffer as a result. And that's, I think, what's happened.
B
Would you say the general industry attitude is just that, hey, we can train on copyrighted material and this is covered under fair use, so we don't have to license anything?
A
Yeah, that's absolutely the standard industry attitude right now. It's really interesting, because I've been in the industry longer than probably almost anyone, so I saw it develop throughout the 2010s and into the early 2020s. And for a very long time, no one trained a commercial model on copyrighted work without a license. Everyone knew it was illegal; it was standard common knowledge. It's interesting if you look at Google. A little bit after we publicly launched our AI music company in 2014, Google launched this team called Magenta, I think in 2015. A really cool project, run by some really smart people, basically looking at creative AI, as we called it back then. And they trained these generative AI models. Honestly, when they launched, we were super worried; we were like, oh my God, we've got competition. And we needn't have worried at all, because we were all a decade too early. But they launched these models, they wrote papers and blog posts about them, and in those papers and blog posts they called out: this is our data, here's where we've got it. They built this AI drummer, and they went and commissioned all these pieces of training data. And that's what we did as well; we commissioned training data. Obviously, you compare that to Google's approach to generative AI now, which is very different. So I think what happened, and this is my impression, is that by 2022 you'd had all these research models. People were researching, and there's always a better argument for using copyrighted work unlicensed if you're doing research and nothing commercial. Especially in academia, I think there's a good argument for that.
And so people did this research, and they found these things worked really well. Then two or three companies that are now the most famous companies in the world threw caution to the wind and thought, let's just release this, let's see what happens. And the story from there is that the industry took off and everyone copied them immediately. There was this gold rush. Everyone saw them relying on fair use, and everyone assumed, well, they've raised billions of dollars, they're among the most valuable companies in the world, they're showing what can be done; if they can get away with it, surely we can too. It's rapidly become the standard approach, and it has massive issues for people. Through my work, through Fairly Trained, through just my work in general over the last few years, I know a lot of people who run AI companies that are trying to take a fairer approach: companies that are licensing all their training data, that are really working with creators. Every company says, we're all about creators, we want to democratize creativity, we want to treat them well. With most of these companies, it's garbage. But there are a few who are actually licensing their training data, and that's what creators want, right? They want to be asked permission before their work is used. But those companies are having a really tough time. And one of the reasons is that when they go to raise capital, investors look at their pitch deck, stroke their beards, and say, well, hang on, your expenses are going to be higher than the people who are taking their training data for free. So you're not going to win, so we're not going to invest in you.
And so you have this horrible cycle where it's not just the AI companies who are, in my view, stealing all this work and training on it. You've got a whole industry around it, and that whole industry is desperate for fair use to prevail in these lawsuits, for the AI companies to prevail, because if it doesn't, they're worried the whole thing falls apart. So we've got into a really bad position, unfortunately, and we didn't have to. It didn't have to go this route, which is what's so annoying.
B
And we should say, if your company is trying to produce a licensed model, you're also competing with open source models that are trained on all of the data available on the Internet, and on some data that's not even publicly available. This is not unique to music. This is also images, this is text; we're talking about books and articles, videos, movies, everything. All of the top companies have collected all of the data available and they're now training on it, and they're now producing synthetic data from that data. So they're doing everything they can to gather as much data as possible.
A
Basically, yeah. Agreed. I think the open source thing is interesting, because for a start, lots of these models obviously aren't open in the traditional sense: they're not revealing their training data, and they're not revealing it because they know they'll be immediately sued into oblivion if they do. But open weights models, as I guess we can call them, are interesting as well. In general, open source obviously has benefits; it has led to a lot of great innovation. In this context, though, there's an almost religious adherence to the idea that open must be good. What you have with a lot of open source models is companies or organizations training and releasing a model and arguing, maybe externally, certainly internally, that because it's open sourced, there's less reason to license. Maybe they're not directly commercializing it. But I think this is incredibly misleading, for a couple of reasons. One, the companies that are really invested in building open source are often doing it commercially. They may not be charging for the models, but they're absolutely doing it commercially. They're massive trillion-dollar companies; they're doing it so they can attract the best engineers in the world who want to work on open source, and so they can build out the ecosystem around their products and models. It's 100% a commercial thing, not just some philanthropic exercise, which I think is important. And secondly, truly open models can of course be used for anything. With a truly open model, there is no downstream limitation on how the model can be used.
And that then throws a big spanner in the works for fair use defences. Because, as the Copyright Office made clear just this last weekend when they put out their report on AI training, how a model is ultimately used comes to bear on the question of whether it is a fair use of the data that trained it. You can't just train a model and say, well, look, we're training a model, we're not creating music; other people are creating the music with the model. That doesn't fly. It's not just about what the model is; it's about what the model is used for. And with an open model, you can't put restrictions in. If you try to build in guardrails so it won't output the copyrighted work you trained on, those guardrails will just be removed. A lot of this comes down to potential as well. With fair use, it's not just about what is actually being done; it's about what you are potentially facilitating. So I think a lot of the fair use arguments have a lot of trouble with open models. In general, as regards creators, I am really wary of open models, part of the reason being that open models are irreversible. You can't take them back, whereas a closed model you can turn off. I strongly believe, as the U.S. Copyright Office said, that some AI training is probably fair use and some isn't. So I think we should expect that some rights holders' lawsuits will be successful and some won't. Fair use is determined on a case-by-case basis; you'd expect different cases to go different ways. Therefore you should expect that some AI companies are going to have to turn off their models, to retract them. A closed AI company can do that: it can turn the model off, and then it's not accessible anymore. But once you've got an open model out there, there's nothing you can do to get it back.
You can make use of it illegal, but it's going to be very hard to police. So yeah, while I am a big advocate of open source in some areas, I think there are real issues with open source and copyright.
B
You've launched this organization called Fairly Trained, which is trying to set a new standard for the industry. Could you tell us what you're trying to do here?
A
Yeah. So Fairly Trained really came out of conversations I had immediately after leaving Stability, when my resignation ended up blowing up a bit in the news. I think that's because, while a lot of creators who I really applaud had been flying the flag for creators' rights and trying to shine a light on this issue, most people in the AI world had been pretty silent on it. And so here was someone from the AI world saying, actually, no, this is not legit; we should not be just stealing people's work to make money off it. This is terrible. And I think because of that it got a bit of news coverage. One of the things I found was that journalists were saying to me, that's interesting, I thought AI could only be built by stealing people's work. And I was kind of shocked, but there's no reason they would have known otherwise. The handful of companies doing things legitimately were not that well known. As I say, some of them had been struggling to raise money. It's hard to take the right path. So I thought, we should do something about this. We should highlight the fact that these companies exist, we should try to help them, and we should try to help people understand that this is a viable option. You don't have to use plagiarism machines; you can go and use models that are built fairly. So that's where it came from. The idea we landed on was a very simple certification for AI models that aren't trained on copyrighted work without a license. We have a certification process that these companies go through. I think we've certified 19 to date, across a range of modalities: there's music, there's voice, we've actually got one large language model, and there are other modalities. And that's the purpose.
There are some companies who have said, look, we as a company are only going to use AI models that are fairly trained, or that at least meet this bar. And Fairly Trained is a nonprofit; I don't pay myself, and I'm not in it to make a big success of Fairly Trained. The point is to elevate these companies. So I don't mind whether people take the certification badge as gospel, or whether they do the diligence themselves and make sure these companies hit the same bar. Some companies use our certification mark; some just do their own diligence but try to hit our criteria. These companies are basically saying, if we're going to use AI as a company, we're only going to use fairly trained models, and I think that's really good. At the same time, we're not going to affect the public's feelings on this. The public are always going to use the best and easiest model available to them, and I don't fault people for that at all. Before there were legal ways of streaming music, a whole bunch of people used Napster and the like, right? If it's easy, you're just going to go use it until you know about the really viable alternatives. So I don't think we're going to affect the public consciousness that much, but I think that's okay. What we provide is something that people and companies who care about this can turn to, and we try to gradually change views that way. It's also something that legislators around the world can look to and point to as an example that this is possible. That's one of the things that honestly frustrated me the most a year and a half, two years ago: there was this idea, generally put forward by AI companies, that it was impossible not to do what they're doing. That they have to.
And I hope that what we're doing at Fairly Trained shows that that's not the case.
B
How is it possible to certify the data that goes into training a model? What are you doing at a technical level?
A
Well, like many certification schemes, I guess you could describe it as a self-reported certification scheme. It's not that these companies certify themselves, but it's based on information they provide to us. There are ways you could technically scan datasets, but that would still be based on trust, because you'd have to trust that the company was giving you all their data. There is at the moment no way of taking a model and reverse-engineering a list of all the data it was trained on. So we can't do that; that's off the table. We have to have some trust-based mechanism. So what we do is we have a process where companies submit a bunch of information in response to questions we pose them. They submit lists of their training data, and then we go and check that. Sometimes it's very easy, because there are companies we've certified that aren't, for instance, large language models and don't use a ton of different data. And sometimes it's difficult, and there's a ton we have to go through: we go and look at all the sources, and we drill down and drill down until we have a high level of confidence that this data is clean, essentially. There are other parts of the certification, like having good internal processes to make sure these standards are met going forward, that sort of thing. But the crux of it is: what is your training data? And that's how we do it.
B
Do you think a certification process could ever scale to some of the big players? OpenAI, Google, DeepMind, Anthropic and so on?
A
Yeah, I think so. Fundamentally, the biggest thing stopping that is a lack of transparency. If these companies would just reveal their training data publicly, then of course you could check it. It might take time, depending on the kind of data they've used and where they've got it, but you could automate parts of that relatively easily. There is no issue with the checking side; the issue is the lack of disclosure. And there's a fight going on in various places around this right now. In the UK, where I'm from, the House of Lords has twice proposed to the Government a really simple addition to a piece of legislation, one that basically just says AI companies must disclose the training data they use. That's basically it, and to me it seems like common sense. AI companies argue that it's their secret sauce, but it's not. Maybe what you do to the training data involves some trade secrets: how you augment it, how you filter it, what you ultimately choose to use. But the sources, where you've got it, that's not a secret. For a start, everyone is just getting as much as they can. Secondly, it's trivial: if you run an AI music company, it's absolutely trivial to come up with a list of all the people and all the big companies you could license music from. I know, I've done it; you can do it in an afternoon. There is no secret to where you get data. I don't buy the argument at all that your training data is a trade secret.
I think it's obvious these companies are saying that because they know that if they reveal their training data, they get sued. That's what would happen, and I think a bunch of people would win those lawsuits. So that's why they don't want to reveal their training data. But that's all it would take. And so in the UK, the House of Lords has proposed this a couple of times, and the government has rejected it using arguments that are basically based on procedure. But really, the reason they're rejecting it is clear: they are very close to the big tech companies. They want the big tech companies to keep opening hundred-person offices in London and whatever, boosting job counts a little bit. I wouldn't go so far as to say they've been bought by tech companies, but clearly they place US tech companies' interests over their own creative industries and their own country's creators, and so they're rejecting amendments to bills that would literally just make AI companies fess up to what they're training on. That's all it would do. And you've got to bear in mind as well that training on copyrighted work without a license is straight up illegal in the UK. There's no fair use debate; there's not even a debate around this. You just can't do it in the UK at the moment, which is good law and shows the strength of our copyright system. So yeah, there are big fights going on around this kind of stuff.
B
I agree that if the companies showed what they had been training on, it would be revealed that they had been training on copyrighted data. But I also think there might be special cases in which the companies have paid scientists, say, or researchers a lot of money to produce very valuable training data that's not publicly available, or they might have generated a bunch of synthetic data that's also difficult to produce. Would that fall into the category of more of a trade secret? Would that be the secret sauce that they can't reveal?
A
No, I think there's a difference between actually revealing your training data, as in sharing the actual data in an S3 bucket and letting people go through the actual words you're training on, or whatever it is, and revealing lists of training data. And I think that's a critical difference. We do this with Fairly Trained, right: we don't ask to see all of the data that you've commissioned. What we ask is to know that you've commissioned that data. You don't have to show us all the words that are in it. That's totally fair enough; I can see why that would be secret, and there's no reason people need to know it. But ultimately, that's not the argument being had. The argument is: should we have transparency over lists of training data? This is the key: copyright holders need enough information to know if their work is in the training set. And at the moment they don't remotely have that, because there is just no transparency at all. They have to go and red team the models to try to find out, and it's hard work, you know, and obviously that just stops them exercising their rights. So I totally agree with you. And the same goes, potentially, for synthetic data. Synthetic data is interesting, right? Because synthetic data itself, I think, can be a way of laundering copyright. If you use a model that's trained on a ton of copyrighted books, and then you create a load of synthetic data, and then you train a new model on that synthetic data, in my mind you are infringing copyright just as much, and doing as much harm to the authors, as you would be if you cut out the middle step of the synthetic data training. So at Fairly Trained, when we evaluate synthetic data, you have to meet the same criteria with the whole chain of models that was used to create that data. You can't just wipe your hands of it and say, well, we only used synthetic data, because we ask: where does that synthetic data come from? So I think that should be included. And actually, that's a major issue with a lot of the transparency regulation that has been proposed. I think in general legislators have missed the synthetic data problem. They will say: provide us with a list of the copyrighted works that you've trained on. And I don't think that's nearly enough. We need a list of the training data, because if a bunch of that is synthetic data, attached to it should be an explanation of the models it came from and, similarly, the training data that went into those models. And if you didn't train that model yourself, which you might not have, you at least need to disclose what the model is. Again, the ultimate test should be: can a third party looking at the list you provide reliably go and check it themselves and find out if their work is anywhere in that training stack? Anything short of that, I think, is not good enough. And it's a very simple bar to try to hit.
B
When companies provide a list of the data they've trained on, wouldn't it be fairly easy to simply exclude data that they don't want anyone to know they've trained on? You can't prove a negative; you can only look at the data, or the info they've provided about the data they've trained on. But they might be running another training run using a whole bunch of copyrighted data. How would you deal with that issue?
A
Yeah, I think you deal with that in two ways, ultimately at the society level. If we actually get transparency legislation, then as part of that you ought to have audits; I think audits are key to that kind of legislation. And then this is where red teaming can come in as well, because ultimately if you say you haven't trained on Harry Potter and people can get Harry Potter out of your system, then you're obviously lying. Right? So I think a combination of audits and red teaming can solve that issue pretty well.
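[Editor's note: the red-teaming idea Ed describes, checking whether a model reproduces a copyrighted work verbatim, can be sketched very crudely in code. This is a hypothetical illustration, not anyone's actual audit tool: the function names and the n-gram measure are the editor's own, the model call is omitted, and real memorization audits use far more sophisticated extraction attacks than simple n-gram overlap.]

```python
# Toy sketch of a verbatim-extraction check: compare a model's output
# against a copyrighted reference text and measure how much of it is
# reproduced word-for-word. Illustrative only.

def ngram_set(text: str, n: int) -> set:
    """All word n-grams in the text."""
    words = text.split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def verbatim_overlap(candidate: str, reference: str, n: int = 8) -> float:
    """Fraction of the candidate's n-grams that appear verbatim in the reference."""
    cand = ngram_set(candidate, n)
    if not cand:
        return 0.0
    return len(cand & ngram_set(reference, n)) / len(cand)

# In practice, `candidate` would be a model's continuation of a prompt
# taken from the protected work (model call omitted here), and a high
# overlap score would flag likely memorization of that work.
reference = "it was the best of times, it was the worst of times"
print(verbatim_overlap("it was the worst of times", reference, n=3))  # prints 1.0
```

A real audit would run many such probes across a work, use longer n-grams to avoid false positives on common phrases, and account for paraphrased as well as verbatim reproduction.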
B
On the point of synthetic data, do you think it might be too late to fight this fight? We have open source models that can generate fairly high quality text and images, and potentially music, which can then be used to train other models. So in some sense the cat might be out of the bag, because, as you mentioned earlier, you can't remove these open source models from the world.
A
Yeah, I don't think so. Obviously that's an opinion AI companies love, right? It's too late, don't bother regulating this. I don't think so, for a couple of reasons. One, yes, there are open source models out there, but a bunch of them, in my view, will almost certainly be found to be breaking the law in how they were built. You can't take them back, but you can forbid people from using them. You can make it illegal to disseminate them, illegal to host them, illegal to use them. It's not going to stop all use, but honestly it's going to stop a lot of it. You solve a lot of the problem just by saying, no, you can't host that. Most people don't want to break the law. Some people do, but most don't. So I think that's one reason it's not too late. Another is this current issue of model collapse. Now, I'm actually not really very bought into the idea of model collapse. There's a kind of hope among, I guess, the sort of anti-AI crowd, which I don't really consider myself part of, though a lot of my views on copyright align with theirs, and so I know quite a lot of them. There's this hope that you could never train a model on purely synthetic data because it would lead to model collapse: basically, the data just isn't good enough and the model ends up doing really badly. There are some signs that that's maybe true at the moment, or has been true recently, but I don't see any reason why it would hold as a general rule in the long term. I think it's similar to, frankly, 10 or 12 years ago, when no one believed you if you said AI would one day be able to create art and write music and text in a way that is as convincing as people. No one believed it; they thought you were crazy.
They thought there was no way this could happen. And they were wrong. And I think model collapse is another thing like that. To me it seems like a very temporary limitation. So I do think we'll get to a stage where you can train very highly performant models purely on synthetic data. I think that's likely at some point; there are already signs it's possible. So I don't think we should rely on model collapse as a get-out-of-jail-free card. And it's another reason why I think rapid regulation is important, rapid holding of people to account. So I don't think it's too late, but I do think time is of the essence. I absolutely think that time is of the essence.
B
Yeah, on the model collapse point: we shouldn't rely on the intrinsic features of a technology, crossing our fingers and hoping that it all works out because the models will be limited. I agree with you that we are seeing early signs, in reasoning models for example, that synthetic data can lead to quite impressive results. So I wouldn't hold out hope if your perspective is that model collapse will prevent the models from ever violating copyright in bad ways.
A
Yeah, it's funny: while I have spent the last one and a half years advocating for AI development to take a pause and say, hang on, why are we all building our models based on theft, at the same time, I don't know if I'd call myself a futurist, but I tend to think that technology is going to advance a lot and that it's going to bring a lot of benefits. It'll bring massive risks as well. One of my general opinions about AI is this: we know humans are intelligent; intelligence has already come about once. So to me there's no reason that everything we can do won't one day be physically possible in machines. It seems pretty self-evident that it's not impossible, because it's already been achieved once. That's my starting point, and it's why back in 2010 I was convinced that AI would be able to write music before long. I was a bit off in my timelines; I thought it would happen a little sooner. I also thought music would come before art, and I was wrong about that. Art slightly beat music, for, I think, interesting reasons. But ultimately the tech is going to be able to do this stuff. And we have this problem where we get really hung up on the issues of today, and they're going to be solved, probably from corners we don't expect. It might not be OpenAI or Anthropic who solves these things; it might be some new startup. But they're going to be solved, and they'll be solved in the next few years. And then what? So you can't rest on your laurels and think, well, it's fine because machines can't do this. That is always a bad argument and always a dangerous path to go down, because almost always they will be able to do that thing, probably within a few years' time.
B
It's very important here to notice the pace of change. If you say in 2020 that models can't do some simple task and then you just wait a couple of years, well, then maybe they can. And I expect the same thing to happen over and over again, basically.
A
I agree. It feels like we're still in a cycle of that happening. I remember meetings at Stability; it still feels like yesterday I was there. I remember meetings in 2022 where we were still talking about when the year would be that AI music was going to break out. I said 2023. By then it was very close; it was kind of easy to predict. I got my prediction wrong back in 2010, when I thought it would be done quicker than it was. But even back in 2022, AI music was far, far from solved. It just wasn't really working. And this is true across modalities. It's happened in video; we're currently in the video cycle, I guess. I think it's happening pretty quickly in robotics, where to my mind things are advancing very rapidly, in many ways thanks to training models in simulations and then transferring that over to the real world. I wouldn't be surprised to see it happen in other technologies, like brain-computer interfaces. And I just think it seems to be almost part of the innate human condition that we extrapolate out from where we are now. We find it, I think, often very hard to picture and believe in rapid change as possible. We take today's limitations and we imagine that some version of them will still be there in a decade's time. And I think that's a mistake.
B
This is quite interesting, because a key question here is: how does the debate around copyright and fair use fit into these larger, grander questions about the future role of people and the future of humanity? What's the connection there?
A
Well, I think there are a couple. I think there's a big question around work. I am someone who worries about the downsides of AI. I'm excited about some of the upsides; I'm anxious about some of the downsides. I personally think that one of the biggest risks in the near term from, let's call it general intelligence or superintelligence or hyper-intelligent AI, basically AI systems that are supremely capable, is the potential mass displacement of labor. And I think the creative sector is kind of the canary in the coal mine here. Generative AI in this form has only been around a couple of years, and we are already seeing data that shows creatives being out-competed. There's data on this already from Upwork. There are a bunch of papers showing that freelance writing tasks and freelance graphic design tasks fell, and never recovered, in the wake of some of these models being released. I know people whose income has just totally fallen, and they've been told that it's because the people who previously employed them are now using AI. So I think creatives are kind of the canary in the coal mine here, a little bit. And I worry when I see things like the company that announced its intentions just a month ago, a company called Mechanize. You may have seen it; there was a big kerfuffle around it about a month ago. It's got investment from Jeff Dean at Google, from Nat Friedman, from Dwarkesh Patel, all of these big names in the world of tech or tech-adjacent fields.
B
Right. I interviewed one of the founders of the company recently. Yeah, there we go.
A
Right. Yeah, I confess I haven't heard that; I will listen to it. But the mission is to automate all work. And if you look at the investors behind this, this is not some sort of fringe movement, right? This is not some fringe idea. In Silicon Valley there is a real body of thought that says: given the combination of general intelligence, possibly superintelligence, and general-purpose robotics, there is a real possibility that we can automate, if not all work (and let's be honest, it's probably not all work; it's not going to be politicians, it's not going to be priests, it's not going to be sportspeople), then a huge amount of it. And God, we're going to try, because it's the kind of Marc Andreessen "software is eating the world" idea. It's ultimately about making money, about trying to replace different sectors. And when I say Silicon Valley, I don't literally mean the place; I happen to be here. What I mean is this kind of philosophy, this idea in the tech industry. There is this idea that, hang on, maybe this is our chance. All of these holdout industries where we haven't really been able to make inroads: maybe this is how we get into them, maybe this is how we take them over. And so that, I think, is a big thing, and it's something I'm worried about. Now, there are arguments about whether we'll get there with the current paradigm of large language models, of reasoning models, of robotics, of where we are now. And I think that's a sensible debate to have. I'm not sure whether we will; I wouldn't say we necessarily will. But I'm concerned about two things, irrespective of that.
One is that there is such a huge amount of investment being poured into automation that I think it's very likely we get new innovations coming along, such that even if large language models don't end up being the route to AGI, something else might well be in the near term. I don't think we should rule it out; I think we should consider it a significant possibility. And the second thing is just the desire to do it at all: the fact that it is people's aim to go and automate all work. And I understand the position that says, well, fully automated luxury communism or whatever; we all just relax the whole time and basically have early retirement. But I think that's a very naive view, given how political establishments work and given how obviously unprepared the world, the US, any major economy, is for the rapid displacement of labor. So that's a big concern for me.
B
Yeah, perhaps a glib question here is to ask: why is this a bad thing? If you think of a peasant 300 years ago, maybe there would have been some worries about automating agriculture to a large extent, but historically this has turned out great. We've been able to massively increase productivity and living standards and so on. Why isn't this just the next turn of that wheel? In a sense, why should we be pessimistic here?
A
Well, I'm not saying we should only be pessimistic, by the way. I think we need to be very alert to this possibility and we need to be acting accordingly; that doesn't mean we have to be pessimistic about it. One difference, which I think is pretty obvious, is the nature of the technology we're now building, which is much more general, where the intention is to be general purpose. And this makes it very, very different to many of the revolutions of the past. Again, simply the fact that you have some of the biggest names in the tech industry investing in a company whose mission is to automate all work shows that there is a different idea at play here than there has been previously. So I think we should take that seriously. Then the question is: okay, if you assume we get there, why is that an issue? Wouldn't it be great to all relax? And one of my concerns, and this comes back to the copyright question, is that what this whole copyright fight has shown me, maybe more than anything, is that a lot of the people at the forefront of building this stuff honestly seem willing to trample on people's rights in the pursuit of personal gain and profit. That's basically what's happening here, in my mind. They see an opportunity for vast wealth, vast riches, and they look at copyright and they think, well, that's getting in the way. The people whose livelihoods depend on copyright, the people they're putting out of work: not really an issue for them. And that worries me, because if at this stage the people building this technology aren't going to respect people's rights, aren't going to take it seriously when a whole chorus from an entire industry turns around to them and says, what are you doing, why would it be any different later? Why would they have any more respect for people? They're not paying copyright holders. Why will they ultimately pay anyone?
How is this money going to be distributed to other people if it's not being distributed at the moment? I don't see the political will to take seriously this idea that there might need to be that kind of mass redistribution. Demonstrably, the AI companies themselves don't have that will. So again, I'm not saying we should think we're all doomed, and I'm not saying we should be purely pessimistic about this. But when highly funded, highly motivated, very smart people say they're going to try to automate all labor, I think we should take them seriously.
B
What do you think the future of culture looks like? What will be the long-term effects of having these technologies that can basically create replicas or copies of different styles? And when the price of generating text, imagery, videos, and audio drops massively, what happens to culture?
A
I think one of the most immediate effects is that a lot more people find it a lot harder to get into the creative industries, because a lot of the long-tail jobs that would have supported them through their early years will go. That might be writing ad jingles if you're a musician, it might be writing production music, it might be doing some sort of copywriting if you're a writer. All these jobs are already on a downward trajectory, largely thanks to AI. And I think that's a major issue, because there is going to be a big blow to the creative industries, and that will have knock-on effects. So I think that's going to affect culture. I think we will also see, and there's a question as to how popular this will be, a further rise of remix culture. At the moment you are not allowed, for the most part, to replicate people's voices or to take a copyrighted song and rework it into something else without permission, and it can be hard to get that permission. But I'm sure media companies, rights holders, and creators are open to licensing their voices, their likenesses, their music, their works, under the right conditions. And as soon as licensing gets done at scale, you will have the ability, as a consumer, to remix. I think the barrier between being a consumer and a creator will start to weaken, though not fully disappear. So you'll be able to say, if you want: that Taylor Swift song's cool, but can I hear Noel Gallagher singing it? That'll be a possibility, of course it will. As long as the licensing and the permissions are in place, I think that's all going to be possible. I mean, there's an open question as to how popular that kind of thing will be.
And I think a lot of people in tech assume that it's the future. While I think it will be a part of the future, I actually have quite high faith that concrete works, we can call them, so recorded music, or a book in its final form, or whatever it is, will for a long time remain the norm in creative culture. Now, some of that will be generated in the first place by AI. But I think that ultimately we are very attached to the idea of these concrete works that we can all share, that we can all talk about, that can be part of the public consciousness, in a way that hyper-personalized content won't be as exciting to people. So I think that's a big reason it will stick around.
B
You don't think that consumers would be interested in, for example, talking to an AI Taylor Swift and then having a video call with her, where she's playing music for you and you say to her, I want to hear a different song, I want to hear more of this, less of that? Or you could imagine a book that expands in the sections you're interested in, so it changes as you're reading it. But of course, then you don't have, as you mentioned, the cultural common knowledge of what's in a work. Would that be the barrier to creating this more interactive form of entertainment?
A
Yeah, I think there are two things at play there. There's the ability to interact with an avatar of a creator, and there's the ability to have personalized content, in the media form that creator is known for, written for you. And I suspect that the former will be exciting and the latter will be less so. I don't know whether you'll be speaking to your Taylor Swift avatar or whatever, but in general I'm a big believer in the future of the voice and language interface, in just being able to talk to your computer, to digital devices, to avatars. I think that's clearly the way things are going. So I can absolutely see music fans wanting to have some kind of virtual conversation with their favorite artists. But my bet would be that when they then say, can you now play this, the "this" will be one of the artist's existing songs. If you look at some of the AI music generators, one of the ways they advertise themselves is: write a song about anything, your trip to the coffee shop or your mum's birthday or whatever it is. And in general, my impression so far is that it's an absolute moment of magic the first time you hear it, you cannot believe it's possible, and then you never use it again. My bet is the retention figures for that kind of usage are atrocious, because people just don't really want that use case. There are use cases for AI music, 100 percent. But personalized music in that way, with that kind of "write me a song about X", I don't see becoming a very big part of music culture. I think that the song as an entity, as a thing that's set in stone over time, will remain a really, really important part of music culture.
And I think that extends to all the arts, each with their own forms set in stone.
B
Do you think AI could change the winner-take-all dynamics we see in music, where basically the most famous and influential musicians get most of the plays and most of the views and so on? If you had the ability to create your own music, maybe you would see more of a broad market.
A
I don't think so, frankly, because there's already a broad market. And this is one of the things about generative AI: it lets you create music from scratch, but when we say it democratizes creativity, that's only true to an extent. Creativity is already pretty democratized. It's not perfect by any means; it helps if you have rich parents and go to a school that lets you study it, and there are all these things that really help. But ultimately a lot of people can write music, can learn how to write music, can produce music, and a lot of people do. The amount of music being released every day before generative AI came along was absolutely astonishing. So there is no shortage of music, no shortage of options. And despite this huge abundance of music, you still have a few people rising to the top. Why is that? One, to an extent it's inevitable in the kind of culture we have: people get popular. And two, connected to that, a lot of it doesn't come from how this stuff is made; it comes from recommendation systems. A bunch of Spotify researchers wrote a paper in 2020 where they looked at one month, July 2019, and listener data from around 100 million Spotify users. They found, and this is predictable, but they showed it with the data, that when people listen to recommended songs, recommended playlists, that kind of thing, the diversity of the music they listen to is far, far smaller than when they take what the researchers call user-directed action. When people just go and search for music themselves, they find a load of cool stuff; it's really diverse.
When you go down these recommendation paths, everything becomes homogenized and you end up listening to the same thing over and over again. So I think that's already a trend we're on, through recommendation systems on YouTube and TikTok and Spotify. This is already a path we're on. And so I don't think that AI letting more people, we can say create, but really letting more people generate music from scratch, does anything to affect those winner-take-all dynamics.
B
Do you think there's a general effect of culture becoming more homogenized over time, where perhaps in the future you will see most cultural products having this kind of AI style, influenced by, say, how culture works in Silicon Valley and the values that are put into the models from there?
A
I don't know. I mean, the flip side of recommendation systems like TikTok's, and I worked at TikTok, my job was to work on the TikTok recommendation algorithm, is that different people end up having very different feeds. Now, that can turn into filter bubbles, which isn't necessarily great, and you can go down some bad paths. But ultimately these recommendation algorithms, especially when you've got short-form content and you're using them a lot, which most people do, very quickly learn the kind of thing you like, and they show different things to you to try to work out what that is. That's why you have different pockets of TikTok, right? That's why you have all of these different styles emerging. So I think recommendation systems, if they're constructed right, can actually take you down these good paths of discovery, and in that sense culture doesn't become too homogenized. Now, maybe those will always be fringe, niche communities, but one of the exciting things about where we are, I think, is that a lot of it comes down to the user. If you as a consumer want to go and find some interesting stuff to listen to, to watch, to read, it's never been easier. That's what's exciting. So I think we're in this interesting time where you've got these two extremes. On the one hand, if you just can't be bothered as a consumer, everything will be homogenized and you'll just hear the lowest-common-denominator stuff. A lot of that will end up being AI slop; it'll be awful. But if you can be bothered, you can go and find really great stuff, and the tools are there at your disposal to do that. And the extension of all of this is that the backlash to generative AI is just massive. I think people in tech still underestimate this. They underestimate the huge strength of feeling against generative AI.
And this partly comes from the fact that artists' work is being stolen to build it, it partly comes from the fact that these companies are out-competing artists, and it partly comes from the fact that people consider generative AI to be dumbing down their professions and art in general. It's a whole host of reasons, but it's really strong. People in tech like to point to the introduction of the camera or the synthesizer and say, look, people rejected recorded music when it came out; they thought it would be the death of music, and it wasn't, and people got over it. And they're looking at AI and saying the same thing. I think that's a mistake, because the strength of feeling is much, much greater with AI. And I think what that's going to lead to is what I think of as a new kind of humanist movement in the arts, which I suspect will entail a rejection not just of AI but probably, as a result, of some other technologies as well. I wouldn't be surprised to see in music, for instance, a humanist movement emerging that is maybe more acoustic in nature, that favors less production, that favors live music, acoustic instruments, things a machine couldn't do, being there in person with someone. I think that could be a strong artistic movement, and I'd be surprised if it doesn't strengthen over the coming years.
B
What people are searching for is perhaps a sense of authenticity in what they're consuming: that what they're enjoying is something coming directly from another person, and not something overly produced and perhaps AI-enhanced.
A
I think so. I mean, look, at the moment you wouldn't necessarily get that impression from the press, or if you're on social media, just because the AI chorus is so loud and so hard to avoid. There is just so much money flowing around the AI ecosystem right now that it benefits people to become basically AI influencers who will constantly share content, who will say, look, this is changing the world, Hollywood is dead, everything will be generative in two years' time. Mostly people do this because they'll get more followers that way, they'll make money that way. It's all a money-making game. But ultimately there's a huge rejection of many AI companies' practices among musicians. For instance, look at what we've been doing in the UK: we organized a protest album called Is This What We Want?, and a thousand British musicians co-created, co-sponsored this silent album in protest at the government's plans to give their music away to AI companies for free. And it included absolutely huge artists, right? Kate Bush, Max Richter, all of these people. And the same is happening across all the arts. You're getting some of the biggest creatives in the world coming out pretty strongly against this. Now, it's not necessarily against all generative AI, but it's against some of the very common practices that AI companies are utilizing. And I think inevitably what that turns into is a movement towards the authentic, towards the natural, and ultimately towards the human. And I suspect that will get bigger and bigger.
B
What would be the principles of this new type of humanism? What is it that such a movement would be trying to promote and trying to reject?
A
I mean, I think fundamentally, at its core, any kind of new humanist movement would essentially be about putting humans first. And that's obviously very high level and vague. But I think initially it'll be more about a rejection of a few specific practices than a specific creed or set of guidelines. Right. And I should say I'm not sure this movement exists yet; I just see it as a broad direction of travel. I think there are things that would clearly not align with a new humanist movement in the arts, let's say. One of those is obviously taking people's work and training models on it without their permission, when they're all, en masse, telling you they consider that theft. That's not humanist. Building models that are designed to out-compete humans in creative spaces, for instance, would also not fall into this category, I think. Now, I think it would remain pretty vague and pretty all-encompassing, but I just think there'll be certain aspects of the current technological paradigm that will not be accepted by this group. And as I say, I think what you'll see is real-time interaction between people, a rejection of some of the most modern technologies, and a reversion to traditional practices. And still, look, the best songs ever written, you know, they were recorded on modern technology. I don't think any new humanism would totally reject modern technology. But ultimately, take Yesterday by the Beatles: you can just play it on a guitar, you can sit in a room with a guitar and play it to one other person. And I think that is humanism, right? It's that, versus a song that was composed on GPUs by a model trained on the Beatles' work without permission, and that someone then shoves onto Spotify in order to make money and take it away from the royalty pool of other musicians. That is the difference between these two movements.
B
I think one worry with such a new kind of humanism is that this set of values would face competitive pressures from other groups, other companies, other countries and so on. So perhaps a humanistic approach is not the most efficient approach, and it will therefore fade away simply because, as you mentioned earlier, consumers will grab what is best and easiest, and there will be demand for whatever most efficiently produces what it is that consumers want.
A
Yeah, I think in general that's true, but in creativity I actually think it's not necessarily true. Creativity isn't all about efficiency. The most widely loved songs and films in the world were not made in a manner where efficiency was the bar. That's not what people are going for; they're going for creating something great. And again, you look at the AI race, you look at countries worrying about whether China will get AGI before us, whoever "us" is, and what that means. The politicians who are worried about that, I don't think, are really thinking about creativity. They're not thinking about the creative industries in all of this. Politicians don't mind that much whether the next AI music generator is built in their country. Right. Take the UK at the moment: the UK government is just all-out favoring AI companies. I mean, it's astonishing. And this is a party that is meant to stand for, it's called Labour, right, it's meant to stand for working people, and it's basically in the pocket of big tech companies. The reason they're in the pocket of big tech companies, I think, is not because they're desperate for the next AI music company or the next AI image company to be built in the UK. Frankly, I think they'd probably rather it wasn't, right? They do value human creators. They certainly don't want human creators' work to be stolen in the way these companies are stealing it. I think their primary concern is AGI, and that's a very different thing. They don't want to be left behind, and that, I think, is driving a lot of politics. So when you speak about creativity, I don't think efficiency really comes into it, or is what is driving decisions, or ultimately driving consumers' decisions either.
Like, you're never going to listen to a piece of music because it was made efficiently. You'll have an AI influencer being like, this is wild, 10 songs I made in under 10 seconds, you'd never tell they weren't human. Cool. You'll get like 50 retweets. But no one's ever going to listen to that song again. No one's going to care. And I think that's good.
B
Yeah, I agree that when politicians worry about competition with China and so on, they're thinking about national security, they're thinking about autonomous weapons, they're thinking about AGI and superintelligence, and perhaps not as much about generative AI. But depending on the scenario we're in, if we're in a bit of a longer-timeline scenario, it might be the case that having control over culture is a kind of soft power in the world, as it has been, I think, during the 20th century, for example. Do you think countries will seek to influence how culture is produced through AI with the goal of projecting soft power?
A
I don't know. I mean, I don't immediately think so. This is actually why I think it's so crazy that people like the UK government are so strongly considering basically upending copyright law to favor AI companies and penalize creators. Because what you're going to do by that is basically increase the amount of slop out there. You're going to eat into real creators' royalty streams, you're going to eat into the royalty pools, you're going to make it harder and harder to build a career as an actual creator. That's basically what you're going to do. So you're going to undermine what are very strong industries at the moment. Soft power through the creative arts, I suspect, will largely continue to come from where it does at the moment, which is from having supremely talented people backed by great industries. And I think that's what countries like the US and the UK have at the moment. Anything you do to undermine that is probably a very bad idea. So that's kind of where I come out on it. But I don't know.
B
As a last question here, what are your priorities, say, for the next few years with Fairly Trained? What do you find the most important to do?
A
Honestly, right now, I think the creative industries and the people who make them up, the creators, the rights holders, this is a group I care a lot about. I'm a member of it myself, and my entire career has been spent working with this group in mind. And I think what they face right now is an existential threat to their industries, to people's ability to make money from being creative, and therefore to the art that we all consume. I think this is the biggest threat that these industries have faced, certainly in living memory, and probably going back a long way before then. Trying to shift the Overton window, I guess, trying to move the needle towards any kind of outcome that is fairer than the current circumstances for creators, I think is really important. And there are a bunch of strategic questions about how you do that. We're trying to do it with Fairly Trained, by showing the kinds of models that are possible without theft. There are lots of other ways you can do this as well. There are great companies building marketplaces of training data. There are people building public-domain datasets. There are people doing all this stuff to make it easier to train models that aren't based on theft. There are people trying to shine a light on the black box of these models, such that you can start to understand, more or less, which training data has gone into a particular output. There are people working on all these kinds of areas of research. I think all of this is really important. But in general, if AI companies win, God forbid, all of the lawsuits in this space, or it becomes settled law that you're just allowed to take people's work to build AI models that will compete with them.
You know, I think that is a terrible world for creators. And there are lots of people, myself included, basically 100% focused on how we try to gently guide the path in a slightly different direction. I don't think we're going to stop AI development, and I don't think we necessarily should stop all AI development. There are people who say, ban AI from the creative industries. I think that's unrealistic, and I also don't think it's the right approach; AI is a very broad field. But I certainly think we don't want to end up where the big tech lobbyists at the moment want us to end up.
B
Yeah, Ed, thanks for chatting with me.
A
Good to chat. Cheers.
Podcast: Future of Life Institute Podcast
Episode: Will AI Companies Respect Creators' Rights? (with Ed Newton-Rex)
Date: June 20, 2025
Host: Gus Ducker (B)
Guest: Ed Newton-Rex (A)
This episode features Ed Newton-Rex, a classical composer and long-time AI entrepreneur, discussing one of the most heated debates in AI: whether companies building AI systems are, or should be, respecting creators' copyrights. Having resigned from Stability AI over copyright issues, Ed provides an insider’s perspective on how industry norms have evolved and why creator rights are under threat. The episode covers Ed’s background, the evolution of AI-generated music, the legal and ethical landscape, his new initiative (Fairly Trained), and the broader implications for the future of creative industries and culture.
On the standard industry approach:
"Yeah, that's absolutely the standard industry attitude right now. ... No one trained on copyrighted work without a license for a very long time. ... Then two or three companies ... thought, let’s just release this, let’s see what happened. ... Everyone copied them." – Ed [16:33]
On open-source AI models:
"Truly open models ... there is no downstream limitation to how the model can be used. ... open models are irreversible. You can't take [them] back." – Ed [21:44]
On creators’ existential threat:
"The creative industries ... face right now is like an existential threat to their industries, to people's ability to make money from being creative and therefore to the art that we all consume." – Ed [84:18]
On humanism and the reaction to AI art:
"I think what that's going to lead to is ... a new kind of humanist movement in the arts ... a movement towards the authentic and towards the natural and ultimately towards the human." – Ed [69:49–75:48]
Ed is passionate, principled, and pragmatic. He’s deeply concerned about the direction of AI and creative rights, but maintains optimism that advocacy, legal reform, and ethical entrepreneurship can help shape a better future. The conversation is frank about industry pressures, yet appreciates both technological potential and cultural risks.
For more in-depth resources on policy, advocacy, or the Fairly Trained certification, see Ed’s organization or related literature on creators’ rights and AI transparency.