Joseph
Welcome to the 404 Media podcast, where we bring you unparalleled access to hidden worlds, both online and IRL. 404 Media is a journalist-founded company and needs your support. To subscribe, go to 404media.co. Subscribers get bonus content every single week, as well as access to additional episodes where we respond to their best comments. Gain access to that content at 404media.co. I'm your host, Joseph, and with me are 404 Media co-founders Sam Cole.
Sam Cole
Hello.
Joseph
Emmanuel Mayberg.
Emmanuel Mayberg
Hello.
Joseph
And Jason Kebler.
Jason Kebler
Hi.
Joseph
So we're doing something unprecedented for 404 Media next week. We are taking the week off. We have not done that since we launched in around August 2023. Obviously we've each taken days off here and there, but we've never taken extended time off as a company. So we'll be off for the week. I notice other podcasts do this as well and they'll do a rerun or something like that. There should be a podcast up next week anyway. It will be probably an interview that Jason recorded. We're still working out the specifics and we actually want to do more of those interviews as well, so keep an eye out for that. But you won't be getting the normal show next week. We're getting something a little bit different. Let's go straight into this week's stories though. This is one Emmanuel wrote. FuckLAPD.com lets anyone use facial recognition to instantly identify cops. I guess just first of all, Emmanuel, how did you first come across this? Because I came across it as well and I'll tell you how. But how did you first see this?
Emmanuel Mayberg
I think, oh, I didn't know that you actually saw it also. So I'm interested. I found out about it via a subreddit that is dedicated to monitoring, like, very aggressive ICE crackdowns and ICE raids. And they were just sharing it as a possibly useful tool. Where did you see it?
Joseph
So I saw it on TikTok where a person was using this tool, and we'll explain a little bit more about what it is and sort of everything about it in a second. But basically it's as the headline says, a facial recognition tool to identify cops. I saw somebody on TikTok using it: goes up to a police officer, points their iPhone or whatever camera into the face, brings up the cop's name, reads out his salary to the cop, and then the cop starts laughing a little bit. I was going to try and find it again, but I couldn't find the video.
Emmanuel Mayberg
How much did he make, do you remember?
Joseph
It was about 100k. And yeah, you can see him sniggering. It's very, very funny.
Jason Kebler
But can I tell you when I saw it? Because I either hallucinated this or I saw something like this like a year ago, to be totally honest with you.
Emmanuel Mayberg
We talked about this. Yeah.
Joseph
That's why when I saw it at the weekend, I was like, haha, that's funny, they're still using that tool and I didn't actually realize it was new. So sorry, I don't know what it was.
Jason Kebler
I don't think we wrote about it. I don't remember where it was. This is not like terribly interesting. I wasn't here yesterday though. But it's like I either dreamed about this or I saw a site that looked really similar to this.
Sam Cole
Like there is another site.
Emmanuel Mayberg
So let me explain what this is and then we can talk about why we all thought it already existed. Which is the first thing I asked when I pitched it in Slack. I was like, isn't this old? Anyway, FuckLAPD.com is a site with a very simple interface. All you can do is upload a single image of an LAPD officer's face and it will use facial recognition tech and a database of, from what I can tell, most of the LAPD workforce to automatically identify which police officer it is, pull up their name, badge number and then also, yes, their salary. All of which are public records, which is why this data is available. And the reason this is something that has been launched now: according to the guy who made it, Kyle McDonald, he launched the site on Saturday. And this was in response to the anti-ICE protests in LA, which got very heated, and there were many instances of police violence against protesters. As everybody I think knows by now, the National Guard was called in, and there was a huge confrontation between the governor, Gavin Newsom, and Trump. And yeah, in response to LAPD playing fast and loose with protesters, he just created a tool that would allow people, hopefully, to identify cops. I haven't seen this happen in the LA protests recently, maybe Jason knows more, but I definitely remember that during the George Floyd protests in New York, there were instances of cops who just, like, covered their badge number with a piece of tape.
Joseph
Very big during blm.
Emmanuel Mayberg
Yeah, right. Which they're supposed to keep visible for the purpose of allowing the population to, like, identify who is enforcing the law and who is maybe using too much force. And this would allow you to get around that trick that some cops use.
Joseph
Yeah, I haven't seen them doing it much in the anti-ICE ones, but, you know, there have been clips of horseback LAPD officers beating up people when they don't think the cameras are looking and all of that sort of thing. Plenty of examples. So you spoke to Kyle MacDonald, who is an artist who made this site, and they've done other work, which I think we'll touch on in a minute. You spoke to him. Just a little bit more explicitly, why did he make this? Is it for transparency? Is it for accountability? What did he tell you when you asked him, hey, why did you actually make this?
Emmanuel Mayberg
Yeah, so first of all, it's Kyle McDonald. He has a website called KyleMcDonald.net. I really recommend people check it out. He's been working for more than a decade now doing this kind of work, doing exactly the kind of work that we used to cover all the time at Motherboard, and I have covered a few times here at 404. Just subverting technology, using it in unexpected ways to make art, and often to make pretty explicit political statements. In terms of this project, let me try and stumble through his quote here because I think it's pretty good and direct. He told me, we deserve to know who is shooting us in the face, even when they have their badge covered up. FuckLAPD.com is a response to the violence of the LAPD during the recent protest against the horrific ICE raids and, more broadly, the failure of the LAPD to accomplish anything useful with over $2 billion in funding each year. So I think that that is a pretty good summation of why he's doing this.
Joseph
Yeah, that makes sense.
Jason Kebler
Okay, so now I remember where it came from or, like, why I thought that this existed previously. Back in 2023, there was a website called Watch the Watchers. And it was a database of officers' headshots, names, hire dates, ranks and ethnicities that was compiled using public records requests. And the website looked sort of similar, and you could search by badge number, and it was billed as counter-surveillance. Emmanuel, do you know if this is related?
Emmanuel Mayberg
It is, yeah. So Kyle really wanted to stress this. The project you're talking about, Watch the Watchers, is part of something called the Stop LAPD Spying Coalition, which is based in Skid Row in LA, which is a neighborhood with a lot of homeless people and a lot of, like, people really having drug problems, mental health problems, kind of a notorious part of the city. So they created this database. They're not affiliated with FuckLAPD.com, but that is where he pulled the data. He is just using facial recognition to match the headshots that are in that database, which then pulls up the information about the individual officers.
Joseph
Yeah. Rather than searching a badge number, you search a face, which, practically speaking, is probably going to be a lot more useful to people in the protest. Well, first of all, because if a badge number is covered, obviously they're not going to be able to get it. But also, just in the sort of hecticness of the situation of a protest, it might be easier just to point a camera at somebody and then get their photo and then do it that way. So Watch the Watchers gets this data through public records requests, which I haven't seen the actual language of that, but presumably it's like, give us all the information of your staff.
Emmanuel Mayberg
And I think they had to sue for some of that. It wasn't, like, easy. I mean, not to disrespect Kyle's work here. I think it's very good and interesting. But 404 Media is huge fans of FOIA. Like, really, you have to admire the work of just, like, getting this data out.
Joseph
Oh, yeah, yeah, for sure. And it looks like it works because you tested it a few times. Walk us through that. What did you do and what were the results?
Emmanuel Mayberg
Yeah, so I wanted to kind of give it an easy test case. So I just took a still from an LAPD press conference responding to accusations of police violence during the protest, actually, and I just screenshotted one of the faces of the cops that was, you know, a front shot of his face, very clear. And I uploaded that to the site. Within a few seconds, it pulled up nine results. And each result is, like, a headshot, a name and a link to the Watch the Watchers database page for that person. The officer was, like, a bald white man. And it showed me nine officers that vaguely match that description, but the first one was the correct one. So it worked perfectly in this case. The site, the FuckLAPD site, is clear that it has limitations, one of which is, you know, if you're pulling a TikTok video of some incident involving an LAPD officer and it's far away and it's blurry, it's not going to work. And then obviously, when it comes to ICE, that's different. ICE officers would not show up in this database. And if the officer has their face covered, obviously that wouldn't work either.
Joseph
Yeah, yeah. And I mean, just because I didn't really spell it out. But the reason that I thought, oh, this has already been done, is because it had been done in other cities. We saw an example I think Sam alluded to there was the New York one and then there was Portland as well. And again, this was mostly during blm, that sort of thing. So I see facial recognition tool use on police. I'm like, oh, okay, they're still doing that. And no, this is an interesting marriage between facial recognition tech and public records. And I think it's particularly interesting in that it's all done locally on the device, which you couldn't have necessarily done even a couple of years ago. Would that be fair, Emmanuel?
Emmanuel Mayberg
I think so. Kyle, who made this website, made a similar tool for ICE called ICE Spy back in 2018. And that tool, I believe, doesn't work so well anymore, because, like, the data isn't as good, right? It doesn't have this Watch the Watchers database to work with. But I don't believe that one runs locally.
Joseph
Well, when it launched, the ICE Spy one, which is to basically do the same thing, but to identify ICE employees. And that's not just immigration enforcement. It could be a lawyer in Texas or something like that. But ICE, broadly. There's two things. It used Microsoft's facial recognition system and API, and it was just using that to process the actual identification of faces. Microsoft sunset that after the project was released, again in 2018. But now Kyle McDonald has brought it back because the processing can now be done locally on the device. So you're like, oh, okay, they're going to relaunch a tool or whatever. Obviously that's especially notable, a tool that would identify ICE when there's mass deportations happening. And of course, ICE officials are repeatedly covering their faces. And maybe I can talk about that in a second. But there is a second component, which is that the ICE Spy database is built on a scraped set of profile images and data from LinkedIn. So what somebody had done back in the day was go in, scrape LinkedIn for anybody it could find who worked at ICE, and grab their name, their title, the city and their profile photo. Now, in sort of a scary, depressing way, 2018 was actually a long, long time ago. Several, several years. Stuff has changed. Administrations have passed, people have joined ICE, people have left ICE. And I tried out ICE Spy again just for verification purposes. And frankly, it was bad. It didn't get any of the results I uploaded. And I was actually giving it lowballs just to start. I was actually downloading images from LinkedIn of ICE officials and then running them through, and it didn't recognize them. I would upload a photo of a black man. It would just present a photo of another black man, and it was not accurate at all. That being said, that's probably not the facial recognition algorithm and system itself. It's because the data is out of date and it's sort of doing the best it possibly can with the data. Right?
And I guess just the last thing I'll say on that is that the reason that people would probably want this for ICE specifically is because, as many of us have seen, ICE continues to carry out deportation efforts while not just covering their badges like police may have done in earlier protests, but with full-on sunglasses, a baseball cap, a covering on their neck, masks as well, refusing to say what agency they're from, refusing to provide their name. So facial recognition could actually provide a tool that people could use to identify who these armed men are who are saying, you need to get into my truck now and we're leaving, and I'm from ICE. And of course, that refusal to identify themselves, that refusal to provide an avenue for accountability, has created chaos, where a group of four or five alleged ICE agents or whatever will go to a car wash or a Home Depot or whatever it is, and the civilians there really cannot tell whether this is actually a group of ICE agents or not. To the point now where we have criminals actually impersonating ICE officials, including in Brooklyn in an attempted rape. Elsewhere we have robberies as well. So they've created this environment where no one can be sure what is going on, which is of course beneficial to them in some respects. But now there's this narrative on Bluesky and other places that, oh, none of these people are ICE officials. They are just bounty hunters or they're January 6ers or whatever. And don't get me wrong, I think eventually it will probably turn out that some of them were opportunistic, random members of the public. But I think a lot of them are still ICE. And a tool like this could help people figure out, okay, well, this ICE agent beat up my family member, punching him in the face before throwing him into a van, and now I can identify him and at least file a lawsuit or contact the agency, find out where he's been.
But without this, there's very little recourse for identifying an agent or identifying the agency as well. So I guess we'll see if they improve that tool.
Jason Kebler
I was just going to say, I think that tools like this are super interesting and definitely we should be writing about them. People should know that they exist. I think the use cases of them are tricky, which is just to say, again, to, like, generalize: an argument I've seen on Bluesky is, like, people are saying if you get grabbed by a masked man, like, demand their badge, demand that they show they are, you know, a federal officer, et cetera, et cetera. And it's like, that sounds great in theory. I think in practice, when this is happening very quickly, it's, like, a very scary situation. I've seen actual lawyers saying, like, yes, fighting back against it sounds great in theory. But if you assume that someone who is kidnapping you is not a federal agent and you then assault a federal agent, like, fighting that in court is potentially a life-ruining, very difficult situation. And that all just sort of underlines the fact that when these people are wearing masks, it is like, how are you supposed to respond as someone who is being, you know, grabbed by someone with a mask? I guess especially in the context of what happened in Minnesota, where the, you know, alleged guy who assassinated, you know, a state lawmaker there was dressed up as a police officer, or dressed up as law enforcement. And so you have, like, alleged assassins dressing up as law enforcement. You have actual law enforcement masking themselves, not identifying themselves. And it's just a very, very scary and difficult time at this moment. And I think that tools like this are really useful when there's already footage and you can sort of, like, figure this stuff out after the fact. I think during the fact, if you're like, hey, wait a second, let me take a picture of you and look you up while you're trying to arrest me, it's just a very, very volatile situation.
And then I guess the last thing is like, I do kind of wonder, like, I do think that this tool is very useful. I wonder if ICE and the LAPD use the existence of tools like this to say, look, we need to mask like leftists are doxing us type vibes and surely they will. I think that is an argument that will be made. That doesn't mean that tools like this aren't useful. But there's this narrative that the administration is putting out right now that like ICE agents are being harassed and attacked and that's why they're masking. And I think that that is like a really. It's just a tricky situation.
Joseph
Yeah. ICE has said that assaults on its officers have increased something like 400%. There's an opinion piece in the Washington Post that broke it down. Those numbers are a little bit iffy in that, well, we don't know the actual number, only a percentage. Like, you know, did they go from 5 to 10, 20 or something like that?
Jason Kebler
And also, ICE is putting themselves in a situation where people are more likely to fight back when they roll up with masks on, people working at Home Depot or hot dog stand vendors or things like that.
Emmanuel Mayberg
Or if you increase enforcement by 400%, would it not be logical that incidents would increase by 400% or something?
Jason Kebler
I'll just say also, being in Los Angeles and like going around the city right now, there are places where there used to be vibrant communities of people that are empty right now. And it is really dystopian and really scary. It's like, I went to Home Depot the other day and there usually are people there who are like, I will help you install this drywall that you're purchasing. Or like, I will help you lay down this sod that you're buying or whatever. Like, that is a common thing that I've experienced every single time I've been to Home Depot. I went to Home Depot this weekend. No one there. Like, no one I saw. I've seen some posts on Reddit, like after concerts. There's like no hot dog stand vendors there. There's like a lot of farmers markets that are totally empty. There's like secondhand, like thrift markets that are, that are empty. And it's like, regardless of whether these people are undocumented or not, you have like anyone who has brown skin who is like scared to go out in public at this point. It's like a lot of my friends are scared to go out in public at this point. And it's, it's really, really messed up. And it's, it's like, again, Los Angeles, like, doesn't want this to be happening. The community doesn't want this to be happening.
Joseph
Yeah, I think that's a good place to leave it. And we'll definitely keep an eye on this tool, and if any others are made or improved or whatever, I'm sure we will do a follow-up. After the break, Sam is going to ask Jason all about his latest piece, which is about a frankly pretty complicated court ruling that we spent a long time trying to get a headline for. We'll be right back after this.
Joseph
Hey, it's Joseph again. If you're a new listener to the 404 Media podcast or even a long-time one, you might not be aware of all of the impact our journalism has had recently or how we even got here in the first place. In 2023, the four of us quit corporate media to go independent. We were sick of working for a VC-backed company that put profits before journalism. That gave birth to 404 Media. Since then, we've stopped the spread of AI books in public libraries, triggered class action lawsuits against AI companies, got Congress to pressure big tech in various ways, and we've even shut down surveillance companies. This real-world impact is only possible because of our paying subscribers. As a journalist-owned business, they are the engine that powers our journalism and where the vast, vast majority of our revenue comes from. So please consider signing up today for $10 a month or $100 a year at 404media.co/membership and get bonus content every week and access to all of our articles. Thank you and enjoy the rest of the podcast.
Sam Cole
And we're back. We're going to talk about a story that Jason published today. The headline is Judge Rules Training AI on Authors' Books Is Legal, but Pirating Them Is Not. This story went through so many rounds of headline workshopping because it is kind of like a weird, not really complicated, but, like, it's a little bit hard to get your mind around what exactly the judge is talking about here. I think one of the headlines was like, judge being annoying: training cool, piracy bad, judge says. It just is hard to encapsulate a story like this into one sentence. So yeah, Jason, do you want to just kind of give us, like, a really quick top line of what the ruling said? It's based on the Books3 case, right?
Jason Kebler
Yeah. So three authors sued Anthropic, which makes Claude, the AI tool, basically saying that Anthropic trained on their books without consent and without compensation. And so it's a copyright lawsuit about whether that is legal or not. I've seen a tracker, I think there's, like, over a hundred cases like this where it's, you know, authors, artists, you know, journalists, newspapers, the New York Times, et cetera, suing an AI company about this question of whether you can train on someone's work without paying them, without permission, et cetera, et cetera. And this is the first, like, major decision among these lawsuits. So it's pretty important for that reason, because it's one of the first ones. And basically the judge decided that it was fair use for Anthropic to train on these authors' books, but that it was not legal for them to pirate the books to do so. And so the actual pirating of the books: not fair use. Training of the AI: transformative under fair use, which is, like, if you're transforming the work in some way, then it's protected by fair use, broadly speaking. And basically the judge found that training the AI was transformative, but downloading these books was not. And then there's caveats upon caveats from there. But that's, like, the top line of what was decided.
Sam Cole
Yeah, I feel like if you went back in time five years and explained if you said exactly that to me, I would be like, yeah, obviously, like that's, that's not a difficult concept to understand. Like piracy has been illegal for a very long time. So acquiring things via piracy to further your company's business is illegal. But because this decision is coming after all of these companies have already sucked up all this copyrighted work via piracy in a lot of cases without permission, without consent from the authors in this case, and then gone ahead and used it to train their AI and their LLMs, it's like we're kind of working, we're putting this together backwards and all of this copyright law is being like shook up and re litigated and maybe even litigated for the first time in a lot of cases. So it's like, that's why I said It's a complex situation, but it's not. It's like, yeah, these are very basic principles that we're now, like, arguing about again, is theft is bad.
Jason Kebler
Yeah, yeah, I agree. And then also, I think the initial reaction from people was like, oh, this is very bad for authors. This is very bad for artists. And that was my initial reaction. And then, like, as I was writing the story, as we were talking about it, and then especially after I published it, I was like, wait, maybe this is actually, like, quite a good decision perhaps.
Sam Cole
Yeah. Can you walk us through that? Like, why is this. What is this? What does this mean? I guess if. Yeah, what could it mean? If anything?
Jason Kebler
Yeah. So again, there's, like, hundreds of cases. And so the judge in this case had to look at the facts of this case. And in this case, Anthropic used Books3, which is this massive, like, torrent of books, LibGen and one other piracy site called Pirate Library Mirror to download hundreds of thousands of books. And then crucially, Anthropic made a, like, library slash database of all of these. So not only did it, like, download all this information for training purposes, it kept a copy of all of them. And keeping the copy of them is sort of like what the judge said was bad. But then also, alongside this, Anthropic started buying physical books, like, physical used books from, like, mass resellers. And then it started scanning those books in the same way that, like, Google Books was scanning books, in the same way that, like, the Internet Archive has done before as well. And basically the judge said for the books that it bought legally, like, all good, no problem here. For the ones that it pirated, not good. Going to go to trial, gonna, like, look into damages and stuff like that. So basically, like, there's these instances of piracy that Anthropic might now have to pay for. And sort of like, at first blush, it's like, oh, that's good for these three authors. Like, what does it mean for the rest of us? Like, everyone else that has had their stuff, you know, kind of taken. And it seems like on first blush, it's like, oh, okay, they can train their AI on this stuff as long as they acquire it legally. But then you think about it, and it's like, very little of what AI companies have acquired has been acquired legally. We've reported on this various times. Sam, you had an amazing story about Nvidia, and you did one about Runway as well. And it's like, they're scraping a lot of piracy sites. These companies are scraping piracy sites.
They're scraping, like, Netflix, they're scraping, like, stuff that is paywalled that they are not allowed to scrape. And crucially, the judge was like, that's illegal. That's very bad in this case. And so when you think about it in terms of, like, practicalities, it's like a lot of these AI companies have committed, like, millions of instances of piracy, and the potential punishment and liability for that is, like, billions and billions of dollars. And so that's why I think it's maybe not as bad of a decision as I first thought, because, like, these AI companies very well could be on the hook for millions of instances of piracy. And then the other thing that is really notable about this is the authors didn't allege that Claude reproduced their work in any way, meaning they didn't talk to Claude and get Claude to spit out portions of their book verbatim. And a lot of the other lawsuits hinge on outputs like that. Like, New York Times vs. OpenAI hinges on ChatGPT verbatim reproducing parts of New York Times articles, for example. And that didn't happen in this case. And the people suing didn't even allege that it happened. They were just like, our books were in Books3. We know you trained on Books3, therefore, you know, we're suing you. And the judge specifically sort of said, well, they didn't allege that the AI was reproducing their books, therefore I'm not even going to consider that. And I think that that's a pretty crucial fact to keep in mind. In these other cases the AI tools are regurgitating pretty much verbatim or very similar copyrighted characters, for example in the Disney versus Midjourney lawsuit, things like that. And so I don't think that it's going to be, like, a blanket yes, you can train AI on anything you want. I think that the outputs are actually going to matter for some of these other cases.
Sam Cole
Yeah, you kind of have to read this and get into, like, copyright and IP lawyer and IP judge brain, because he's looking very specifically at what the law is. Right. So like you said, what they're alleging is not that there are these, like, perfect reproductions. It's that their copyright was violated because it was ingested illegally. It was acquired illegally. It was acquired through piracy. Yeah. And the Disney versus Midjourney one is like, oh, you reproduced Mickey Mouse perfectly. This is a threat to our copyright. So obviously you need to pay for that, or pay in court or some other way. And we did a story yesterday, I guess it'll be two days ago until Monday, whenever this comes out for you, about how one of Meta's LLMs was, researchers apparently found, memorizing huge pieces of books and would actually output big chunks of the first Harry Potter book or 1984, which is a very clear sign that those pieces are in the LLM as that contiguous chunk of the book. But it's a different case than what's happening here with Claude and Anthropic, which I think is really interesting. So yeah, you also made a good point on Bluesky. You were writing, and you had mentioned this also in Slack, that this is interesting for, like, paywalled sites. Do you want to get into that a little bit? Like, if Anthropic got a subscription to 404 Media, for example?
Jason Kebler
Yeah, I mean, I guess it's like a lot of the deals that, like, media outlets are signing with OpenAI have to do with, like, OpenAI giving a company millions of dollars a year to get permission to train on their content in an ongoing fashion. And sort of like the way that I read this, and, you know, I am not a copyright lawyer, I'm not a lawyer of any sort, but the logic and reasoning sort of suggests to me that if you buy a book, you can then train on that book. I don't know why you wouldn't be able to buy a subscription to a website and be able to train on everything. And so instead of paying a news outlet millions of dollars a year, for, like, a really big news outlet, a really big deal, you know, can you get one subscription and train on that? And is that going to sort of, like, open this up even further? I think the other thing I want to at least give voice to is that there's a lot of copyright experts out there who say that copyright is not the correct way to sort of go about policing this. I think that it's the most obvious one at the moment. But, like, a very restrictive fair use decision on, like, what is fair use and what is not would have probably, like, pretty scary knock-on effects for people like us who do lots of things under fair use, like sometimes use images as fair use, sometimes use, like, snippets of video as fair use, things like that. I don't want to get too into it because I think I'd get out of my depth, but we're going to see tons and tons of copyright decisions. But copyright law wasn't really written anticipating this sort of thing happening.
Sam Cole
Yeah, for sure. It's a new frontier to be cliche about it. So what happens now in this big ongoing case? What are we going to see occur next or what should we watch for?
Jason Kebler
So basically this was a summary judgment, which means the judge threw out the parts of the case that he decided were fair use, and for the parts where he thinks that it was not fair use, there will be a trial. And so the trial will, I guess, give Anthropic the opportunity to argue, actually, this creation of this massive library using piracy was fair use for some reason. Although the judge in this case seems really reticent to accept any sort of argument there. He's like, it pretty plainly fails fair use. It's just, like, straight-up piracy. So that'll happen, and then they'll decide damages, and then of course there's opportunity to appeal. I think more importantly, there are just, like, a hundred other cases that have different facts, that are going to go through different courts, and eventually some of these are going to get appealed up, probably to the Supreme Court at some point, just because there's so much money at stake and there's so many cases going on right now. So I don't think that this is precedent-setting in any way at the moment, but this is the first really big decision, so we can see what types of things judges are considering here.
Emmanuel Mayberg
I have a question, and I don't necessarily think any of us can have the answer for it, but I don't know, it's something I want to talk about. I think especially back when we first started to report about what is included in the training data for these very popular models for these giant AI companies, specifically when Sam published those stories about Runway and Nvidia, my feeling then was like, oh no, the other shoe has to drop for the entire generative AI industry. Because their story changed from, oh, this is trained on the open Internet. And then people started to ask, is there copyrighted material in there? And literally they shrugged. Like, famously, the, I think, CEO of OpenAI at the time was like, maybe YouTube is in there, maybe it's not. I don't know. Maybe there's copyrighted material in there, maybe there isn't, I don't know. And now somehow we've shifted to, like, well, obviously there's copyrighted material in there, obviously there's pirated material in there, but now we're making some fair use argument. And I guess I'm at a pretty nihilistic point where it's like a too-big-to-fail situation. Like, it's hard for me to imagine any legal action undoing the training or forcing these AI companies to redo their training or to pay every single person whose work has been used to train these giant, extremely profitable AI models. Jason, I'm not saying it's right, I'm saying it's horrible. But I'm at a pretty nihilistic place with it where I just don't think the legal system is equipped to punish or correct such brazen and huge IP theft, right, like, on this scale. And I'm just not sure there's anything to do to correct it. What do you think? Do you think they actually end up paying billions of dollars or.
Jason Kebler
I mean, I am sort of with you, where it's just like every major tech company is doing this in such a brazen way. I want to read you an excerpt. The judge wrote this in this decision. He said, quote, from the start, Anthropic had many places from which it could have purchased books, but it preferred to steal them to avoid, quote, legal practice, business slog, as co-founder and chief executive officer, as Anthropic put it. So in January or February 2021, they downloaded Books3, an online library of 196,640 books that he knew had been assembled from unauthorized copies of copyrighted books, that is, pirated. The judge wrote, Anthropic's next pirated acquisition involved downloading distributed, reshared copies of other pirate libraries. In June 2021, Mann downloaded in this way at least 5 million copies of books from Library Genesis, which he knew had been pirated. And in July 2022, Anthropic likewise downloaded at least 2 million copies of books from the Pirate Library Mirror, which Anthropic knew had been pirated. So the judge in this case was like, you, who downloaded 7.5 million books illegally: mixed decision. Like, he's kind of just like, yeah, they stole a shitload of things. And I know that we're, like, many years out from teenagers downloading 17 songs on Napster and having to pay Metallica hundreds of thousands of dollars, but it's like, Anthropic is one of the most valuable AI companies in the world. Here's a judge saying, yes, they stole millions of books. And it seems like maybe they'll get a slap on the wrist here. I don't know. And I think that Congress is certainly not going to do anything. This current administration is certainly not going to do anything. I think that individual courts might have some, like, civil penalties. I think some of the bigger lawsuits are going to get settled, like we've already seen. I forget which one, but I think it was the Suno case.
The AI music case, like, Universal is essentially settling with them and signing a licensing agreement with Suno. And so the, like, super big entities, your Universal Musics, your New York Timeses, your Disneys, etc., I get the sense that they are going to eventually sign some sort of deal with these companies. These smaller authors who are, like, banding together as classes to sue them are going to lose, or people are going to get $12 in some class action lawsuit settlement. And, like, the industry is going to keep going. Like, I don't see a world in which this becomes an existential issue for these AI giants because there's too much riding on it. Like, the entire tech industry, which is a huge part of the American economy, is riding on this succeeding in some way. And I guess when this all first started, I was like, oh, they're going to get sued and this is all going to go away in some way. And I guess I don't feel that way anymore. But it sounds like that's kind of where you are as well.
Emmanuel Mayberg
I think it's that and I think it's the scale. The thing that everyone always, you know, tweets at me or posts on Bluesky whenever we post one of these stories is like, oh, this is allowed, but Aaron Swartz, who was like an early, it's fair to call him founder of Reddit, he was about to go to jail for making a bunch of copyrighted papers, I think, available to the public, and took his own life. And it's such an obvious point to make, but I think it's true. It's like, if Comcast snitches on you for downloading a pirated copy of The Avengers, they're like, this guy has committed a crime. Aaron Swartz has, like, broken copyright law and he's going to jail. But if you steal from more people than you can count, more times than you can count, and kind of mix it all into, like, this slurry, where it's hard to tell who you stole from, and the responsibility of the people you stole from is to come up to you and say, like, hey, that's actually my content, it's just a crime on such a scale that you can't really do anything about it. And the system is not built to deal with it.
Jason Kebler
You can also go into, like, the Meta AI app right now and scroll through the Discover page, and there's like, Mickey Mouse, SpongeBob, Mickey Mouse, SpongeBob, Spider-Man. Like, it's so obvious. It's so brazen. And, you know, I guess they're just arguing transformative.
Emmanuel Mayberg
But, like, many entities have argued transformative better, and lost. And, you know, that's not why they're going to get away with it.
Jason Kebler
And I mean, it is interesting that, again, like, a year ago, these companies were saying, oh, we don't know if we trained on pirated material. Like, very unclear. Could be anything. Could be just open web. Could be these AI tools are magic, and, like, when you ask for Mario, Mario pops out because they're so smart that they're figuring out how to do Mario. And it turns out, like, no, they're just downloading Netflix, they're just downloading YouTube, they're just downloading every book that has ever existed. Like, that's one of the quotes in here, is that Anthropic has tried to train on every book that has ever existed. Like, that was their goal. And, yeah, their argument now is not we didn't do it, or you can't prove we did it. Their argument now is, we did it and it was legal, and there's nothing you can do about it. And we'll see how that plays out. But, you know, this decision is, it's mixed. I don't think it's as dire as a lot of people were saying when it first came out, but it's certainly not, like, cut and dry, you can't do this. It leans more on the, like, yeah, they can get away with this. And I think that except for in a few really egregious cases, that's probably what we're gonna see moving forward.
Joseph
Yeah, I was kind of surprised by how complicated it was. Like, I thought it was gonna fall on one side or the other, but it lands in this weird middle spot. But I don't know, I guess we'll see how it continues and what happens if it gets to trial. If you're listening to the free version of the podcast, I'll now play us out. But if you are a paying 404 Media subscriber, we're going to talk about all the AI slop during the Iran-Israel war. You can subscribe and gain access to that content at 404 Media co. As a reminder, 404 Media is journalist founded and supported by subscribers. If you do wish to subscribe to 404 Media and directly support our work, please go to 404 Media co. You'll get unlimited access to our articles and an ad-free version of this podcast. You also get to listen to the subscribers-only section where we talk about a bonus story each week. This podcast is made in partnership with Kaleidoscope. Another way to support us is by leaving a five-star rating and review for the podcast. That really helps us out. Here is one of those from Ann Betty. I hope I got that right. It's quite a long one. Team 404 Media's reporting is incredibly thoughtful, thorough and important. Some days it's AI slop, some days it's about the real-time, real-life Black Mirror, Skynet world that we all live in now. At the end of every episode they do a bonus story from the week. I became a subscriber after being unable to shake the strong sense of FOMO. Thank you so much. This has been 404 Media. We'll see you again next week.
The 404 Media Podcast: "This Site Unmasks Cops With Facial Recognition"
Release Date: June 25, 2025
In the opening segment, Joseph introduces the episode and announces a week-long break for the 404 Media team, the first since the company launched in August 2023. He says a podcast should still go out that week, probably an interview recorded by Jason, in place of the normal show.
Joseph and Emmanuel Mayberg delve into the story of FuckLAPD.com, a newly launched website designed to identify LAPD officers using facial recognition technology.
Joseph recounts his first encounter:
“I saw somebody on TikTok using it, points their iPhone into the face, brings up the guy, the cop's name, reads out his salary... and the cop starts laughing a little bit.” (02:10)
Emmanuel shares his discovery via a subreddit focused on monitoring aggressive ICE crackdowns and raids:
“They were just sharing it as a possibly useful tool.” (01:46)
Jason Kebler initially thought the tool was one he remembered from a year earlier, recalling similar tools like "Watch the Watchers" from 2023.
Emmanuel provides an in-depth explanation of the website:
“Fucklapd.com is a site where it has a very simple interface. All you can do is upload a single image of a LAPD officer's face and it will use facial recognition tech and a database of, from what I can tell, most of the LAPD workforce to automatically identify which police officer it is, pull up their name, badge number and then also yes, their salary.” (03:21)
The tool was launched by artist Kyle McDonald in response to heightened tensions during anti-ICE protests in LA, aiming to enhance transparency and accountability amidst reports of police misconduct.
Kyle's Motivation:
“We deserve to know who is shooting us in the face, even when they have their badge covered up.” (07:28)
Jason draws parallels between FuckLAPD.com and the earlier "Watch the Watchers" initiative:
“Watch the Watchers is a database of officers... billed as counter surveillance.” (08:12)
Emmanuel clarifies that while both tools serve similar purposes, their data sources differ. "Watch the Watchers" is part of the Stop LAPD Spying Coalition, relying on public records and requiring legal action to compile their database.
Emmanuel conducted a test using a screenshot from an LAPD press conference:
“Within a few seconds, it pulled up nine results... the first one was the correct one. So it worked perfectly in this case.” (10:08)
However, limitations are acknowledged:
The conversation shifts to the potential of similar tools for identifying ICE officers, especially given the prevalence of masked enforcement actions.
Joseph highlights the practical advantages:
“Rather than searching a badge number, you search a face... easier just to point a camera at somebody and then get their photo.” (09:37)
Emmanuel discusses the dire need for such tools amidst rising incidents of masked ICE officers involved in violent enforcement actions:
“ICE officials are covering their faces... a tool like this could help people figure out... at least file a lawsuit or contact the agency.” (12:17)
Jason raises concerns about the practicality of using such tools in volatile situations:
“If you're like, let me take a picture of you and look you up, like while you're trying to arrest me is like, it's just a very, very volatile situation.” (20:37)
Emmanuel echoes the skepticism regarding the legal system's capacity to handle large-scale copyright and identification issues:
“It's a crime on such a scale that you can't really do anything about it. And the system is not built to deal with it.” (21:19)
Transitioning to a complex legal issue, Jason Kebler discusses a landmark case involving Anthropic, an AI company, and three authors challenging the legality of training AI models on their copyrighted books.
Three authors sued Anthropic, alleging unauthorized use and lack of compensation for training their AI tool, Claude, with their books.
“The judge decided that it was fair use for Anthropic to train on these authors' books, but that it was not legal for them to pirate the books to do so.” (27:03)
The judge's ruling segmented the legality based on how the books were acquired:
Fair Use for Legally Obtained Books:
“Training the AI was transformative.” (28:40)
Illegal Acquisition Not Protected:
“Downloading 7.5 million books illegally... straight up piracy.” (42:14)
Sam Cole reflects on the paradox of basic legal principles resurfacing in the context of advanced AI technologies:
“Is theft bad... we're putting this together backwards.” (29:47)
Jason analyzes the broader impact:
“A lot of these AI companies have committed like millions of instances of piracy... potential punishment and liability for that is like billions and billions of dollars.” (30:08)
He expresses doubt about the legal system's ability to contain the issue, predicting minimal impact on major AI players:
“The industry is going to keep going. I don't see a world in which this becomes an existential issue for these AI giants because there's too much riding on it.” (34:49)
The discussion concludes with reflections on ongoing and future legal challenges:
Jason anticipates numerous similar cases escalating to higher courts, potentially reaching the Supreme Court, due to the vast financial and societal stakes involved.
Emmanuel underscores the inadequacy of current copyright laws to address the complexities introduced by AI, hinting at a likely absence of effective legal remedies.
“If you steal from more people than you can count more times than you can count... it's just a crime on such a scale that you can't really do anything about it.” (38:48)
Conclusion: The 404 Media team expresses a pessimistic outlook on the ability of the legal system to effectively regulate AI's use of copyrighted material, noting that copyright law was not written with this kind of use in mind.
This episode of The 404 Media Podcast provides a critical examination of emerging technologies' role in societal accountability and the evolving legal landscape surrounding AI and intellectual property. Through engaging discussions and expert insights, Joseph, Sam, Emmanuel, and Jason shed light on tools like FuckLAPD.com and landmark legal cases, offering listeners a comprehensive understanding of these pressing issues.
Subscribe to 404 Media for exclusive content and support independent journalism at 404media.co.