The Mark Cuban Podcast
Episode: Anthropic vs. Copyright Holders
Date: September 15, 2025
Episode Overview
In this episode, Mark Cuban discusses the recent $1.5 billion settlement between AI company Anthropic and a collective of writers over copyright infringement. Cuban unpacks the landmark nature of the case, reactions from both the tech and writing communities, and the wider implications of the decision for the future of AI and copyrighted content. The conversation offers listeners a nuanced look at how AI companies train their models, connects the case to the broader legal landscape, and considers the practicality and fairness of various approaches to compensating writers.
Key Discussion Points & Insights
1. The Historic Settlement
- Description: Anthropic agreed to pay $1.5 billion to settle with writers whose works were used without permission.
- Scope: An estimated 500,000 writers are eligible, with each receiving about $3,000. (05:40)
- Largest in US Copyright Law: Cuban notes it’s "the largest payout in the history of US copyright law." (06:01)
2. Divided Reactions: Writers vs. Tech
- Discontent Among Writers: Cuban cites a TechCrunch headline: “Screw the money. Anthropic’s $1.5 billion copyright settlement sucks for writers.” He adds: “So, I'll put it out there. Not everyone is happy about this, but for AI companies, this is, you know, positive for the industry.” (01:42)
- “Another Win for Tech”: Cuban sums up the prevailing feeling among many writers as "yet another win for tech companies." (07:00)
3. How Anthropic and Other AI Companies Gather Data
- Initial Data Scraping: All leading AI companies “basically scraped the Internet at the very beginning” because “everyone wanted to get as much data as you possibly can to train the models.” (07:20)
- Book Sourcing: When online text ran out, AI firms turned to books; not all books are available online or through projects like Google Scholar.
- Pirated "Shadow Libraries":
“What Anthropic ended up doing was...they went to a bunch of pirated sources. They're called, quote, unquote, shadow libraries. So there's millions of books in there and they're pirated sources.” (08:40)
- Impact: Cuban credits the use of pirated literary works with making “Claude...way better...for writing...the tone’s way better” than competitors. (10:15)
4. Attempt to Rectify: Buying Books at Scale
- Once aware of legal risks, Anthropic started mass-purchasing physical books:
“They went and bought one of like every book in the world. Like something crazy, right? ... Then they basically had a robot that would take each of these books, would flip through the pages, scan the pages, and then transcribe the pages ... and then include that into the model data training.” (12:15)
- Legality: The judge ruled that training AI on books the company legally purchased is akin to a human learning from them and is allowed under fair use, but that using pirated materials is not.
5. Legal Precedent and the Fair Use Doctrine
- Ruling: Federal judge William Alsup sided with Anthropic on the training question, declaring it legal to train AI on copyrighted materials the company has purchased, because the use is “transformative” under fair use. (22:05)
- Quote from Judge William Alsup:
“Like any reader aspiring to be a writer, Anthropic’s LLMs train on open work...not to race ahead and replicate or supplant them, but to turn a hard corner and create something different. The piracy obviously was a completely different problem.” (22:18)
- Anthropic’s Perspective:
Aparna Sridhar, Deputy General Counsel at Anthropic:
“Today’s settlement, if approved, will resolve the plaintiffs’ remaining legacy claims. We remain committed to developing safe AI systems that help people and organizations extend their capabilities and advance scientific discovery and solve complex problems.” (24:16)
6. Practicality of Ongoing Writer Compensation
- Technical Barriers: Cuban highlights the complexity of perpetual royalties:
“It's impossible to know...like what data was used to create that image, like what was needed. So it's not like you could do it like Spotify, where if you listen to a song...now you give them a couple cent.” (31:21)
- Comparison to Adobe’s Model: Adobe Firefly pays photographers for images in its training set, but attributing model outputs to specific sources is too complex for text and other media because the training data is blended.
7. Future Implications and Unresolved Issues
- Not the Last Lawsuit: Dozens of companies, including Meta, Google, OpenAI, and Midjourney, face similar lawsuits.
- Precedent Value: Cuban predicts this case (Bartz v. Anthropic) will serve as a precedent.
- Limited Recourse for Writers: The wide dissemination of works through shadow libraries means “the cat’s out of the bag,” and permanent, meaningful removal or compensation is “kind of too late at this point.” (18:40)
- Final Note on AI’s Social Value:
“If we all agree that these AI models are more useful for us than harmful, let’s just move forward.” (35:20)
Notable Quotes and Memorable Moments
- TechCrunch Criticism: “Screw the money. Anthropic’s $1.5 billion copyright settlement sucks for writers.” (Reference, 01:42)
- On Model Training: “All the writers I know use Claude because, like, yeah, the tone’s way better. And that’s because they grabbed a pirate copy of every single book.” (10:15)
- On Precedent: “I think Anthropic is going to come out ahead for this. And I think a lot of people are happy with the precedent because now they...know the right way they can do this.” (21:40)
- On Fair Use Ruling: “Like any reader aspiring to be a writer, Anthropic’s LLMs train on open work...to turn a hard corner and create something different.” (Judge William Alsup, 22:18)
- On Dataset Perpetuity: “Once you’re included in a dataset, now all of a sudden you can use that model to spit out more outputs that other models can use to train on...it’s just really, it’s lost.” (33:45)
Timestamps of Key Segments
- Settlement details and scope (05:40–08:30)
- Shadow libraries and model improvement (08:31–11:12)
- Anthropic’s mass purchase and scanning of books (12:13–15:10)
- Precedent and implications for future lawsuits (21:40–28:10)
- Federal court ruling and quotes (22:05–24:30)
- Debate over ongoing royalties & technical challenges (31:00–35:00)
- Cuban’s closing thoughts on AI’s role in society (35:01–end)
Episode Character & Tone
- Original language and tone: Conversational, direct, candid, with real-world metaphors and analogies to simplify legal/technical complexity.
- Perspective: Cuban leans pro-technology and pragmatic, while acknowledging the moral and professional concerns of content creators.
This episode offers a well-rounded breakdown of a landmark AI copyright settlement, balancing the excitement around innovation with the real grievances of creative professionals. It’s essential listening for anyone interested in tech, law, or the future of creative work in the AI era.
