Post Reports – The Quest to ‘Destructively Scan’ All the World’s Books
Date: January 29, 2026
Host: Martine Powers
Guest: Will Oremus, Technology Reporter at The Washington Post
Episode Overview
This episode investigates Anthropic’s covert initiative known as Project Panama, an effort to physically buy, scan, and digitize millions of books to train its AI chatbot, Claude. Drawing on new details from unsealed legal filings, host Martine Powers and tech reporter Will Oremus delve into the ethical, legal, and cultural implications of “destructively scanning” printed books. The conversation also situates this project in the broader context of the AI race and ongoing lawsuits over copyright infringement, exploring what’s at stake for authors, tech companies, and the future relationship between AI and human creativity.
Key Discussion Points & Insights
1. Revelation of Project Panama
- Background: Anthropic, the AI startup behind Claude, initiated Project Panama in 2024. It was the company's attempt to “destructively scan all the books in the world” by buying physical books, slicing off their spines, scanning every page, and recycling what was left ([00:31–02:22]).
- Quote: “It would take them to a scanning center, it would slice off the spines, scan every page one by one, feed it into this digital library.” – Will Oremus [01:02]
- The project details surfaced in a major copyright lawsuit filed by book authors in 2023 (settled for over $1 billion), revealing Anthropic’s quest and its willingness to push legal and ethical boundaries.
2. How Project Panama Worked
- Anthropic hired Tom Turvey (formerly of Google Books) to spearhead the operation.
- Rather than licensing rights from publishers, the company bought used books in bulk from warehouses like Better World Books, making the process logistically and financially feasible.
- Quote: “You could buy hundreds of thousands of books at a time in bulk for the cheapest possible price.” – Will Oremus [06:36]
- The books were destructively processed: spines sliced, pages scanned, and materials recycled.
- Quote: “It was actually very neat. It was more like a paper shredder than a kid tearing up pieces of paper.” – Will Oremus [08:19]
- The scope was vast: millions of books bought, scanned, and digitized, along with attempts to acquire rarities and outreach to booksellers such as The Strand in NYC ([09:02]).
3. Ethical and Legal Tensions
- The paradox: Although it destroyed the books, bulk scanning was viewed as “more ethical” than simply pirating digital copies, a practice common among other AI giants (Meta, OpenAI, Google).
- Anthropic and others, prior to Project Panama, had downloaded shadow libraries (unauthorized digital repositories) via torrent sites — sometimes with internal employee discomfort ([11:48–14:08]).
- Quote: “‘Torrenting from a corporate laptop doesn’t feel right.’” – Meta engineer, cited by Will Oremus [17:07]
- Legal gray area:
- Buying used books and destroying them in order to digitize them for AI training skirts the edge of copyright law.
- The legal concept of “fair use” is pivotal—training AI on books transforms them (books aren’t simply reproduced or sold directly), which judges have (so far) sometimes considered fair use in lawsuits ([14:24–16:05]).
- Quote: “They were taking the books and transforming them into something else… Claude is a fundamentally different product from a book.” – Will Oremus [14:24]
- But this is unsettled territory: Different judges have issued nuanced and sometimes conflicting decisions, and future rulings will shape what AI companies can do, and whether and how authors are compensated.
4. Why Books? – The Value of Literature for AI
- Books represent a unique, high-quality resource for AI because of their editorial rigor and narrative cohesiveness, contrasting with the “cruddy,” unreliable content that dominates the open internet ([21:58–24:12]).
- Quote: “There’s stuff that’s written by bots for bots. There’s stuff that’s fake news and propaganda...” – Will Oremus [23:12]
- Anthropic, an underdog in the AI space, saw acquiring extensive, high-caliber book content as a competitive advantage over larger companies.
5. Broader Impact on Creators and Content Industries
- Authors, artists, publishers, photographers, and filmmakers are increasingly suing tech companies for alleged copyright infringement, uneasy with the notion that their life’s work is feeding AI products without permission or compensation ([04:43], [25:22]).
- Authors’ chief objection: If AI companies value their work so highly, why not pay them for it?
- Quote: “Their objection to this practice is that they’re not seeing any of that value.” – Will Oremus [25:01]
- News organizations are beginning to strike licensing deals — The Washington Post, for example, is mentioned as having an arrangement with OpenAI ([25:21]).
- Court decisions are shaping industry norms (e.g., judges asking for proof AI impacts book sales), but final outcomes remain uncertain due to rapid technological change and unresolved legal precedents ([26:58–29:21]).
Notable Quotes & Memorable Moments
- “Destructively scan all the books in the world sounds like something that a James Bond villain would be trying to do.” – Will Oremus [05:42]
- “What you’re describing here almost strikes me as like a reverse Noah’s Ark for books. Like you take one of each book… but instead of saving the book, it’s like the book is sort of sacrificed to the recycling gods in the pursuit of AI.” – Martine Powers [09:57]
- “In a way, we are saving this… we’re making sure that the content of these books… we are saving these books… for a world where AI does everything.” – Will Oremus (speculating on tech industry perspective) [10:21]
- “There’s stuff that’s written by bots for bots… the quality of any given Internet content is not very reliable.” – Will Oremus [23:12]
- “Authors should be proud of what they’ve produced… I think their objection… is that they’re not seeing any of that value.” – Will Oremus [25:01]
- “The idea of hoovering up all the world’s knowledge and putting it into AI models is just not something that the original copyright laws had explicitly anticipated.” – Will Oremus [29:14]
Important Timestamps
- [00:31] – Introduction to Project Panama and Anthropic's ambitions
- [01:33] – Physical destruction of books as a metaphor for creators’ copyright concerns
- [03:23] – How Will Oremus and The Post discovered the project via less-redacted legal filings
- [06:36] – The hiring of Tom Turvey and the pipeline from used-book warehouse to digitization
- [08:19] – Detailed account of book destruction and materials recycling
- [09:57] – The “reverse Noah’s Ark” metaphor for the project’s approach
- [11:48] – The contrast between destructive scanning and outright piracy via shadow libraries
- [14:24] – Legal distinctions between personal use and AI training, and their consequences
- [17:07] – Internal employee concerns about piracy at big tech companies (Meta)
- [21:58] – Why books, not just internet content, matter to AI
- [29:14] – The law’s struggle to keep up with unprecedented AI practices
Tone and Takeaways
The discussion is inquisitive, thought-provoking, and laced with both wit and unease. There’s a clear sense of awe at the scale and audacity of AI companies, skepticism about their ethical frameworks, and empathy for creators left out of the new digital value chain. Will Oremus presents complex legal and technical arguments in accessible, relatable language; Martine Powers injects metaphors and skepticism that make the stakes and strangeness of the story vivid to listeners.
This episode is essential listening for anyone interested in the intersection of technology, law, and the future of creative work — it’s a window into how AI is already reshaping the world’s literary and cultural legacy, one (destroyed) book at a time.
