
A
Welcome to the 404 Media Podcast, where we bring you unparalleled access to hidden worlds, both online and IRL. 404 Media is a journalist-founded company and needs your support. To subscribe, go to 404media.co. Subscribers get bonus content every single week, access to additional episodes where we respond to their best comments, and early access to our interview series too. Gain access to that content at 404media.co. This week we're joined by Selena Deckelmann. Selena is the Chief Product and Technology Officer at the Wikimedia Foundation, the nonprofit organization that operates Wikipedia. That means Selena oversees the technical infrastructure and product strategy for one of the most visited sites in the world and one of the most comprehensive repositories of human knowledge ever assembled. Wikipedia is turning 25 this month, so I wanted to talk to Selena about how Wikipedia works and how it plans to continue to work in the age of generative AI. Everyone knows what Wikipedia is, but something that I've encountered in my writing is that it's actually harder to describe comprehensively than you might think. It's a website, it's a technology, it's a community. I was wondering: what is your best definition of what Wikipedia is, and what do you do there?
B
Yeah, it's a great question. I find that too, honestly. I'm the Chief Product and Technology Officer at the Wikimedia Foundation, and we support Wikipedia and a number of sister projects like Commons, Wikidata and Wiktionary. When I think about it, I think about the websites and the number of page views that we're serving. So we're serving like 15 billion page views a month, about 1.5 billion unique devices. And the work that I do is to support 250,000 editors who are creating and maintaining over 65 million articles across more than 300 languages. And in that work, when I think about what makes us unique: we still have our own data centers. There's stuff that we do in the cloud, but for the size of organization that we are, we're pretty small. It's about 650 people, and my team's maybe like half of that. There's even fewer developers. We're really small for a top 10 website, and we're using mostly open source software. And some of the staff that I work with, including the people that I work the closest with, they've been around since the very beginning. So we're here to talk about 25 years of this project, and there are people who, at the same time as they were racking servers for the first versions of the website, were running paper routes and writing their theses. And this is a global community of supporters, developers, editors, from the very beginning as well. So it's just pretty incredible what that started out as. And kind of the dream of creating a webpage with a button that just says edit, and letting the whole Internet click that button, and being like, okay, let's see what happens with that. If you told someone you were going to do that from the very, very beginning, I don't think they would have thought that would work very well, but it has. And when I think about what it is, it's that button, it's that edit button that anyone can press. 
It's all of these people who have worked together to produce the corpus of knowledge that exists now. And it's the principles and processes that those volunteers have created together that make such an incredibly reliable, verifiable, transparent system for knowledge. Yeah, that's how I would describe it.
A
I love that description of a website with an edit button. But I would add that it's a website with an edit button, but good, because a lot of the Internet is basically what you're saying, but bad. And I think something that I'll want to get into in our conversation is what has been done on your end to make sure that it's still a productive, net-positive force in the world where so many others have failed. But before we get to that, something that I always find pretty interesting and complicated is the line between Wikipedia, the community-led, editor-led project, and the Wikimedia Foundation. How do you explain where one begins and the other ends?
B
Well, from a content creation perspective, I see a very clear line. The volunteers self-organize and they create and edit the content. And it's just an incredible system because, like I said before, we put up this webpage, anybody gets to click the edit button, and from the sound of that, it doesn't really sound like it's going to work. But it works because of the people who contribute. They're really good people. Like you said, they've worked together to create a system that's designed to improve that content and add to it over time, not be 100% perfect from the very beginning. It's designed for iterative improvement, and there's a lot of grace that goes into that, and assumptions of good faith among the people who are choosing to contribute. When I think about this, I think about the many individuals, and that's how they think of themselves as well. It's not just one community of people. It's all of these individuals doing things, and they go by funny handles. Like, there's this editor named Hurricanehink. He wrote an article about a typhoon that's really important to my origin story, Typhoon Pamela, which hit Guam in 1976. And he works with a group, a WikiProject called WikiProject Tropical Cyclones, and they're part of this international group. When I think about one of the incredible aspects of Wikipedia, it's these different groups of people that form gigantic, basically, newsrooms that are like clearinghouses for all of the things that are being published on certain topics at any point in time. Like when it was apparent that Queen Elizabeth was probably going to pass away pretty soon, a group of these editors got together. I think the project was named London Bridge. 
And they got together and made lists of all the articles that were going to change, and they were all ready to come and start immediately making the changes as the news became available. It's just a surprisingly fast and accurate system for getting that information to the public in what the public feels is a very trustworthy way. So that's how I think about what the volunteers are and what they're responsible for and what they do. And from the Foundation side, we're accountable for the infrastructure to support all of that. So we help with fundraising, legal, partnerships. In my case, I'm providing the technical infrastructure that enables all that global collaboration. And it's a little bit different from other platforms in that we also have a lot of volunteers who contribute to the technical infrastructure. Our guess is that maybe 50,000 of these have contributed from the beginning. There may be more, maybe less, I don't have an exact number on that, but there are probably about 10,000 of them that are active worldwide today. And so unlike the content, I am ultimately responsible for the technical systems, including those data centers I talked about before.
A
My question is: where? Because I know you interact, right? There has to be a line of communication between the Foundation and the editors. I'm wondering, where's that line of communication for you? Where do you interact with them? And have you ever experienced tensions between these two separate but obviously cooperative groups?
B
Yeah, there are many, many lines of communication. There are some primary ones, but I want to share just how I felt when I started my job. I've been here a little over three years, and when I started, I just went on different pages about the Foundation and Wikipedia and searched for different communication channels, because I was like, oh, maybe I should subscribe to a few mailing lists, maybe I should get on some IRC channels, because I assumed that people were using IRC and things like that at the time. And what I discovered is that there were thousands and thousands of channels, in all sorts of communication mediums. Telegram is a very popular medium now, there are many Discords, there are lots of mailing lists. So many, many different communication channels for rapid, open and very frank communication between people. So that's one aspect of it. I definitely have official channels too. We created this thing called the Product and Technology Advisory Council, and that's a group of longtime, very trusted volunteers, trusted in terms of their reputation in the communities, who come together with me and talk about ways that we can improve the infrastructure together. And when I say infrastructure, it's both the technical things, but also the ways that we do things. So, like, how do we communicate about experiments that we're running? That's been an important topic recently. And so we come together and talk to them about what would be the best way for us to do that, and they make recommendations, because of the ways that people communicate on wiki. That's what we call it, "on wiki": when you type a message into a talk page or some other forum where you're actually trying to interact with a person instead of writing an article. 
So when we're on wiki, there are different ways to share things. There are different forums that are more comfortable or more recognized than others. And so we talk to the other human beings to find out where those places are and what the best way would be to introduce a new topic. Yeah, I think that pretty accurately describes the broader scope of it. And I think it's a human system, and it is not without conflict. I was thinking, before I came on, about the kinds of research that have been done on Wikipedia. There's so much. It's an incredible system because it's so open, and all of the edits that have been done by all of the different editors are available to look at. You can see the timestamps, you can see what the person edited, you can see whether it was reverted. So there's all this very, very rich information there. And one of the more fascinating pieces of research that I've come across was from 2017, and it was trying to study what happens when people of different political persuasions, so, like, in the US, if you've got a Democrat or a Republican, what happens if they come together and edit on a particular topic together? And what this research says is that over time, when folks do that, when they express those political points of view on Wikipedia and then they edit together, their editing becomes more neutral over time, likely as a result of being exposed to these different ideas. I think that's really incredible, because there are not very many social media platforms or Internet platforms where that is the case. The algorithms might push you more toward a particular point of view, but here you're interacting with other human beings, and as a result, it seems like people get more moderate. And I think that's such an amazing finding. This is a very simple tool. 
It's an article with a talk page, and then a set of policies that say that we're going to make pages about a topic not from a particular point of view, but as a summary of different points of view. And that tool, I think, has resulted in something quite special and unique.
A
Yeah, it's an amazing finding, and it's very incongruous with the time, obviously. And this is not something I want to get into, but I feel like that's an obvious reason that people on the extreme ends of the political spectrum are often upset with Wikipedia: I don't know if it's by design, but the result is something kind of in the middle, kind of neutral. And obviously, if you're on an extreme political end, that is not something you like. Sorry, I completely forgot to follow up. How is the cyclone group related to your origin story?
B
They're great. They're so amazing. They still do such amazing coverage. They have multiple different projects, but there was a hurricane that happened in Florida, I'm sorry, I don't have the name of it top of mind, but last year I did a little check-in on them and they were all working together. Hurricanehink actually had edited the page, but there were other people that had been mentored into taking the lead. They respond immediately to disasters. And I think what's so helpful about that kind of work is that it's a clearinghouse for accurate information, and they're helping sift through a lot of different reports that might be coming out that are not accurate or not reliable, and they find the best and summarize it there. And I think that human judgment in those moments is incredibly valuable, especially right now. And as different news sources become more or less reliable, it's helpful to have a group of people that are commenting on that. There are other ways to find out whether different news sources are the best in the moment, but I do think that Wikipedia editors have done a pretty good job of keeping up with that as times have changed.
A
So in 2023, you wrote a great post about AI and Wikipedia. I'm going to oversimplify here, so feel free to expand on any point if you want, or correct me, but my reading of it is that essentially your message was that Wikipedia is inherently a human-led project, which is something I think has already emerged from our conversation, but also that Wikipedia already uses machine learning in some ways and that it may be useful in the future. So this is 2023. Given the rapid development of AI, that feels like a long time ago. So I just wanted to check in and ask: has your perspective changed at all since then, and is that an accurate summation of how you feel about AI and Wikipedia?
B
I think it's an accurate summation. There are some things that have definitely changed in that time. I went back and looked at that post, as I do pretty regularly, to see if I should substantially revise it. I think we're probably about due, not necessarily for a substantial revision, but just a reflection on what I and my colleagues thought then and what we think now. The heart of what I said then, about sustainability, about equity, about transparency, about those qualities remaining crucial to information on the Internet and the spread of knowledge on the Internet, that part hasn't changed. And I still believe that we have a sustainability crisis on our hands. A thing that I hope for and advocate for involves the companies that are creating new AI systems, some of them just incredible, doing amazing things, advances in medicine, reducing toil for people who are trying to detect cancer. I mean, there are so many incredible applications of the technology that are obviously good for society. So I think that is going to continue to be true. And the systems for gathering knowledge, I think, are going to continue to improve, and they already are much better than they were when I wrote that post. But the sustainability of human contributions to those systems, that is what I think we have not solved. And that's the question everyone needs to grapple with. I think Wikipedia has done a really good job, first of all, of providing a mission and a purpose that people feel so good about contributing to. They want to come and share what they know, they want to collaborate with other people on doing that, and they're continuing to do it. So that's good. And then for these other systems, we have to think about what is the way that we're going to motivate people to contribute to the common good here. 
Because the web, when it was originally conceived, was built on links, which builds reciprocity into the system inherently. That was the web. With an agentic AI system, I'm not really sure where the reciprocity is. That's the big question: how are we going to think about that system in the same way that we conceived the web? And I think it's totally possible. Human beings conceived the web, conceived the Internet, created the rules around it, created all of the enterprises around it; the nonprofit systems and the profit systems all got created on top of that. And I think this is the question for the AI age: how do we also create systems of reciprocity that are for the common good at the same time as we have these commercial systems? And I think it's a little unbalanced right now.
A
Say more about that. What do you think it is about agentic AI that messes with the incentives, or removes this link system you mentioned? Is it because you're locked into one specific LLM? Why do you think it has that impact?
B
Well, if we just like, look at the link system, there's an assumption that a person is going to follow those links to learn more.
A
Right.
B
And in the agentic system, that is not the assumption. It's very convenient, and I know why it's compelling and useful. I use AI systems myself, both to vibe code things and to ask, you know, what is the best way I can arrange my office to make it more pleasing, how many more plants should I add? But what is not there is: I just gathered all of this information, but where did it come from? Over time, what I have seen with chatbots in particular, and I'm not seeing it as much with the agents, is the need to share the links, because people want to verify: is that information correct? I would guess the producers of those systems want to help people feel like they can trust the content that's produced. And if I compare it to what's evolved in Wikipedia over time, there are all these requirements. When you create an article, you are required to offer citations. And that is one of the ways that educators look to Wikipedia. They don't say to kids, go quote Wikipedia. They say, go to Wikipedia and look at all of those sources at the bottom of every article. Learn from those, quote from those, read the article to get a sense of this topic, but then go to those sources. And I think that's a fundamental challenge. The reciprocity of the web is also built into Wikipedia. But how are we going to do that? And I'm not here to mandate that it's links. I don't think that there's only one way to create systems like this, but I do think we need to put some real imaginative thought into it. Because what we want is for people to continue to contribute to incredible projects like Wikipedia. 
We want them to be motivated to do so, so that the value of human knowledge creation continues to be part of these systems. Right? Because there's another study, it's only a couple years old, from 2023, I guess that's three years now. It was done on the Pile, which is a source of training data. Some researchers looked at what happens when you remove different things from the Pile: how does the model respond differently, what does it produce that's different? And when you remove Wikipedia, the responses suddenly become much more toxic. And I think that's so interesting. It'd be fascinating to dig even further into that and ask why, exactly. I mean, I have my theories about why. I think it's a really well written, high quality, neutral source of information. I think that's why. But I think there's more to know about that, so much to know. I think we haven't devoted enough time or energy to this idea about reciprocity. Because in the end, these AI models are dependent on human-generated content to know new things. And that's either through training or through accessing new content via agents. And that's going to continue to be important no matter what.
C
Here's the thing about New Year's resolutions. Nearly 80% of them fail by February. And it's not because people lack willpower. It's because they lack data. You can't fix what you can't see. Real, lasting change starts when you actually understand what's happening inside your body. Things like leptin resistance, thyroid dysfunction, chronic inflammation or hormone imbalances. These hidden issues can derail even your best efforts to get healthier and feel better, before you realize what's going wrong. That's why I chose Function Health. This year I want to focus on figuring out more about how my body processes food, what types of exercises I should focus on, and honestly, I want to know why I feel tired all the time. Function is the only platform that will actually show me what's happening inside my body. I just scheduled my test on the Function Health platform so I can figure out what's going on, and I couldn't be more excited. Function Health is the only platform that gives you access to over 160 biomarkers covering hormones, metabolism, heart health, inflammation, stress markers and toxins, all tracked in one secure place over time. You can even add MRI and chest CT imaging, the type of holistic look at your health most people never have access to. There's a reason why doctors like Mark Hyman, Andrew Huberman and Jeremy London are behind Function. When you stop guessing and start measuring, everything changes. I'm looking forward to seeing how my current lifestyle, diet and exercise routine are affecting important biomarkers, and I'm hoping to be able to make resolutions based on my biology and actual data. Own your health for $365 a year. That's a dollar a day. Learn more and join using my link: visit functionhealth.com/404media or use gift code 404Media25 for a $25 credit toward your membership. A new year, colder days: this is the moment your winter wardrobe really has to deliver. 
If you're craving a winter reset, start with pieces truly made to last season after season. Quince brings together premium materials, thoughtful design and enduring quality so you stay warm, look sharp and feel your best all season long. Quince has everything you need: men's Mongolian cashmere sweaters, wool coats, leather and suede outerwear that actually hold up to daily wear and still look good. Their outerwear is especially impressive. Think down jackets, wool coats and Italian leather outerwear that keep you warm when it's actually cold. Each piece is made from premium materials by trusted factories that meet rigorous standards for craftsmanship and ethical production. By cutting out middlemen and traditional markups, Quince delivers the same quality as luxury brands at a fraction of the price. The result is classic styles you'll love that hold up year after year. I've been wearing this 100% cotton sweater basically every day. It's stylish, held up great in the wash, and always gets me compliments. Honestly, it's become one of my favorite pieces. Refresh your winter wardrobe with Quince. Go to quince.com/404media for free shipping on your order and 365-day returns. Now available in Canada too. That's quince.com/404media: free shipping and 365-day returns. This episode is sponsored by BetterHelp.
D
It's January. We're all back to work, the holiday season is over, and boom: anxiety is back. It's 2026. The news is what the news is, and friends are always asking me, how do you do this? How do you stare into the abyss every day? It helps to have a professional to talk to. That's where BetterHelp comes in. BetterHelp makes it easier to get an outside, unbiased perspective from a licensed therapist, someone who's there to help you identify what's weighing you down, what's scaring you, what's holding you back. Whether that's fear, doubt, pressure, the state of the world today. Their therapists are fully licensed, they work under a strict code of conduct, and BetterHelp does the matching for you so you can focus on what you actually want to work on, not on filling out paperwork or calling around. And if your first match isn't the right fit, you can switch therapists at any time. BetterHelp makes it easy to get matched online with a qualified therapist. Sign up and get 10% off at betterhelp.com/404media. That's B-E-T-T-E-R-H-E-L-P.com/404media.
A
Yeah, I should mention I don't have the article in front of me, but I did write a few months ago about the Foundation noting that the different chatbots appear to be sending less referral traffic to Wikipedia directly. And that is a concern because, as you mentioned, it is kind of reducing the pool of potential editors that we need in order to keep the fire burning, right, to keep the flame alive. And the post does mention that the Foundation is thinking about this and talking to the AI companies about how we might be able to improve that situation. I'm wondering generally, do you think OpenAI, Anthropic, Google, are they able to be good stewards or good partners in this project?
B
So I think that we are partners already. And as far as whether they're good partners, I think that every single person goes to work every day and tries to do a good job. That's what I fundamentally believe. And I think that the incentives in the current system are to take as much as possible as fast as possible. So that's a hard dynamic to be operating within, and it is, I think, the greatest challenge that I face every day in this environment. That's the reason why we created a project called Enterprise, which is a set of APIs that we're asking commercial users to use instead, so that they can support our infrastructure. As I mentioned at the beginning, we have our own data centers, and we don't have the capacity of AWS to just dynamically respond to hundreds of thousands of bots effectively attacking our infrastructure. The scraping behavior starts to just look like a denial of service attack in the end. And many have responded well, and I think that the project is going well in partnering with them on that front. And the incentives are aligned: they want fast, reliable access to the content, and we can provide a much better service level to them if they partner with us in that way. And we do find sometimes that when we try to reach out to companies, they don't respond to us. This goes back to something from the early Internet: people would create bots that scraped websites, and they didn't necessarily update the contact information, or they put bogus contact information in there, not even maliciously, but just because somebody took some code off the Internet, copied it, and let it rip. So there's a little bit of that, and people are moving really fast. So sometimes I think that's the issue. 
But when we reach out to these bot developers, yeah, sometimes they don't respond, and then we just have to block them. And that's been a challenge, I think. Another thing: I don't know if you saw the recent post from Brian Krebs about residential proxies?
A
I have not.
B
Oh, this is a good one. You're going to have a good time. You should go read this. It's talking about the rise of residential proxies. Sometimes they're being used for denial of service attacks, but also, increasingly, there are third-party commercial resellers using them to do things like scrape the Internet.
A
To make the traffic seem authentic.
B
Yes.
A
Yeah, okay, got it.
B
Yeah. So anyway, this is not really a thing that I would imagine marquee brand companies are doing, but enough companies of some kind are doing it that it's a problem for us. And it's not just a problem for us, it's a problem for many organizations that are trying to understand the difference between authentic human traffic and inauthentic traffic. So I would just say: the partnership that we need, first of all, is on Enterprise, for sure. And then, looking forward, the partnership that we want is around how we encourage people to contribute to the commons for the common good. Because I think these kinds of knowledge creation projects, and Wikipedia is only one, there's OpenStreetMap, there are the Creative Commons archives, there are many, many of these kinds of projects with small and large data sets, and they're all amazing. So how do we keep that alive on the Internet? Looking back to the very beginning, Wikipedia, I think, is delivering on the early promise of the original Internet and how people conceived of the web. It's a place where people are coming to collaborate, they're producing an amazing thing together, and it's been durable for a long time. And I think we are in a moment where that's at risk if we don't find ways to sustainably support people contributing to the commons.
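The bot-identification problem described here has a long-standing convention behind it: well-behaved crawlers announce themselves with a descriptive User-Agent that includes working contact details, and they honor robots.txt. A minimal sketch using only the Python standard library; the robots.txt rules, bot name, and contact address below are hypothetical, invented for illustration:

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt, as a site operator might publish it.
ROBOTS_TXT = """\
User-agent: *
Disallow: /w/
Allow: /wiki/
"""

# Well-behaved bots identify themselves and include contact info,
# so operators can reach the developer instead of blocking outright.
USER_AGENT = "ExampleResearchBot/0.1 (https://example.org/bot; bot-admin@example.org)"

def allowed(url: str) -> bool:
    """Check a URL against the (hypothetical) robots.txt rules above."""
    rp = RobotFileParser()
    rp.parse(ROBOTS_TXT.splitlines())
    return rp.can_fetch(USER_AGENT, url)

print(allowed("https://en.wikipedia.org/wiki/Typhoon_Pamela"))       # True
print(allowed("https://en.wikipedia.org/w/index.php?action=raw"))    # False
```

Bots that skip both conventions, or route traffic through residential proxies, are exactly the ones that end up indistinguishable from an attack and get blocked.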
A
Can you talk a bit more about how Wikipedia is currently using machine learning?
B
So machine learning has been part of the work, and not just at the Foundation; volunteers themselves have run machine learning algorithms to correct errors, to look at different kinds of data. For example, can we predict whether or not an edit is good or bad? That's such an interesting problem, and we have a model for that called the revert risk score. Is it likely, if a human being sees that edit, that they're going to revert it or let it remain? Those are approaches that have existed for a very long time, more than 10 years, and we've continued to improve those models ourselves and with volunteers over time. More recently, we've been looking at ways that, as an editor is doing their work, in the same way that you might experience a word processor today, where you'll get a little squiggly line if you misspell something or if the grammar doesn't seem quite right, we have different things specific to Wikipedia that we can prompt people with. Like: did you include a citation? That's a really good one. Another one, there's a funny phrase for this on English Wikipedia, it's called peacock language. It's language that's overly complimentary or positive. We don't like that because it's not neutral. So we'll have a little prompt that looks at the language and tries to give feedback to the user as they are typing: hey, this looks like a tone that's not encyclopedic, would you like some suggestions, or could you revise that sentence? We're doing things like that that really help the editing process and help a user that's editing in real time be more successful in their edits. And I think, in general, that's how we think about the ways that we can use machine learning now and into the future: how can we make an individual editor that much more effective in a moment, make them more powerful? Because we have quite a few editors. My hope and my dream, and the work that I'm doing right now, is to increase that number. But what if they decline? 
Then it's most important that we make each editor as productive as they possibly can be.
A
Yeah, that's great. It's important for me that you lay that out, because something that we struggle with as a publication that is very critical of AI, and I think our readership is very critical of AI, is that machine learning, a very broad term, has many applications that people might not realize they use and enjoy and rely on. So I just want to make that clear before we move further into some AI conversation. You mentioned WikiProjects, which are like these working groups within the editor community that come together for specific purposes. I wrote about one last year, or two years ago now, called WikiProject AI Cleanup. Basically, what they do is go around Wikipedia and try to detect, confirm, and then remove AI generated articles, which I think there are many reasons why you might want to remove. But the obvious one is that they have incorrect information, like completely fabricated, not factual information. That is a huge problem across the web. It's something we cover a lot, and it's obviously very important for Wikipedia's integrity. But again, this is an editor led effort by the community. I am wondering, is the foundation able to do anything about it? Does it want to do anything about it? Are there ways that you're thinking about supporting this WikiProject, or doing your own things separately?
B
I definitely don't think of it as doing our own things separately in general. That doesn't work in my line of work, and at the foundation we always have to be collaborating with volunteers on nearly everything that we do, to be honest. But I really love that coverage. I love that you talked to the volunteers directly. I think that they always have such clarity as professional writers. Basically they're volunteers, but their writing is incredible, and they spend a lot of time dealing with the front lines of this phenomenon, right? They know exactly what is happening, and they're dealing with it in real time and at scale. And they mentioned some of this, like that there are automatic detection tools you can try to use, with varying effectiveness. Use AI to fight AI; I don't know, some of it's good. I think some of the tools that they have produced, and when I say tools, I mean just documentation, they've created an incredible compendium of how to identify AI generated text. It's the most authoritative thing that I've ever seen. It's very long, it's part of the AI cleanup material, and I refer to it; we can look at that for hints about automation that could be created. Because of the large volunteer community, those volunteers have created a number of tools. Like there was one more than six months ago that was mentioned by one of the founders of Wikimedia Deutschland. He and I talked a bit about it, and we kind of did a mini study as a result of his presentation. He looked into ISBNs, like fake ISBNs that were showing up, and their prevalence. He started with German Wikipedia but then expanded his study to a couple of other Wikipedias.
A
I'm just going to jump in. ISBNs are like the ID numbers for books, publications, right?
B
Yes, yeah, yeah, sorry, my library bias coming into play there. But that was a super interesting one. Right. So I guess to go to your main question, what are we doing about it? I think from an editorial perspective, some of the different Wikipedias, so there are all these different language editions of Wikipedia, have different rules, and some of them have actually done work to generate a large number of articles using generative AI tooling. And then the question to the community is whether those articles are high quality enough to keep. For some of the languages they're like, yes, it's high quality enough to keep it, but on English Wikipedia, that has just not really been the case. And so for them, they have a different set of rules. And my role is to support these different communities and the choices that they're making, and then to help them with things like tooling that gives them an indication of how high quality articles are or not. So that's, I think, the main way that I can help. And if they ask me, can you just block some particular large group of users, or something like that, that's honestly a thing that they largely control as well, but I can help with it. But that's not what they've asked for. What they've asked for are things like, can you help us detect this better? Can you give us tooling that helps us find these particular things that we're noticing are problematic? And I think that's a pretty rational and grounded way to approach the problem. And I think that the human beings here have done a pretty good job. I would not say it's 100% perfect, but that is true of the entirety of this project. It was not designed to be perfect from the beginning. What it is designed to be is something that can become better over time with a lot of attention from a lot of humans. And that's my job, to keep making it an enjoyable place to do that.
So I would say what would tip me into even greater action is, first of all, requests from the community to do more. I'm always happy to do that, and I'm as responsive as I can be to that. But if increasingly people feel extremely demoralized by it, that is again something that I am obligated to act on as a steward of the mission and whatever my role is in it.
A
I did notice a couple of instances of friction or divergence between editors and the foundation regarding AI in the last 12 months. Things that I think in the overall scheme of things are minor but are a big deal to editors. One of them is that Wikipedia was going to do a limited test of AI summaries at the top of articles. Editors reacted very negatively to that. And then an even more minor incident: Jimmy Wales, who is a co-founder of Wikipedia, wrote about, made a point that I think is not that different from your point, which is, hey, some machine learning tools might be useful in the future and we should be aware of that and maybe thinking about that. People really reacted negatively to that too. I'm just wondering, do you think that there's actually a difference of opinion between the foundation and the editor community, or is it a difference in attitude about AI and its inclusion or exclusion from the process?
B
I think that when we talk about Wikipedia, and this comes back to the very beginning, what you were saying about understanding what this thing is, that's such an important place to start. The individuals that I talk to who work on Wikimedia projects, they do not see themselves as one thing. They do not see themselves as one community; it's a collective in a sense, but there are many communities, many individuals. And in my experience, having worked in this area for the past three and a half or so years, and being part of open source communities for decades, most of these folks really want to be treated as individuals. There are times when they express opinions as groups, through RFCs, for example, or there have been open letters that groups of people have produced that I think represent some version of a consensus in a moment. But I wouldn't say that there's just one perspective, and I wouldn't say that the foundation itself, which is made up of a lot of individuals too, has just one opinion about how to use AI tooling or exactly what to do. I mean, just a minor difference of opinion is whether we should, as a matter of policy, do vibe-coded prototypes. Should we do that or not? Not everybody feels the same way. And is that stopping people from creating vibe-coded prototypes? No, it's tolerated. But there are some people that are like, I don't know, this might be kind of leading us down a bad path, and people have that conversation. I think that's true in a lot of organizations, not just ours. So, to speak specifically to the summaries problem, it was just a mistake to produce it and introduce the idea the way that we did.
And I do think Jimmy's point is about the way that the world is changing, the way that the Internet has already changed, the way that people are changing. It's coming whether we want it to or not. And our ability to understand and use tools, that's a basic human condition thing. We have to be able to understand and use tools. So demonizing a particular tool, I don't know, I'll just say I don't think that's super helpful. I don't think that, for example, I should tell my staff you can't use AI tools. And what does that even mean? Once you get to the bottom of it, it's really a tough line to know how to draw.
A
To complete an email.
B
I know. Well, yeah, and that's a trivial example, but there are more concrete ones. Should I tell all the engineers that they can't ever vibe code anything? I don't think I should, and so far my answer to that has been no. I think that people should try to use these tools. They're complicated, they don't always work. And I think that experts find a lot of value in these things. It can really make your life easier when you know exactly what to ask in exactly the right way. And they're increasingly becoming better at handling more vibey asks and doing a pretty good job. I think that is all just true. So I don't believe that we should ban this particular technology. But I do hear the moral issue there. I understand it; it's not that I don't understand it. And for the moment, coming back to what Wikipedia itself is, being neutral, I think I have to remain neutral on this point, regardless of specific moral qualms that I have with particular tools or particular things that are happening with particular tools. That is just part of my job. I have to support this project. It's part of my job to support the mission, which is to get as much knowledge as possible to as many people for as long as I possibly can. And that, to me, means that I have to be practical. I'm not going to be dumb about it, but I do need to be practical. That was kind of a little bit rambly, but I think it's a complex situation, and I think everybody sees that it's a complex situation. So I just don't want to be glib about yes this, no that. I'm not always the best person to say, and I do listen to the editors, to what they believe and what they're comfortable with.
So, yeah, we've been really responsive to that. And that is part of why we were thinking about ways to make editors as productive as they can be, and to introduce them to important rules that are crucial to the voice and the tone and the quality of Wikipedia today. And that has been really uncontroversial, when we introduce the tools in the right way and when we test them really well before we deploy.
A
I think that's a great answer. I swear. I'm gonna let you go in a minute.
B
No, it's okay.
A
I think this will be very un-Wikipedia-like, but lightning round. Yes or no questions. Do you think Wikipedia will be around in 100 years?
B
Yes.
A
Do you think Wikipedia's article generation process will always have a human in the loop?
B
Yes, but.
A
Okay, I'll take it. I'll take it. And then, I guess, a lot of what we've talked about is essentially the governance model and how successful it has been. Something that has emerged out of our AI reporting over the last few years is that Wikipedia's importance has only increased in the generative AI age, and its governance model has only emerged as more sustainable, as you've said, given the rise of social media. And something that I often wonder is: how come more of the Internet doesn't have a similar governance model? Because it seems to be so strong and helpful and productive. Do you think that in the future, let's say Wikipedia is still around in 100 years, as you said, does more of the Internet look like Wikipedia?
B
I hope so. I really hope so. To your point at the beginning, that many people don't really understand how it works: I think it's so precious and magical how it has worked that there wasn't a lot of motivation to really understand it. But I think now is a moment to really understand it. I appreciate that Jimmy wrote a book about his process. I think more can be done; other people have written books about how Wikipedia works and how it functions. And I think more of that would help people understand it so that they can replicate it. It won't be exactly the same, because this is very much specific to encyclopedic content and what it takes to write encyclopedia articles, which is a really weird, funny way to get to an egalitarian governance system. It's very, very funny when you think about the whole thing. And that's another thing that I think is crucial to creating systems like this. You have to have a sense of humor and be okay with things not going the way that you planned, and meet people where they're at, and assume good faith. Those are, I think, the ingredients of what has emerged. But understanding it more, that is the way that we will produce more systems like this. And I absolutely think it's possible; other systems like this exist, just maybe not at the same scale. So I do think that there's hope for that. And I'm here to help people that want to do that, because it's really worthwhile, even if it's hard.
A
Okay, Selena, thank you so much for your time.
B
Thank you. This is great.
A
As a reminder, 404 Media is journalist founded and supported by subscribers. If you wish to subscribe to 404 Media and directly support our work, please go to 404media.co. You'll get unlimited access to our articles and an ad free version of this podcast. You'll also get to listen to the subscribers only section, where we talk about a bonus story each week. This podcast is made in partnership with Kaleidoscope. Another way to support us is by leaving a five star rating and review for the podcast. That stuff really helps. This has been 404 Media. We'll see you again next time.
Date: January 19, 2026
Host: 404 Media Team | Guest: Selena Deckelmann, Chief Product and Technology Officer of the Wikimedia Foundation
This episode celebrates Wikipedia's 25th birthday by exploring how one of the world's most important online encyclopedias plans to continue thriving amidst the rise of generative AI. The host from 404 Media is joined by Selena Deckelmann, Chief Product and Technology Officer at the Wikimedia Foundation, to discuss Wikipedia's unique model, community governance, resilience, and the challenges and opportunities AI brings to the platform.
End of summary. For more, visit 404media.co.