
A
The work that you guys do has had such a tremendous impact on the way the world works. I want to start with just giving people a brief understanding of what is Community Notes.
B
Someone on X can see a post if they think it's misleading. They can propose a note that they think other people might find informative. Other people can then rate that note.
C
We actually look for agreement from people who have disagreed in the past. And what we see is when people actually have that sort of surprising agreement, that's what makes the notes so neutral and accurate and well written, really.
A
Overall, there's many people that are very polarized. How do you deal with people that are like super anti-vax, super Jan 6?
B
One philosophical thing that's important is that we want all of humanity to participate. And sometimes people are surprised by that. If we have all of humanity, we then have the data to understand what notes will be helpful to actual humanity. Every post is eligible for notes. We shouldn't exempt Elon. We shouldn't exempt government figures. Like, everyone, even advertisers, can get notes.
C
There have been external studies, you know, run by people totally independent of us, who have found that if you take a post with or without a community note, that actually people's agreement with the core claims in the post does change if they see it with the note versus without.
A
Is there anything else along the lines of just working for Elon within an org Elon runs that may surprise people?
B
If I were to start a company in a company, it would be even leaner than I would have made it before. I've been amazed with just how much the team is able to accomplish with a small group, and I think because of a small group.
A
Today my guests are Keith Coleman, product lead for Community Notes, and Jay Baxter, founding ML engineer and researcher for Community Notes. This conversation may be my newest favorite podcast episode so far. Community Notes is one of the most impactful and clever and also underappreciated products in the world right now. If you ever use X/Twitter and you see a note underneath a tweet correcting the misinformation in that tweet, that is Community Notes. I've never heard a deep dive into the story behind the product and the team that built it, and I'm excited to bring you just that. We get into the surprising origin story of the product, how the algorithm actually works, how the algorithm emerged out of an internal contest within Twitter, the principles behind Community Notes, and why staying true to them has been so key to its success. Also, how it survived four different leaders, including Elon and Jack, and why it's now a big part of the solution to solving misinformation on the Internet, including recently being adopted by Meta as their main fact-checking tool. This is an incredibly special episode and I'm so excited to bring it to you. If you enjoy this podcast, don't forget to subscribe and follow it in your favorite podcasting app or YouTube. Also, if you become a subscriber of my newsletter, you now get a year free of Notion and Superhuman and Granola and Linear and Perplexity Pro. Check that out at lennysnewsletter.com. With that, I bring you Keith Coleman and Jay Baxter.
This episode is brought to you by WorkOS. If you're building a SaaS app, at some point your customers will start asking for enterprise features like SAML authentication and SCIM provisioning. That's where WorkOS comes in, making it fast and painless to add enterprise features to your app. Their APIs are easy to understand so that you can ship quickly and get back to building other features. Today, hundreds of companies are already powered by WorkOS, including ones you probably know like Vercel, Webflow and Loom. WorkOS also recently acquired Warrant, the fine-grained authorization service. Warrant's product is based on a groundbreaking authorization system called Zanzibar, which was originally designed for Google to power Google Docs and YouTube. This enables fast authorization checks at enormous scale while maintaining a flexible model that can be adapted to even the most complex use cases. If you're currently looking to build role-based access control or other enterprise features like single sign-on, SCIM, or user management, you should consider WorkOS. It's a drop-in replacement for Auth0 and supports up to 1 million monthly active users for free. Check it out at workos.com to learn more. That's workos.com.
This episode is brought to you by Productboard, the leading product management platform for the enterprise. For over 10 years, Productboard has helped customer-centric organizations like Zoom, Salesforce and Autodesk build the right products faster. And as an end-to-end platform, Productboard seamlessly supports all stages of the product development lifecycle, from gathering customer insights to planning a roadmap to aligning stakeholders to earning customer buy-in, all with a single source of truth. And now product leaders can get even more visibility into customer needs with Productboard Pulse, a new voice-of-customer solution. Built-in intelligence helps you analyze trends across all of your feedback and then dive deeper by asking AI your follow-up questions. See how Productboard can help your team deliver higher impact products that solve real customer needs and advance your business goals. For a special offer and free 15-day trial, visit productboard.com/lenny. That's productboard.com/lenny.
Keith and Jay, thank you so much for being here. Welcome to the podcast.
B
It's great to be here. Thanks.
C
Thanks for having us on.
A
It's so my pleasure. I'm so thrilled to be having this conversation. The work that you guys do has had such a tremendous impact on the way the world works. So many product teams are always talking about driving impact. I want to drive impact. Like you guys have actually built things that have changed the world in meaningful ways and continue to do that. And I've never really heard the backstory of how Community Notes came to be and how it works and all these things. So I'm really appreciative of you guys making time to chat.
B
Yeah. First, you know, thanks for saying that. That's why we built this thing is to help people. And it's great to hear it. It's great to see people enjoying it and finding it useful.
A
I want to start with just giving people a brief understanding of what is Community Notes. I think a lot of people may have kind of heard about it, kind of maybe see it on X as they scroll through, they see these notes, but they're like, I don't actually know what this is. So can you just kind of briefly describe what is Community Notes?
B
Community Notes is a way for the people, like the public, to add context to posts that might be misleading. The basic way it works is that someone on X can see a post. If they think it's misleading, they can propose a note that they think other people might find informative. Other people can then rate that note. And if the note is found helpful by people who normally disagree with each other, indicating that it's probably accurate, it's probably really neutrally worded, it's probably informative. Then it will show to everyone on X. And the goal is just to get people more information about what they're seeing so they can make better decisions in their lives.
A
Amazing. And I think like hearing this, it's like absurd that this works. I think when people originally heard this idea, like, no way this is going to work. And so just to dive a little bit deeper, can you give us a sense, a deeper understanding of how it actually works? Because I think it's the algorithm that you guys designed that is so clever that allowed this to work. So talk a little bit about that algorithm.
C
Yeah. So I think a key misunderstanding a lot of people have, if they haven't really dived into details is they kind of just think that maybe someone can write a note and it appears immediately or we're just taking a majority rules vote of who thinks the note's good. I think both of those approaches would probably lead to biased or inaccurate notes. I think the key thing really that we do is we actually look for agreement from people who have disagreed in the past. And what we see is when people actually have that sort of surprising agreement, that's what makes the notes so neutral and accurate and well written really overall is just that people who are very polarized overall often can't find agreement when things aren't accurate. Right. I think it also provides some good anti manipulation properties. I think people are often, if you said, I think back in 2020, before we started building anything here, whether this could work at all, I think a room of ML engineers would say, oh, you have to keep it closed source. You know, people are going to be manipulating this all the time. You have to use ground truth labels from fact checkers. There's no way that you could like bootstrap the system without external labels. But it turns out that you can do that with this kind of bridging based agreement algorithm is what we call it.
A
Okay, so just to summarize and make it super clear, it's basically people. Someone writes a note: this information is false. What's a good example, just as we talk about this, like a classic example?
B
A really, really classic example is an AI-generated image or an out-of-context image, like, "look what's happening here." But it's actually from like five years ago in a different country on a different topic.
A
Oh man, I've seen those so many times where it's like, look what's happening in San Francisco. And like, no, this is a whole different city. And that's not totally.
B
Yeah, yeah.
A
Okay, okay. So someone posts this AI image, someone writes a note, and this is actually five years ago in a different city. And this algorithm helps understand if this is a real, if this is true, this note is true and it's just people, regular people doing this.
C
Yep, yep. Regular people who have signed up to be community notes contributors. So you know, there are a few checks. Like you do have to have a verified phone number for instance. But yeah, at the end of the day these are regular people, not necessarily professional fact checkers or anything like that.
B
And you know, that was really important to us too. There was a question at the beginning, to the point Jay was making, of like, did anyone think this was going to work? Obviously it was kind of a crazy idea. We didn't know if regular people were going to be able to do this task. And certainly, you know, people had concerns about whether they would do it effectively. Initially, some people inside the company were suggesting like, hey, why don't you have journalists or, you know, some select group be the first participants? But very specifically, we were like, no. We're trying to move away from the idea of curated editorial decisions being made around this. This is supposed to be open to everyone. So we very intentionally try to allow all humans in there. People are randomly selected and that's important to it, you know, feeling fair, feeling open, feeling trustable.
A
Yeah. And again, it's just like this sounds like the holy grail of understanding what is true. And it actually works and works so well that Meta recently, as you all know, decided to adopt this exact system for them instead of having tens of thousands of fact checkers reviewing things.
C
One distinction that I would make, which maybe can come off as nitpicky, but I think is important, is that Community Notes adds additional context. It's not fact checking necessarily. Right. So there are cases where the post could be true, but maybe it's just misleading because of missing context. And you know, we cover those cases. And I think that's kind of an important distinction. We also just have the philosophy that users should be able to make up their own minds. Right. Like, here's extra context. Take it or leave it. Right?
A
Yeah. What I think about it, you shared this with me, this example of a, a picture with a, with a, with a cat. And somebody's community note was just, that's a dog. Or is it the other way around? Or that's a cat.
C
Yeah. Palestinian boy shares his bread with a dog was the post. And it's a picture of this cat.
A
Right.
C
So obviously this particular note is not super necessary because it just says that's a cat and links to Wikipedia for cat. It's kind of a good example of something a professional fact-checker or whoever would not think needs fact-checking, but it's proof that the system is really run by the users at the end of the day, and it adds some comic relief, I guess. And, you know, the note is correct.
A
And, you know, it's important: when does a post get triggered to even be considered for a community note? Is there like a threshold, or is it just that you can write a community note on anything and people decide what they want to vote on? How does that work?
B
So every post is eligible for notes. And that was again, another really important principle. It's like we shouldn't exempt Elon, we shouldn't exempt government figures. We should like everyone, even advertisers can get notes. So any posts on the platform can get a note. And if you look in practice, you'll see notes appearing on world leaders, on Elon, on ads, on media organizations, and on obviously like just regular people using social media. But yeah, the idea is really that it's an even playing field. For a note to be proposed, the person proposing has to have earned the ability to write notes. So there is that aspect where you have to earn in to be able to do this. And the way you earn that ability is through your ratings by demonstrating the ability to help identify notes that are found helpful to a broad range of people. So basically, if you have an ability to see and recognize what's helpful to a lot of people, then you have the ability to start proposing notes.
A
I actually signed up to be. What do you call these people? Note takers.
C
Contributors.
A
Okay, contributors. Yes. I've been rating. I haven't achieved "can write notes" yet.
B
Yeah, it's not super easy. It takes some effort.
A
Are there stats you can share about the scale of community notes at this point? Especially things that might surprise people?
B
Yeah, I mean, the service is growing rapidly. So there are hundreds of notes per day. And to put that into context, I saw some stats recently from someone at UC Berkeley saying there were something like 10 fact checks, traditional fact checks a day. So in contrast, there's hundreds of notes a day that are getting shown. They span a huge range of topics from obviously politics, news out to entertainment, sports, gaming, just whatever's going on that day. In addition to there being hundreds of these individual notes, they can also be matched to multiple posts. So if someone writes a note on an image or a video, like let's say it's AI generated or something like that, that note will automatically be matched to all posts that contain the same image. So you can have a single note matching to thousands of posts. And over, let's say the last year, 2024, we had something like 95,000 notes that were seen about 30 billion times. That's more than double the prior year. Prior year was something like 37k notes seen 14 billion times. So that rate is increasing dramatically. We think about like 30 billion views. That's a lot of information that is getting out there. That Might not have been out there otherwise, which is pretty cool. And part of the reason it is expanding like that is the contributor base is expanding. There's something like 950,000 contributors around the world. That's nearing a million people making this happen, which is amazing.
A
And I'm one of those, right? Like I count as a contributor.
B
Yeah. Okay, you're signed up as a contributor.
A
Okay.
C
Then there's more people on the waitlist too, so there's plenty of headroom for more growth. Regarding the matching on media and URLs, I think that's a huge way to get extra coverage. Also, I do think we've been very careful to make sure that those matches are precise because I think one thing that people love about community notes compared to other types of fact checking is that actually the notes are custom written for the particular claim you're seeing. Right. So so often a fact check warning would just say something like, you know, get the facts here. And then there's a link to some generic page about voting like information, which is, you know, so, so not helpful to have the, the information behind a click. So, so pulling the context up, you know, so that you have zero clicks that you need to make and keeping it specific is so important.
A
One feature I love that I imagine you guys thought deeply about is if I like the post in the past, I get notified later if a community note shows up so that I'm not like remembering this false information.
B
Yeah, I mean, we try to make notes as fast as we can, so we want them to appear instantly if possible. But inevitably there's going to be a time gap between when a post goes live and when people figure out what's going on and when they get the note out there. And so we send those notifications to try to close that gap. And yeah, we get a lot of love for that. We see people take screenshots and share them. They're excited about it. And it's also a pretty cool example of something you can do on the Internet in the social media world that was difficult in kind of like a print or standard news world, where you would see maybe a correction like the next day in a corner of a paper, but it was hard to read. Here you're getting a ping about it if you've engaged with the post and a note shows up.
A
One user feedback point is, I'd love the push to just tell me, here's what you got wrong, because I find that I actually have to go into it and read it, and I feel like the push could just be like, here's more context on this thing you liked, go take a look at that. Live user feedback.
B
Nice.
A
Okay, I want to get into the origin story of this whole thing, but two more questions because we're on this thread. One is, what's kind of the threshold for a note to show up on a post? Is that information you can share? Just how does that work?
C
So just because of the details of the way the algorithm works, it uses this machine learning algorithm called matrix factorization, where we fit it with gradient descent and whatnot. The threshold is 0.4 on this made up scale, 0.43. In practice, what it means is, you know, basically a majority of people, if there is a polarized divide relevant to the notes, you know, obviously some notes are not about politics or something polarizing, but if there is, then a majority, a sizable majority of people on both sides would generally need to find the note helpful. And then there are these, there are other rules that come into play beyond that main one. So, you know, even if it's above that threshold, it might get filtered out if there's a separate algorithm that's looking at agreement between people's incorrect tags. So like maybe, maybe people found the note helpful but incorrect. Right. Like it happens. And in those cases it doesn't matter if it's above the helpfulness threshold.
A
So is this 0.4? Is this probably the wrong way to think about it, but is it 40% of people that normally disagree agree.
C
Okay, it means nothing like that. It's just like on some arbitrary scale.
A
Okay, yeah, yeah.
B
If we changed random other things about the algorithm, that number would also have to change to an equally seemingly arbitrary number. We arrived at some numbers like that by gauging user feedback, so we could share a lot of notes with people and get feedback on which ones are helpful. And a line sort of just emerged indicating where things go from questionable to pretty clearly helpful.
C
Yeah. And it is set right now, by the way, to be really conservative. I think we just are pretty particular about quality and we really want note quality to be really high. I think Keith and I both believe that we live or die based on the quality of the notes at the end of the day. So we'd rather not show a note that may be good, but that we didn't have enough signal on, than the other way around.
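[Editor's sketch] The matrix factorization Jay describes can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the production algorithm: it assumes each rating is modeled as a global mean plus a user intercept, a note intercept, and a one-dimensional user-factor times note-factor term, and that a note is shown when its intercept clears the 0.4 threshold mentioned above. The function name, regularization values, learning rate, and toy ratings are all hypothetical.

```python
import numpy as np

def fit_note_intercepts(ratings, n_users, n_notes, epochs=5000, lr=0.02,
                        lam_intercept=0.15, lam_factor=0.03, seed=0):
    """Fit rating ~ mu + user_b[u] + note_b[n] + user_f[u] * note_f[n]
    with full-batch gradient descent. All hyperparameters are assumptions."""
    rng = np.random.default_rng(seed)
    u = np.array([r[0] for r in ratings])
    n = np.array([r[1] for r in ratings])
    y = np.array([r[2] for r in ratings], dtype=float)
    mu = 0.0
    user_b, note_b = np.zeros(n_users), np.zeros(n_notes)
    user_f = rng.normal(0.0, 0.1, n_users)  # small random start for the viewpoint factor
    note_f = rng.normal(0.0, 0.1, n_notes)
    for _ in range(epochs):
        err = y - (mu + user_b[u] + note_b[n] + user_f[u] * note_f[n])
        # Gradients are computed from the same residuals, then applied together.
        g_user_b = np.bincount(u, weights=err, minlength=n_users)
        g_note_b = np.bincount(n, weights=err, minlength=n_notes)
        g_user_f = np.bincount(u, weights=err * note_f[n], minlength=n_users)
        g_note_f = np.bincount(n, weights=err * user_f[u], minlength=n_notes)
        mu += lr * err.sum()
        user_b += lr * (g_user_b - lam_intercept * user_b)
        note_b += lr * (g_note_b - lam_intercept * note_b)
        user_f += lr * (g_user_f - lam_factor * user_f)
        note_f += lr * (g_note_f - lam_factor * note_f)
    return note_b, note_f

# Toy data: users 0-2 are on one side of a divide, users 3-5 on the other.
side_a = {0, 1, 2}
ratings = []
for user in range(6):
    in_a = user in side_a
    ratings.append((user, 0, 1.0))                   # note 0: helpful to both sides
    ratings.append((user, 1, 1.0 if in_a else 0.0))  # note 1: helpful to side A only
    ratings.append((user, 2, 0.0 if in_a else 1.0))  # note 2: helpful to side B only
    ratings.append((user, 3, 0.0))                   # note 3: unhelpful to everyone

note_b, note_f = fit_note_intercepts(ratings, n_users=6, n_notes=4)
shown = note_b >= 0.4  # threshold on the note intercept, an arbitrary scale
```

In this toy setup the factor term absorbs the one-sided agreement on notes 1 and 2, so only the note that both sides rate helpful ends up with a high intercept. The 0.4 cutoff only has meaning relative to the other modeling choices, which matches the point that changing the algorithm would change the number.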
A
That makes so much sense. Like I've never seen a community note that is wrong, and breaking that promise is a big deal. So I completely get why you guys are super conservative there. Okay, two more questions along these lines, because I'm just curious. These weren't on my list of questions to ask, but I feel like people wonder this. How many notes are written versus end up showing up?
B
We probably show about 8% of notes that get proposed. It's been between like 7% and 10% or 11%, something like that. Over time, the number can vary a little bit. And as Jay said, there are undoubtedly, and you can see it, there's clearly more good notes than we show. But the goal is to hold a really high bar. We want to show a note when it's going to be helpful, when it's not going to appear biased and undermine trust in the system. We want these to be neutral, informative, helpful. And as Jay was saying, we view the worst possible mistake as showing a bad note, because that's going to undermine trust. And the trust is why people like the product. So yeah, the bar is there and, you know, like I said, there's clearly some in that remaining, let's call it 90%, that are good. And then there's a lot that are just like not that great. And there's some that are bad, bad being defined as people who normally disagree finding the note not helpful. So it's like the inverse of the ones we show. If you write one that people who normally disagree find not helpful, you actually will ultimately lose your ability to write and have to earn it back. So that range, the other 90%, is a mix. Sometimes people look at the number and they're like, oh, why don't you show more? It's like, well, you probably actually don't really want us showing most of those. The gold here is that the system is able to filter out the good ones.
A
That makes sense. Okay, one other question: there's many people that are very polarized, like very disagreeable with a lot of things. How do they filter into this algorithm? How do you deal with people that are like super anti-vax, super Jan 6, like all these very extreme potential views?
C
If people really are so polarized that there isn't agreement among people who typically disagree, you know, it's possible that this is one of those notes that might be correct but just wouldn't be useful content. It wouldn't be, you know, helpful to show as context.
B
Maybe.
C
Maybe it's about a claim that people have, you know, really entrenched opinions about and they've read hundreds of things about it already. Probably this is just not going to improve people's understanding. It's just not going to be a helpful user experience. So it might not be the worst thing in those cases to not show the note. People a few years ago were pretty pessimistic that maybe fact checking never changes people's, you know, understandings about what's true. Actually, there have been external studies, you know, run by people totally independent of us, who have found that if you take a post with or without a community note, people's agreement with the core claims in the post does change if they see it with the note versus without. So we are having an impact on this thing that people previously thought was maybe not so easy to do. And so it's nice to focus on the cases where there is the bridging agreement. I would also say there is this reputation component to the algorithm as well. So if you consistently rate notes in a way that is counter to the bridging-based consensus, then we'll stop counting your ratings. Right. So if you're the kind of person who constantly rates bad notes as helpful, we do filter you out.
A
So.
C
So there's a difference between those types of people versus just the, the good but polarized ones.
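[Editor's sketch] The reputation component described above can be pictured as a filter over raters. This is a hypothetical illustration only: the function name, the agreement cutoff, and the minimum-history rule are assumptions, not details given in the conversation.

```python
def raters_to_count(rater_votes, consensus, min_agreement=0.5, min_scored=3):
    """Return the raters whose ratings keep counting.

    rater_votes: {rater: {note_id: rated_helpful (bool)}}
    consensus:   {note_id: final bridging-based status, True if helpful}
    min_agreement and min_scored are hypothetical cutoffs.
    """
    counted = set()
    for rater, votes in rater_votes.items():
        scored = [vote == consensus[note]
                  for note, vote in votes.items() if note in consensus]
        if len(scored) < min_scored:
            counted.add(rater)  # too little history to judge, so keep counting
        elif sum(scored) / len(scored) >= min_agreement:
            counted.add(rater)
    return counted

consensus = {"n1": True, "n2": False, "n3": False, "n4": True}
rater_votes = {
    # Disagrees with the consensus once, agrees three times.
    "polarized_but_careful": {"n1": True, "n2": True, "n3": False, "n4": True},
    # Rates the bad notes as helpful; agrees with the consensus only once.
    "rates_bad_notes_helpful": {"n1": False, "n2": True, "n3": True, "n4": True},
    # Only one scored rating so far.
    "newcomer": {"n1": False},
}
counted = raters_to_count(rater_votes, consensus)
```

The distinction drawn in the conversation shows up here as the difference between a rater who sometimes lands off the consensus and one who is consistently counter to it.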
B
Yeah. I think, you know, one philosophical thing that's important is that we want all of humanity to participate. And sometimes people are surprised by that. They'll be like, oh, aren't there people who are like, you know, shouldn't be doing this? Or like, there's, you know, I don't, their, their thinking is so extreme or something, maybe they shouldn't participate. But our view is, it's actually we want to have all of humanity here because if we have all of humanity, we can, we then have the data to understand what notes will be helpful to actual humanity. You know, we can, we can better model that better, better understand and better show those notes. So it's advantageous to have people who have all sorts of points of views. And we don't expect that every note will be loved by every single person. You know, that's kind of an impossible bar. But we do intend to show the notes that like 80% of people are going to, you know, read and say, wow, I'm glad I knew that. And so, you know, in that sense, it doesn't matter how, you know, maybe extreme someone views a person's views as it's still great to have them in the program. So, you know, no matter what your views are, please sign up and participate. It helps identify what's really helpful.
A
Cool. And we'll link to people if they want to actually sign up so they know how to do this. Something we didn't actually specify. These are all volunteers. No one's getting paid to be doing these notes and voting, right?
B
Yeah, it's totally based on intrinsic motivation, and we think that's a great reason to be doing it. When you talk to the most active contributors, a lot of them just want to have better information out in the world, and that's a great motivation. So yeah, that's why they do it. And, you know, if you think about the impact these people can have, it's kind of nuts. So when we first launched US-wide, this was like in 2022, a note appeared on a White House tweet, and the White House deleted the tweet and reissued an updated statement. And like, imagine being the person who wrote that. You probably have like 12 followers. Your posts probably get, you know, a couple likes, and here you just put a note on the White House and they change their public talking points based on what you did. Like, that is an incredible amount of impact. So, you know, you can see why people are motivated to do it when they care about what's going on in the world. You don't have to be a big, well-known person to shape the discourse and information flow in a way that's helpful.
A
It's insane. Like there's so much to love about this one is just the meritocracy of this whole operation of just anybody can, that is true and correct can participate and have impact. Also just shows you how much information we get that is just wrong. Like we had no idea how often we see things that are wrong and now we do.
B
Working on this product has made me realize just how many things I used to trust kind of by default that now I look at more skeptically.
A
Definitely a meme these days. Okay, before we get to the origin story, is there anything else along these lines you guys think might be really important to share? Really, really interesting.
B
Sure.
C
I guess one other thing is that although we don't actually use the fact that a post was noted in the core ranking algorithm, which, you know, we think is a nice property, there is a really big impact just organically, meaning not from the algorithm but just from user behavior, where people will like and reshare or, you know, quote posts way less when notes are applied. For people out there who typically run A/B tests on big platforms, you may already be familiar with this, but 1% is typically an awesome effect size for any sort of algorithm change. We saw more like 30 to 40% engagement rate drops for likes and reposts in an A/B test we ran comparing showing a post with or without a note, which is just crazy big. And that's just an A/B test on the engagement rate, so that's not the network effect. If you capture the overall network effect of how a post, you know, is spread less by that person's repost, basically if you look top line with a difference-in-differences approach, multiple different external research groups have consistently found that there's like a 50 or 60% drop in total reposts after a note is applied, which is just nuts. So it's having a really big impact on spread too.
A
That's so great to hear. It's what I would want to see. And it's incredible impact. Basically, an AI image of something false would just go crazy on Twitter, and did before Community Notes came out. And now what you're saying is that just adding that context does this. You're saying the algorithm doesn't demote it if there's something incorrect. It's just people are like, okay, this is false. Why would I want to retweet this? That makes sense.
B
Yeah, the notes just totally take the wind out of these stories. So like the thing will be going viral, a note appears, resharing drops 50 to 60%, and like that's it. At 50 to 60% per generation, the virality quickly goes to zero.
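[Editor's sketch] The "per generation" reasoning can be worked through with a simple geometric model. This is a rough sketch, assuming each reshare triggers a fixed average number of further reshares; the baseline rate of 1.2 and the seed of 1,000 reshares are made-up numbers, and the 55% drop is just the midpoint of the 50 to 60% range quoted above.

```python
def reshares_per_generation(seed_reshares, reshares_per_reshare, generations):
    """Expected reshares in each generation of a cascade (geometric model)."""
    counts = [float(seed_reshares)]
    for _ in range(generations):
        counts.append(counts[-1] * reshares_per_reshare)
    return counts

BASE_RATE = 1.2   # hypothetical: each reshare leads to 1.2 more, so the post is going viral
NOTE_DROP = 0.55  # midpoint of the quoted 50 to 60% drop in resharing

without_note = reshares_per_generation(1000, BASE_RATE, 6)
with_note = reshares_per_generation(1000, BASE_RATE * (1 - NOTE_DROP), 6)

for gen, (a, b) in enumerate(zip(without_note, with_note)):
    print(f"generation {gen}: without note {a:8.0f}   with note {b:8.0f}")
```

Under these assumptions the noted post's rate falls below one reshare per reshare, so each generation is smaller than the last and the cascade dies out within a few generations, while the unnoted post keeps growing.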
C
And by the way, I have very mixed feelings about this next one, but authors become 80% more likely to delete their post after they get noted. Which, okay, that's great, because there's less misinfo out there. But I'm pained about it, because those are usually the best notes. Like, if the note was just so good that you had no other option but to delete your post, those notes don't get seen by other people. Right. So that's hard. There's an argument, by the way, that just because you might see the same misleading claim elsewhere, off X or somewhere else on X, it might be better to have seen the post with the note than not see it at all.
B
Yeah.
C
Unsure about that coin.
A
That is so interesting.
B
Yeah.
A
Yeah, I'd be so sad if I was that community note writer. Oh, man, it's so good they just can't even keep the post up. Okay, so coming back from today's world, where this, like, small amount of code is changing the way people understand the world and what they believe, and making the White House rescind their announcements, let's zoom back to the beginning of how this whole project started. What I heard, just briefly, is, Keith, you were just kind of tired of managing PMs. You wanted to just work on something yourself. You wanted to work on something impactful, away from corporate BS. And you basically just started looking for something that was impactful, important, and you found this. Talk about just how it all came to be, the beginnings of the story.
B
Yeah. So, I mean, for me, the beginnings actually go back to why I joined what was then Twitter in 2016. I had a startup and we'd had some acquisition offers, and one of them was from the company, Twitter. And it was 2016, it was the middle of the election between Donald Trump and Hillary Clinton. And there were like something like three televised debates, but every day there was a debate happening on Twitter. And it was very clear, like, this is where people are talking about these things that matter, where information is being shared, where ideas are being formed. And as a user, it was obvious that I could get good information there, but it was also obvious that there was questionable information floating around. And I remember just looking as an outsider thinking, wow, this is a really hard problem, and it also seems really important. So we ended up going to Twitter, and the company was in a turnaround at that point. So, like, my first three years was just helping to get the company growing again. You know, working on everything that was the consumer product, you know, getting user growth going back and people wanting to work there again, et cetera. But a few years in, I was reflecting on what we had done. You know, I think we had done a lot of good work getting momentum going, but people in the US and in the industry had tried things to kind of deal with misleading information, and nothing was really working. Like, it was obvious nothing was working. Nothing could handle the scale of the problem, nothing could handle the speed. And a lot of people just didn't trust the existing approaches. The existing approaches were either fact checkers or internal trust and safety teams making decisions about what was or was not misleading. And like, a lot of people just didn't want or trust that to be the way this was decided, which is very reasonable. And so, you know, I'm looking at that. I was still managing a large PM team.
You know, that's a whole story in itself. I. I felt like I would. That job required a lot of energy in. And I. And I didn't feel like I always saw the output that I wanted to see from it. Like, I didn't see the change in the product I wanted to see. And you know, I was contemplating, should I go start a company? Should I do something else? And I kept coming back to this problem. I'm like, man, like the. How is the world going to deal with the. With this information quality issue of like, what we get on social media or wherever we get it. And like, you know, I'm at, I'm at this company where you can make a difference on this problem. Like, why not go and try some crazy ideas and see if, like, one of them might work? And so I came back, I had a kid. I came back from paternity leave. I went to my boss, Kayvon. I was like, hey, Kayvon, how about I just stop doing my job and I go work on this instead? You know, this being try some crazy ideas to see if we can deal with misleading info. He was stoked. And so I went off and started working on that. You know, it started with just reading any research I could on the problem and existing solutions. What was or was not working or what were the issues, and then into prototyping. And then, you know, it ultimately led to us building and piloting this idea that became Community Notes.
A
Amazing. Okay, I have so many questions, and we're going to keep going through the story, but when you joined Twitter, what was kind of the. It was called Twitter at this point. I'm going to try to call it X now, which I know is important to your boss. What era of Twitter was it at that point? Like, was it Kayvon-run when you joined? And who was the CEO? Because there's been many.
B
So, okay, yeah, I started. I came in December 2016. So Jack had relatively recently come back as CEO to turn the company around. And just to give you a sense of like, the state of the company, something like a third of employees were leaving every year. So just imagine that like a third of your team gone every year. You know, the stock was in the toilet, the product was not really growing. And so Jack was working on a turnaround and Kayvon was there already. Kayvon was running Periscope with a bunch of video stuff. And you know, that. That group continued to, you know, Jack. Jack was there up through the start of the Community Notes, then Birdwatch project, and. Yeah, okay.
A
And it was called Birdwatch. I don't think we've used that term yet, but that's an important point. It was called Birdwatch initially.
B
Yeah. So it was originally called Birdwatch when. When we started the project, but obviously, somewhat famously, the name changed along the way.
A
Yeah, maybe let's just tell that story real quick. And I know we're zooming in more, but I have this Twitter thread that I saw between Jack and Elon where they're debating what to call it, and Elon's like, Birdwatch sounds creepy. I want to change it. Is there anything there you can share?
B
Yeah, the story there is kind of funny. Elon came in, acquired the company, and we had just launched the product relatively recently in the US. It had been in pilot for a year, but we had just made it available US-wide. And I guess he'd been seeing the notes, and soon after the acquisition, he DMed me and he was like, hey, this Community Notes thing is awesome. And I was like, oh, I'm glad you like it. Let's talk. And so we talked the next day and he kept referring to it as this Community Notes thing. And I was like, you know, it's interesting that you keep calling it that, because that's actually the very first thing that I called it. Like, the very first Figma mockup I made depicting this thing was called Community Notes. I don't know why, it just felt really natural. And so that's the first prototype we had tested. You know, later the project changed its name to Birdwatch. But, you know, Elon was like, hey, let's just call it that. And so the next day we just changed the name. And, you know, it's always notable for the team when you change the name, but really the team was excited about it. I think it is a much more understandable name. Jack has made fun of it, calling it like the ultimate Facebook name or.
C
Something like that, the most boring Facebook name I've heard.
B
Which is funny because they're now launching Community Notes. But I think it is a very understandable, intuitive name, and I think it has served the product really well. There's a reason it was the name in the very first mockup.
A
Yeah, I think descriptive names just make sense. This connection with Elon, I want to talk later about just how you've dealt with so many strong personalities and kept this alive throughout so many changes. But before we get to that, you did something that I think a lot of product leaders, eng leaders, just people that have managed people dream of: give up all this power, in air quotes, and career trajectory and influence and just, like, forget all that, I'm going to go back to just building something awesome with a small team. Is there any advice you could share from that experience that you think might be helpful for other leaders to hear, to help them maybe make that same jump? Because that's really difficult in practice. Easy to talk about, hard to do.
B
Yeah, I think it is a difficult jump. I've done it a bunch of times in my career and I've always been very happy with it. Where I started with a small team and it kind of grew into something bigger. And then I was like, you know, we're kind of dealing with a lot of big production stuff, the team's really big. I want to go back to doing something, like, crazy and new with a small team again. And so I've kind of done that, like, sawtooth leap a bunch of times. But it can be hard because certainly the classic career path sort of, I don't know, rewards, you know, running a large organization or being a manager, things like that. But I think at the end of the day, you got to work on stuff you love, you got to be having fun. I think people want to be having impact, and I think there's one myth that can get in people's way: the idea that the more people you manage, or the larger your scope is, the more impact you have. I definitely do not think that is true. I mean, look at Community Notes, for example. If I had stayed running a large consumer PM team, like, what would I have produced? Like 16 more pages of OKRs, like, I don't know, you know, a bunch of documents. And I think building Community Notes has had way bigger impact on the world. It's become the industry standard for how to deal with this now, which is super cool. People love it. It's the first thing that is plausibly dealing with the Internet-scale issue of information quality. I think it's unquestionably a bigger impact than I would have had if I were just, whatever, doing some standard management track thing like I was doing before. And I think that's true of so many other small companies and startups. I was just reading, someone screenshotted it, I think it's Blake Scholl's LinkedIn, the other day, who went from director of coupons or something at Groupon to building supersonic jets. Yeah. 
And those stories are everywhere when you look. And so I definitely have found that for me. I love building hands on. I love trying crazy new ideas. I love the zero to one experience. It's fun to scale things up too. And it can be fun to operate it, you know, at scale. But, you know, this team is a good example of one that operates at a very large scale, but that is still very small.
A
Yeah, I think the way you guys operate is what more and more companies are trying to do. Remove middle management layers, create small teams that just execute and build impact. And just like ICs. Whenever I say IC, I have a comment on YouTube where they're like, what is IC? So I'm just going to explain. Individual contributor, non manager is when I say the word ic. So let me follow this thread. And when I asked people about how you set up the team to operate effectively and protect it initially there's this term thermal that came up a lot. It was like a thermal team, if that's how you describe it.
B
Yeah.
A
What is thermal?
B
Yeah. So anyone who's worked in a larger company probably knows that things can get kind of bureaucratic or bogged down. Decision making can be slow. Like there's these large planning cycles. People can like try to like take someone from one team, move them to another, like at random arbitrary times that can disrupt a project. Like all sorts of things like that. You know, our company, this is, you know, a number of years ago when we started this project, we had a lot of founders in the company. Like Kayvon is an example of founder who is helping to run the company. And he had this idea like, hey, why don't we create this program, called it thermal, where we could have teams that were somewhat isolated from that they could run through their own process. They would have like one clear owner, the team would be entirely dedicated to that project and we would just sort of like repeatedly make funding decisions as to whether to continue the effort.
A
And so why was it called thermal by the way? What was the idea there?
B
I think it was like an old bird analogy of, like, thermals lifting, you know, the birds on their wings. Twitter 1.0 obviously had a lot of bird analogies, bless its heart. And so, you know, that was one of them. But, you know, I loved the idea as someone who liked the startup environment. And so when we were starting this project, I was like, hey, Kayvon, like, why don't we make this the first Thermal project? And he was like, yeah, let's do it. And so we started with that way of operating and it gave us from day one a lot of freedom and autonomy that I think was really important to make the product work.
A
So just be very specific about what makes it a thermal project. How do you set that up? And this is asking from perspective, if a company wants to build their own something like this, what does that look like?
B
Yeah, I think there's a bunch of key attributes. So one key attribute is there's one clear driver of the project who's effectively like founder, I guess. I mean, maybe you could have two or something, but like, it's like really clear. There's like driver of the project and also there's one clear decision maker that they go to.
A
Oh, outside of the team.
B
Outside of the team. And that was true back when we started and it is true now. Like if we need something or have a question about something. I talked to Elon and it was, you know, it was like that from the beginning. It's like that now. And I think that's a big reason we're able to make decisions effectively quickly in a simple way.
A
And it probably has to be someone very senior. Not.
B
Yes, it needs to be someone senior who can make the decisions you need made, whatever they are. So I think that's really important, that clear decision-making structure. Another was 100% focus. So everyone on the project is expected to be totally focused on it. At a lot of companies it can be easy to have people's attention sort of spread across a bunch of things, and it makes it hard to get stuff done. Like, you'll talk to whoever that person is, you'll ask them for help on something and they'll be like, yeah, I'll help you. I gotta finish this thing, you know, and it'll take me like a week or two and then I'll get to it. And like a week or two delay totally changes the momentum of a project. When, you know, we were 100% focused, we talk in the morning, it's like, hey, Jay, why don't we, like, try this thing on the algorithm? He's like, yeah. Then, you know, that afternoon or the next day we're looking at results. And so because of that total focus, the rate of iteration goes way up. And then, you know, beyond that there was also just the ability to use whatever our own sort of decision-making process was. We didn't need to write OKRs or, you know, follow other, like, standard practices. Obviously, we had to make sure we were responsibly building the product and everything. But we didn't need to use the standard practices. And I think that's another great example. Like OKRs, I understand why they can be helpful, but they can also be, you know, not necessarily the right cadence at which to set goals. Like, I think it's really unclear that quarterly or annual goals are actually the right pace. Like, we would set the goal for the next milestone that mattered and we would work on that. And when we reached that milestone, we would have an idea of what was coming after. 
And then we, after, when we hit that, we'd set the next milestone, whether that was two weeks, a month, three months, like what, whatever it was, like we set our own pace and goals at that pace. And that just, I think is a lot more natural for the development of something.
C
The, the whole OKR determination and planning process took longer than it would take us to pick a goal and then execute on it and finish it.
A
How big was the team early on that you set up? How many engineers?
B
It started with just me and then we, when we decided to build the thing, we figured we needed about five and we wanted to be as small as we possibly could. It was clear we needed someone on ML doing scoring. It was clear we needed someone to do some client engineering work. Someone to do backend engineering work. There may have been like, you know, one or two other. Oh, we needed a designer and a researcher to help us understand the customer base and make sure we were, we were building the thing in a way that was actually going to resonate with people. And so I think that was, I think it was like backend, front end ML design research. That was the original team from what I remember.
A
Amazing. So one, basically one of each function. A question I have for Jay actually is there's all this talk of small teams and moving fast, but, you know, sometimes you just need more engineers to build the thing. Is there anything you've learned about just how to keep a team small while moving as fast as you are and not need, we need to hire more engineers. We need to hire more engineers.
C
I think in the beginning, when we were iterating on, you know, what should even the requirements be, it was definitely good to just have, you know, like, one ML engineer. But I think at some point we got clear on what the goals of the algorithm should really be. I think at the very beginning it wasn't clear that we needed to build this bridging-based algorithm. Right. The actual first algorithm that I put into production was very focused on anti-manipulation. It was this kind of PageRank variant. But it didn't solve the problem of, you know, bias, basically. So if there are more users on one side, a PageRank-type graph algorithm can actually amplify those biases. So I think, you know, after building that prototype and getting data from that, it was clear that the bridging-based algorithm was going to be the way that we needed to solve it. And at that point I set up a bake-off, basically, you know, kind of like a Kaggle competition or something. So that was the key time where it was really important to pull in other engineers.
A
That is such a cool story. I want to follow that thread before we do that. You mentioned you guys yell thermal. What does that mean? Is that like yolo? Like a version of okay, we're just going to ship because we're thermal project.
C
Ship it.
A
Okay, so coming back to this algorithm, this is actually really interesting, because I've never heard any of this. I was going to ask just what inspired this actual algorithm. And you basically did an internal competition amongst ML engineers to see who had the most successful algorithm. Netflix contest style, Kaggle style.
C
Yeah, yeah, I think so. I mean, this particular idea of finding, you know, content that is liked by people on opposite sides of a polarized divide, who typically disagree, you know, this was not an idea out of thin air.
A
Right.
C
Like, I think Keith had found some of Chris Bail's work. He had, you know, made this list of accounts that were often liked by people who were on both sides politically. There are, you know, other projects like Polis out there that look for agreement among people who typically disagree. But I think that, yeah, it wasn't obvious that our project definitely needed to use that from the very beginning. But then, you know, when you implement it and compare it against these other types of algorithms. Like, PageRank is designed to be kind of manipulation resistant. If you just have a voting ring of people who all vote themselves up, then PageRank can filter that out very well. But that just wasn't the main attack vector, I guess. So we had to get some real data from the pilot to realize that, okay, the real thing going on here is people are polarized. And so it was only once we got the real data from the pilot that it was clear that the bridging-based algorithm was the direction we really needed to go.
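To make the contrast above concrete, here is a minimal sketch, not the production Community Notes scorer. It assumes each rater carries an explicit viewpoint label ("side_x" / "side_y"), which is a simplification for illustration only; the ratings, names, and both scoring functions are hypothetical. It shows how a plain majority score can favor whichever side has more raters, while a bridging-style score only rewards a note when raters on both sides find it helpful.

```python
from collections import defaultdict

# Hypothetical pilot data: (rater, rater_side, note, rated_helpful).
# Three raters are on side_x and only one is on side_y.
RATINGS = [
    ("u1", "side_x", "note_a", 1),
    ("u2", "side_x", "note_a", 1),
    ("u3", "side_x", "note_a", 1),
    ("u4", "side_y", "note_a", 0),
    ("u1", "side_x", "note_b", 1),
    ("u2", "side_x", "note_b", 1),
    ("u3", "side_x", "note_b", 0),
    ("u4", "side_y", "note_b", 1),
]


def majority_score(ratings, note):
    """Fraction of all raters who marked the note helpful."""
    votes = [helpful for _, _, n, helpful in ratings if n == note]
    return sum(votes) / len(votes)


def bridging_score(ratings, note):
    """Lowest per-side helpful rate: high only if every side agrees."""
    by_side = defaultdict(list)
    for _, side, n, helpful in ratings:
        if n == note:
            by_side[side].append(helpful)
    if len(by_side) < 2:
        return 0.0  # no cross-side evidence yet
    return min(sum(v) / len(v) for v in by_side.values())


for note in ("note_a", "note_b"):
    print(note, majority_score(RATINGS, note), bridging_score(RATINGS, note))
```

Both notes get the same majority score, but only note_b, which someone on the smaller side also rated helpful, gets a nonzero bridging score. A real system would have to infer viewpoints from rating behavior rather than rely on labels like these.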
A
I want to come back to the way you operate the team. I hear that you run the whole team off a single Google Doc. That's like a four year old doc that you just keep adding goals to and bullet points. Is that true?
B
There is a very long running doc that has had to be chopped and purged because it was breaking Google Docs and Chrome at various points in time. It's sort of like a note taking doc. It's really where we coordinate what we're doing. The team meets on a daily basis. We spend whatever amount of time we need to get on the same page about what we're building. It can be, we might talk about anything from what's most important right now to what should we work on next to what are we trying to launch right now and why is it not launched, what's in the way of launching it. And we might review new modeling or scoring algorithm update and try to understand what's Working in it, what's not. So we'll just cover whatever we want and. Or whatever feels most important. And as you said, we set our goals very dynamically. So whatever seems like the most important thing for us to work on now and next is what we spend our time on. I think that served the project really well versus feeling attached to like some kind of quarterly goals or something. Like we'll look at like what is going to help people the most or like what's the biggest problem right now? What are either one of those and we will go tackle it and we can. We might change our roadmap, you know, multiple times in two weeks based on what we see.
A
So I'm hearing no Jira, no Asana, no Monday.com?
B
No.
A
Okay.
C
Yeah.
B
I mean, we have to use Jira to, like, coordinate with some other teams. Like, sometimes when we file a request we have to make a Jira ticket. But no, I am not a fan of heavyweight task management. I love, like, being on the same page, being able to keep most things in my head and having a really light way to write down the things that, you know, I can't or the team can't keep in its head.
C
We did use Asana briefly, but my memory of it is that you spent more time in the meeting grooming a backlog of irrelevant stuff than actually, you know, talking about the proper priorities. So I think it's nice in the Google Doc that if something becomes irrelevant, it can kind of just fall off without needing explicit backlog grooming.
A
So just to maybe summarize a little bit of how you guys operate that might inspire other companies to set teams up like this. So I'm going to go through a few things that you shared. One is one person in charge of the team, like the founder almost. They're basically the founder of the team. They have one very senior, essentially, sponsor slash decision maker that they interface with. In your case, Elon, no big deal. In other cases, it could be the CTO, CPO, someone like that. The team is focused 100% on this product and goal. You keep the team very small. So you start with one person of each function. One front-end engineer, backend, ML person, designer, researcher. Yeah. And then Google Docs, basically, for your project management. Is that roughly right? Like, basically run it with Google Docs. Don't use big complicated products.
B
I think that's a pretty good recipe on the Google Docs. You know, take it. People can do what they want if they want. Thumbnails Go for it. I think those, those first ingredients are really, are, are key structurally. And then you know, beyond that it's a matter of having an ambitious goal that gets people fired up to go do great work.
A
Yeah. Awesome. I think there's a lot there that a lot of people kind of like think they should do when they set these teams up, but they don't actually do. And it feels like each of these is just a really key ingredient to actually succeeding.
B
It definitely really helped us succeed. I don't know that the project would be here if it was not for some of those elements.
A
That's a powerful statement. Like this thing that has changed the way the world understands what is true would not have existed if you didn't set it up in this specific way.
B
Yeah, I think, you know, I, I don't know if I would have, would have begun the project had I not known. We had sort of that structure, that ability to make decisions, the autonomy, the, the speed, the ability to go fast and you know, working. We, we started with that in 1.0 and, and it's been continued and if anything furthered in, in X, I mean X as a whole company operates with a lot of those attributes and I think it's, it's one of the reasons the product is successful and I think it's a big, those are big reasons why at least I, Jay can speak for himself. I have so much fun working on this. Like I, I, I love working on it. You know, it's great to wake up every day and solve these problems. We get to, you know, we get to do them efficiently, make decisions quickly, build stuff that helps a lot of people. It's, it's awesome.
C
Yeah, this, like, whether Thermal or Elon way of operating is definitely more fun. And the fact that, like, that combined with the awesome mission is super important for internal recruiting. Like, I remember when I was first chatting to Keith about this back in early 2020, you know, I had another project I was, you know, working on, a few, but one was personalizing the number of push notifications that we send, and it drove a lot of DAU without, like, significant opt-outs. So, you know, if I had kept working on that I could have probably gotten a promotion from that with low risk, or I could take this huge career risk. I mean, it's not as big a career risk as, like, joining or founding an actual external startup, but there is still career risk, I guess, in joining a team like this. So I think all of the same aspects of recruiting that apply to external startups apply internally, and, you know, if you can have an exciting vision, that is key.
B
Related to that and your list, Lenny, one thing we missed that's super important is that on this project, and I think on successful projects like it in startups, people are self-selecting to join. We did not assign anyone to this project. Like, people reached out to join or they applied to join. You know, I and the team interviewed every single person that joined the team and we're like, we want that person on the team, they want to be on the team. And so people are totally bought in to the goal, the mission, the way the team works, the other people they're going to be working with, and that makes a huge difference. So, like, a great time to do that is at the start of one of these things. If you're going to try something crazy, it's going to be tough if you're just assigning random people to it. But if you let people opt in and self-select, you're much more likely to be successful. And one thing that I have observed at X which really surprised me was that this is also possible at a large scale. You know, one of the things Elon did when he bought the company was he basically asked people to self-select to stay. Like, you had to click the button, and he sent an email out that was like, hey, Twitter 2.0.
A
Like fork in the road.
B
Right, fork in the road. Exactly. It's like, Twitter 2.0, you know, now X, it's going to be hardcore. We're going to do ambitious things. You're going to work your butt off. And you had to click on the form and say, yes, I want to join. And I think that was really important for the company because you want people to opt into that. You want the people to be saying, like, yeah, that's what I want to do. And the company is going to be a lot more successful. If people aren't sure, it's probably better for them to go do something else where they're naturally more aligned and happier. And I thought that was a great approach to taking a large company and getting it down to people who are really excited about working together on a mission. So for us, we did it from day one, which I think is an easy way to do it, but it's possible to do it later as well.
A
I love that you Described it as fun. And I think a lot of people, when they see Elon laying off a bunch of people, being very hardcore himself, people don't imagine it as a fun place to work. And it's clear how much you guys love working on this, like, how fun it is and how interesting it is. And it's interesting to hear that because I think a lot of people don't feel that externally. Is there anything else along the lines of just working for Elon within an org Elon runs that might surprise people about just the way of working that's interesting or surprising, or you think other companies might want to think about adopting?
B
I've always liked lean teams, but this has made me. My experience at X has made me change the way I would. I would think about running a future or, you know, if I were to start a company and change the way I think about starting that company, it would be even leaner than I would have made it before. I've been amazed with just how much the team is able to accomplish with a small group. And I think because of a small group, like when shortly after the acquisition, you know, we had this product called Spaces. It was. It had been in the product before, but it was. It was pretty small scale. And Elon wanted to run these large spaces. I forget who the first people he was going to bring on were, but he was going to be there. You know, ultimately these things have gone on to host politicians and things like that. And he's like, guys, we got to scale this up. I forget the number. He's like, we need to be able to scale, like a million people or something like that. I'm getting the numbers wrong. You need to be able to scale way up. This is the kind of thing at 1.0 that would have taken a year if it had ever happened. And the team did it in like.
A
Two or three weeks.
B
And it was really exciting and inspiring to see. Like, we. I didn't work on that, but I watched it from the outside. I'm like, wow. With this tiny team motivated behind a big goal that was like, hey, guys, it's not like, are we going to do this? It's we are going to do this. They got it done in two or three weeks. That must have felt amazing for them. It was certainly exciting to see, but I have definitely come to appreciate just how lean something can be and not just get by, but actually thrive because it's that lean.
A
I think the point you made about people opting into that is important because I think a lot of people hearing that would be like, I would never want to be asked to build something like that in two weeks. And I think a lot of people do, and would love that kind of experience, especially working with Elon, especially shipping something at that scale. But I think there's an important element there. Just like, okay, I don't want to do that. I have other things to do in my life other than ship Spaces. So I think that's a key point you've raised, just that there's an opt-in step.
B
Totally. I think the opt in is important and it may even be that you want to opt in one part, you know, at one point in your life and maybe at another point in your life something else is better. I think, you know, whatever it is you're choosing to do, it's nice to be opt in, to feel like it's aligned with how you want to spend your time.
A
Something on my mind, and I don't know if you guys want to go here, but it's something I think a lot of people think about is when Elon came in, he let go of 80% of folks and everyone's just like, Twitter is dead. It's all going to fall apart. There's no way they can run this thing with that small of a staff. And clearly they were wrong. Clearly it's working great. It's like becoming a massive deal in the world and continues to grow. Is there anything about that that you were surprised by or anything about just like how it, it continues to operate so well in spite of that big shift?
B
I think the, the, the leaner team, the, the reduced kind of like process and bureaucracy is the big reason it does move as fast as it does. It's easier to get stuff done faster here. And yeah, I mean, I think that's, I think, I think it's that, that shrinking is actually a big reason for the increased pace of launches, the increased pace of experimentation. One thing that I noticed that as a result of that is the people who are here, they seem to all really feel like owners, like they take the sense of responsibility that an owner takes in the product. They'll try to track down what's wrong, fix whatever is needed, jump into any to help build or fix, improve any system that needs help, even if it's outside of their space. And there's the flip side of that too. For people who've worked at big companies, they may have experienced this thing where there's like, you want to change something in some other system or product and so you reach out to that team and maybe they're a little resistant. They may be like, oh, we'll get to that next quarter.
A
They have their own goals to hit.
B
Yeah, yeah, exactly. Like they don't really necessarily want to help you or they're busy here. You're like, hey guys, we need to do this thing with that other system you work on. And they're like, great, here's the code, here are the docs, send us the fab if you have any questions and we'll get it in. And it's just the thing, you can just jump in and get it done. And that kind of collaborative effort, like the sense of shared ownership, I think from my experience came from was a result of the shrinking of the team down to people who, you know, wanted to be there and work together to build this thing. So I think that's been a really positive impact. It's not always easy, certainly like a lot of people have a lot of responsibilities, but, you know, they're here because they're up for it.
C
Yeah. I think one other thing that's key is when you are forced to have such a small team, you know, well, this is important anyways, but deleting code is more important than writing it a lot of the time. So often, maybe due to promotion incentives or just regular human tendency, engineers have a tendency to add these little incremental wins that actually add more of a long-term maintenance cost than is clear. Because you just run a little one-month A/B test, you see this significant win, and you don't realize the maintenance burden you just added to your team for the rest of eternity, until you turn the thing off. So I think there's a lot to be gained, and you get forced to do this, by the way, when you have such a small team, from auditing parts of your system and deleting the things where the maintenance cost is worse than the gains. So I think we did have to do this across the company after the big layoffs. And, you know, systems are leaner now and they can be worked on by fewer people.
A
That's an amazing point. I remember Elon being like, here we have to throw away the whole thing. We have to re-architect everything. It's stupid the way it's built. And it sounds like that actually worked.
C
Yeah, well, you don't have to rewrite everything from scratch. I mean, some things we did, I guess, rewrite, but I mean just even deleting the unnecessary cruft and keeping the rest of the core system.
A
That's awesome. I love that we're creating kind of a formula to run these sorts of companies and teams. There's so much here. I want to go back to the building of the original product. That took us on a long tangent, and an amazing tangent. But I heard a story of when you launched Birdwatch. At that point, you specifically wanted to keep expectations very low, and there was like a gif in the thing, and it just looked like, clearly this is not ready for primetime. Talk about just how you did that, how you launched it in a way where people weren't like, it's never going to work.
B
We were very disciplined, I guess you could say, about having the product prove itself at every given point. You know, when we built the first mockups, these were just pictures depicting what Community Notes might look like. We showed those to people across the political spectrum. We saw, like, hey, people really like these, whether they're on the right or left. They seem very open to reading these community notes, even when they're critical of people on their own side. So we're like, all right, that gives us confidence that if we can build this, if we can actually make this a reality, it's going to work. Then there's a question of, like, can we make it a reality? Will people in the real world be able to write notes that are of this quality? And so, you know, we had an internal pilot test version of this where you could write notes. And we first basically ran this through an Amazon MTurk type of test just to see, if you just put some normal people in there, will they be able to write these notes? And not all those notes were good, but it was clear that there were people out there who could write good notes. So then we're like, okay, this is possible. What will happen if we actually do this out in the real world? Let's run a pilot and find out. And so we took that pilot that we'd run the MTurk kind of test on, and we released it to, at first, a thousand people, totally out in public, and we didn't know what was going to show up. You could imagine the notes could have been terrible. And so we were talking like, well, what do we do? We're going to put this out there. Everyone's going to have all these questions. They're probably going to be really skeptical. And we know it might be a total dumpster fire. And so, what do we do to set expectations appropriately? We felt like we could probably get there in the end, but we just didn't know it was going to happen at first. We wanted to set expectations, and so we're like, well, there's the page where you see a post in the notes flow. Why don't we just stick a dumpster fire gif on that page? And, you know, you go there, and it's like, hey, anything you see below here might just be a total dumpster fire. At least it would show we were aware of that as a possible risk. In the end, we did not do that. It cracked me up. But we thought it was kind of.
A
Like, oh, you didn't actually launch. Okay.
B
We had mockups of it, and every time I looked at the mockup, I laughed. But ultimately we had so much to explain on that page. Like, what is this thing and how does it work? Ultimately we're like, okay, this is probably going to distract from the point. So we pulled it. I kind of wish maybe it had seen the light of day at one point. But yeah, ultimately we kept it simple and we focused that page on explaining what was going on here. But again, as has happened many times with the project, we put the pilot out there and the notes were good. Like, they weren't all good. It was a mixed bag. But there was gold in there. And from the very early days, with just 1,000 contributors, it was obvious that people could write notes that were informative, that were neutral, that spoke to controversial, challenging topics, and that if we could just identify those from the rest, this was going to work. It was going to work as well as the very first mockups we had made. So that became the focus. That is, how do we sift out the gold from the rest.
A
I remember, I think you may have shared this with me, when someone noticed you guys were testing this and they took screenshots and tweeted it. And I think Elon replied, like, this is cool.
B
Yeah, yeah. So in the very early days when it was just a Figma prototype, we were running these usertesting.com unmoderated studies. And I guess one of the participants sent one to an NBC reporter who wrote a bunch of stories on it. Anyway, that day, there's a lot of chatter about it on the service. And Elon, for time perspective, this is, I think, 2020, so two years before any acquisition stuff happened. Elon is just a Twitter user building rockets and electric cars and other cool stuff, and stumbles on this thing that depicts the prototype that we've been testing. And he writes back, yeah, definitely worth trying, imo. And I remember thinking that was cool back then. And it's interesting to see, he's obviously had a very consistent point on it. I think the idea was appealing, and he's obviously been a big fan of it in the product and a big supporter and proponent. So, yeah, it was kind of cool that that support has been there from the very early days, before he was ever involved in the company.
A
I love that moment. It must have felt really wild for Elon to be commenting on this Figma prototype you were testing.
B
It was cool.
A
It was cool. Oh, man. So when we were preparing for this interview, I asked you guys, what's the main thing you want to make sure people get and understand about why Community Notes has been so effective? And Keith, you specifically said that it was the principles behind how you wanted to approach this and how you continue to stick to them throughout. And we'll talk about how you kept it alive throughout all these different CEO and leadership changes. But just talk about these principles, like what the actual principles are and why that was so key to it working out.
B
There are a number of principles that I think, when we first shared them with people at the company, seemed maybe a little bit crazy. But I think they are the reason the product works. And I think they've been very important. And we come back to them regularly today, all the time. Probably the craziest one is just that this thing is going to be the voice of the people. It's going to represent the voice of people. It's not going to represent the company's voice. So it is not a tech company deciding what shows. It is the people deciding what shows. And that had a lot of implications on the design. Like, first of all, we don't have a button that will change the status of a note. So if a note is showing because the people have rated it and found it helpful, it is going to show. We can't change that. And that is the kind of thing that, when we first proposed this, was unsettling to people. They're like, wait, so something can go up and, you know, the company can't take it down, or it can't change its status, get it to stop showing? And we're like, yeah. And it has to work that well. If it doesn't work well enough to do that, then it doesn't work. One of our key principles was, if there's a problem with a note that's so bad you want to do something about it, it's a problem with the system. We need to redesign the system to be showing good notes. And so yeah, we had to get everyone comfortable with the idea that there was no button to change the status of a note. Similarly, as we talked about earlier, we wanted this to represent all of humanity. And so we didn't want to be arbiters of who can come in and be a contributor and who can't. So we open it to everyone. You just have to meet really basic objective criteria, like you have to have a verified phone, to help reduce the likelihood of having bots or things like that participating.
But beyond that, it's random selection, and it still is that way today. And, you know, again, it took some time to get people comfortable with it. But I think that the fact that this is the voice of the people and reflects their output through an open and transparent process is so key to both why it is good, like why it works, but also why it's trusted. So I mean, that's number one, and I think it will forever be at the heart of the product. Another one that people thought was kind of crazy was transparency. The previous approaches to dealing with misleading info felt to a lot of people like sort of black box tech companies or media companies or elites or whatever making decisions. We're like, people need to get comfortable with this. They need to trust this. So the whole thing has to be out in the open. The code that decides which notes show has to be out in the open. All the data and ratings that make it happen have to be out in the open. People should be able to take the code and data and replicate the whole service and vet that we have done exactly what we've said we've done, and they should be able to audit it. They should be able to go and look and say, hey, I think this part could be better. Or if they think we're biased, they should be able to work with the data and point it out, and if people have good observations, that should factor back into the code. And this is again something that's kind of difficult to get people comfortable with, that everything is out there, you can't cover anything up. But I think that's so essential to people trusting it.
A
So.
B
Yeah, I mean, we set these out on day one. We go back to them constantly because we're always evolving the product, and we've always got to make sure every new change is open. Like whenever we update the code or update the scoring system, there's an update in GitHub, and the data is published daily so you can download it. And so yeah, I think those have been really essential to the thing working.
C
Yeah, and by the way, these do not come without a cost. Right? It's actually really hard from an eng perspective to open source the actual algorithm that's running on the actual data. You know, because the way large-scale services like this are usually architected does not naturally lend itself to being run as a script by someone who's downloaded a TSV. So we actually had to make weird architectural decisions to make this possible, in a way that probably wouldn't have been possible if we didn't start with this assumption from scratch. We would have had to maybe rewrite the system to make it like this.
A
What's an example of that?
C
For instance, there's a matrix factorization that we train. Usually you would train your ML model once and then serve it, I guess, with a separate service. But we didn't want to have people externally spinning up services to be able to replicate the system that we had. I mean, basically, I don't think it would have been actually very cool if we had open sourced the code in a way that wasn't actually runnable by someone. At this point you can download the Python code and run a script. You do need a lot of RAM right now, but you can do it on one machine.
B
Okay.
A
Okay. How much RAM we talking about?
C
Oh, like 500 gigs and it'll take like a day if you don't do anything special to speed it up.
B
Good to know.
C
But yeah, cool. Possible is the key thing, and people have done it. Like Vitalik Buterin had a blog post where he talks about his explorations, you know, making sure the algorithm really does what it says it does. And I think just the fact that a handful of people have done this, you know, there's enough people who have done it that there's someone you probably trust who's verified it.
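[Editor's sketch, not from the conversation: to make the "matrix factorization on a downloaded file of ratings" idea concrete, here is a toy version in plain Python. It assumes a model in which each rating is predicted as a global mean plus a rater intercept plus a note intercept plus the product of a one-dimensional rater factor and note factor, so that a note's intercept captures helpfulness that is not explained by which "side" the rater is on. The function name `fit_bridging_model`, the regularization weights, and the tiny data set are all hypothetical illustrations, not the production code or its parameters.]

```python
import random

def fit_bridging_model(ratings, n_users, n_notes, epochs=2000, lr=0.05,
                       reg_intercept=0.15, reg_factor=0.03, seed=0):
    """Toy 1-D matrix factorization trained with SGD.

    ratings: list of (user_id, note_id, value), value 1.0 = helpful, 0.0 = not.
    Returns (note_intercepts, note_factors).
    All hyperparameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    mu = sum(v for _, _, v in ratings) / len(ratings)  # global mean, held fixed
    user_b = [0.0] * n_users                            # rater intercepts
    note_b = [0.0] * n_notes                            # note intercepts
    user_f = [rng.uniform(-0.1, 0.1) for _ in range(n_users)]  # rater factors
    note_f = [rng.uniform(-0.1, 0.1) for _ in range(n_notes)]  # note factors
    for _ in range(epochs):
        for u, n, v in ratings:
            pred = mu + user_b[u] + note_b[n] + user_f[u] * note_f[n]
            err = v - pred
            fu, fn = user_f[u], note_f[n]
            # Gradient steps on squared error with L2 regularization.
            user_b[u] += lr * (err - reg_intercept * user_b[u])
            note_b[n] += lr * (err - reg_intercept * note_b[n])
            user_f[u] += lr * (err * fn - reg_factor * fu)
            note_f[n] += lr * (err * fu - reg_factor * fn)
    return note_b, note_f

# Hypothetical data: raters 0-2 form one camp, raters 3-5 the other.
# Note 0 is rated helpful by everyone; notes 1 and 2 each please one camp only.
ratings = []
for u in range(6):
    camp_a = u < 3
    ratings.append((u, 0, 1.0))
    ratings.append((u, 1, 1.0 if camp_a else 0.0))
    ratings.append((u, 2, 0.0 if camp_a else 1.0))

note_b, note_f = fit_bridging_model(ratings, n_users=6, n_notes=3)
for n in range(3):
    print(f"note {n}: intercept={note_b[n]:.3f} factor={note_f[n]:.3f}")
```

[In this toy setup the note that both camps rate helpful ends up with the highest intercept, which is the "agreement from people who have disagreed in the past" property described earlier in the conversation. A real run would read the published ratings data instead of a hand-built list.]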
B
Yeah, yeah.
A
And now it's rolling out to Meta. No big deal. I love, as you describe these principles, I could imagine a PM at a company being like, okay, guys, I want to do this project. There's so much idealism to it that rarely works in real life. It's going to be completely open source. You're going to give it to everyone. We don't have actual control over what it's going to do. Don't worry about it. It's going to just change the way people see this thing that we've been very careful about. And then it works. And I think that's very rare and it's really impressive. And what I'm hearing partly is that sticking to those principles was actually really fundamental to it working, and not bending over backward when someone's like, no, no, no, we can't do that. What if we change those parts?
B
I think if we had broken with any of those principles, like, if there was anything black box, if there was, you know, whatever, the product would be a lot harder to trust. And so I think it's because we've just stuck to them so cleanly and simply that, you know, people can trust it.
A
You've talked about a few moments when it was like, wow, the White House changed their announcement because of a community note. We talked about the dog is a cat. Are there any other moments after you launched of like, holy shit, this is working. This is gonna actually work.
B
All along, you know, we saw it working. Whenever we expanded it to new audiences or new countries or whatever, we wanted to be confident it was gonna work. So, you know, maybe you hold your breath a little bit just to see that it would do what we expected, but we always expected that. But that said, there were definitely stress cases. I mean, the one that comes to mind is the start of the Israel-Hamas conflict in October 2023. That was probably the largest deluge of misleading information I've ever seen shared on the Internet at one time. It was overwhelming. The number of photos and videos and whatever coming out related to that was insane. And just to give you an example, in the first, I think it was like the first three days or something of that conflict, we had 500 notes covering all sorts of different, you know, out-of-context imagery. Like someone said, hey, this is happening here, and it's actually from 2013 in Syria. There were people making fake battle footage in the video game simulator Arma 3, so there were notes explaining that. The stuff looked really realistic, and unless you saw the note, you wouldn't really know. There were all sorts of claims about what was going on on the ground. And the product was still pretty new at that point. We'd expanded in the US less than a year before that. We had been rolling out throughout the world that year, and then this large event happened, and I felt like we were just enough prepared at the right time for the system to be able to handle that. Probably one of the most important things we did right before that was launch the ability to write notes on images and videos and have those matched to other posts.
I remember at that time thinking, wow, I'm glad we launched that feature a few months ago versus still having it on the shelf, because it was really important in that conflict. And I think it was even just a few weeks before that we had launched a major speed-up in notes too. When we first built the product, the number one focus was always quality. We knew the product would live and die by the quality of the notes. That was the thing we could never give up on. We also knew it needed to deliver speed and scale. But we're like, we will first get the quality in the right place, and we can speed it up and scale it out over time. And we had actually just launched a speed-up that took like three hours off the time a note needed to go live. And it was, I think, a matter of weeks before that conflict happened. So again, super glad that was out there. In the first few days of the conflict, the median time from a post going live to a note showing up was five hours, which is crazy fast. With typical fact checking, it's really common to see it take two to four days. These notes were showing up in five hours. And we're like, we are so glad we got those things out before this happened. It made the service a lot more helpful.
C
One other thing that was, I think, nice to see working then: one criticism of Community Notes some people bring up is, well, if you always need agreement from people who typically disagree, then in these super polarized settings, that conflict being probably number one, you wouldn't see any notes. But actually the reality was there were tons of notes about that conflict. So I think there is this kind of nice property, and maybe this is a surprising fact, that there's more agreement out there across polarized divides than maybe conventional wisdom says. Right. And the places where people agreed were really objectively true and verifiable. I guess maybe this is more true the more polarized the setting is, but the agreement actually lands you in notes that are very neutrally written, very focused on the facts and easy-to-verify information.
A
I love this. There was this talk for a while of just like, there's no more facts. Nobody believes there is a single true fact anymore. Everything is subjective. And I think Community Notes proves the exact opposite. Yeah, facts matter. There are facts that we can all agree with, even in the most controversial topics.
B
Yeah, yeah, we saw this really from day one. When we would show those prototypes to people, just depicting the idea, it was really obvious that people cared a lot about understanding reality and what was going on. And they were willing to disagree with their side, so to speak, to recognize that. And I think that's not always that obvious to people. The world does feel really polarized, but people definitely are willing to cross partisan boundaries to get to, you know, accurate information. And that's why the product works.
A
It feels like more and more of what we know and understand about the world is coming through social media online, and moving this quickly. I'm so thankful this exists, because otherwise it would just be, what do we trust anymore? This coming out aligns with us needing this thing to exist. And it feels like at the same time, I think people have shifted from I trust what I read to, okay, I shouldn't just believe everything I'm reading. Is there anything there you're noticing about how people think about news they see, and their shift to I'm not going to believe everything? Is there anything that you've noticed about human behavior, just the way we've shifted in understanding what is true?
B
We haven't done any research, you know, to look broadly at how people's perceptions are changing there. But I certainly have found myself that, particularly seeing notes, I am more skeptical about what I read at first, and I think that's been helpful. And we hear that from people, that they think about things a bit more. And I think that's a good secondary effect and benefit of something like this, which is the more you see the patterns of how what you're reading can be wrong, the more you can thoughtfully question it and try to get a better understanding of what's really going on. So, you know, historically, I think this was called media literacy, but the basic idea is, can you understand the ways in which things can go wrong and try to vet them yourself?
C
And another aspect I think we help with is discovery of the community notes. Before Community Notes, you could have just been living in a little news filter bubble, or maybe there were fact checks out there that you should have been reading, but you weren't discovering them. Right. So the fact that the note is directly attached to the post and visible to anyone who sees the post helps cross those filter bubbles. I think for some people, it's the first time they've actually seen counterarguments to claims made in their own little echo chamber.
A
That's incredible. Yeah, I love the point you're making about how it actually teaches people to be a little more skeptical of the things they read. It's an education system, more than just here's this one thing that is wrong. I love that. Okay, just a few more questions. There was an audience question. I asked on Twitter, what do people want to know about Community Notes? One was actually why you guys switched to anonymous contributors. What was the decision behind that?
B
Yeah, you know, we had this pilot where we were testing this with a small number of contributors, like a few thousand contributors. And we learned a lot through that pilot. Probably the biggest thing we learned was related to anonymity or pseudonymity of contributors. We had originally assumed that it was important that people contribute under their real handle or their real name or whatever it was. The first prototypes actually depicted that. We kind of thought that would be important for people trusting the note. And actually it was totally wrong. The best option was actually the opposite of what we first tried. We found a few things. One, people were hesitant to write a note on a controversial topic because they didn't want to get attacked or harassed online. Some people were comfortable doing this, but others were not. And so it meant there were more potential good notes to be written than were getting written. This was very clear feedback from the pilot. Two, and this is super interesting, people are actually more willing to cross partisan boundaries when they are anonymous or pseudonymous than when they are under their real name. And it intuitively makes a lot of sense. If you are publicly using your name and you feel you are affiliated with one side versus the other, you might hesitate to be perceived as breaking with that side. But you may actually, for example, find a note helpful that's critical of that side. And there's a bunch of studies that show when people are anonymous, they're much more willing to cross partisan boundaries and work with the other side, agree with the other side. And we saw that too. And so by allowing people to just be pseudonymous, you actually get more honest answers about what they really think and it helps find disagreement.
A
That's really so counterintuitive. You know, you always hear the opposite. And it's so interesting that it's the opposite.
B
Yeah, yeah.
C
I think this, the, the same principle applies to making the likes private.
A
I was just thinking that. Yeah, yeah, I like a lot more stuff that's a little.
C
Yeah.
A
Stuff I wouldn't have liked.
B
Yeah, it allows freedom for honesty, which is pretty great. And one of the criticisms of pseudonymity is, you know, it can maybe lower the quality of what people put out there. But we have so many quality mechanisms in the system that that wasn't an issue. So we could keep quality high while opening up for that honesty.
A
Another question, and you touched on this a little bit, is around navigating the existing trust and safety apparatus of Twitter, which, as you described, previously was basically like, we make decisions on what is true and not. And every company works this way. You guys basically upended that, like, here's a completely different way. You have no control over what we say is true or not. Talk about just that experience of overcoming that, I imagine, very difficult hurdle of like, okay, forget all that, we're going to do it totally different.
B
Yeah, it was definitely what we were proposing was very different. I will say that I think people were sort of open minded to it, generally speaking. And I think everyone had a sense that what was being done at the time wasn't really working that well or solving the problem. And you know, people were open to new ideas. So that was, that's like a good foundation. But I think one thing we did that was probably very helpful and that is we wanted the product to prove itself at any point. Like, first it had to prove that people could possibly, you know, find notes helpful. Then it had to prove that people could possibly like write these notes that would be good quality. And so anytime that we were proposing doing something with the product, like running some research test or running the pilot or expanding the pilot, we always had the data that had convinced us that that was a good decision, like we were stepping into the next phase of expansion that made sense. And so I think we probably rarely proposed anything that seemed unwise because we were holding such a high bar for quality ourselves. And I think, I suspect that went a long way.
A
So partly what I'm hearing is, take it step by step to prove this is actually working, and partly be confident yourself that it is working before you try to convince, say, the trust and safety team that this is the way to go.
B
Exactly.
A
Was there a moment along that journey where it shifted from no way, this is a thing, to okay, wow, let's actually consider this? Or was it this very gradual process?
B
Whether other people were saying no way to this?
A
Yeah, just internally, of just like, okay, we're gonna actually completely stop this trust and safety way of operating and instead rely on Community Notes. Was there a moment of like, okay, let's actually make that switch? Or is that Elon, actually? Is that the big switch?
B
The biggest change there happened in X. The biggest changes prior to that were just the decision to put this out there and, you know, have it be operating in public, at first at US-wide scale. But yeah, then the bigger switches came in the X period.
C
I think even though there was original research from external researchers, before Birdwatch or Community Notes even started, showing that crowdsourced lay people can do about as well as fact checkers, and actually the agreement rates were kind of similar between the groups, even though that research was out there, I think there were definitely a lot of people who didn't really believe it could work until it already worked.
A
Basically prove it. Prove that it works. And yeah, that makes sense, versus just a bunch of docs and strategy and thinking. It's just like, look, it's actually working. You can see for yourself. Makes sense. Okay, possibly last question. We'll see what fractals of questions you guys bring up here. I've referenced this a couple times, this incredible achievement of keeping a project alive through Jack, and then, I have this note, Kayvon running the show then, and then Parag running Twitter, and then Elon, and then Linda taking over as CEO. Quite rare. Especially something this visible, this impactful to everything that X is. Any lessons or keys to this project surviving throughout so many org changes and leaders?
B
It definitely has been a crazy time to be building something. It's been fun. The craziness has been entertaining. I think one reason perhaps the product has done so well and survived is the product itself. The nature of the product itself. It is designed to produce information that is found helpful by people who normally disagree. And so even if you have CEOs or leaders who might disagree, there's a good chance actually they'll find it helpful. They'll be like, wow, this thing does produce pretty useful output. So I think there's something in the nature of the product itself that when people see it, whatever side they're on, left, right, up, down, they're likely to find it pretty helpful. So I do think that helps. I also think the team executed really well. We had ambitious goals that were exciting. They solved a real problem. This is a real problem that matters in the world. At every step, you know, as we talked about, the product needed to prove itself, and we would make sure it proved itself, and we would bring the results that convinced us and we'd share those with people. And so they would say, oh, yeah, I agree, it kind of proved itself. Let's take the next leap. We've done that all along the way and we continue to operate that way. And I think that focus on the outcome and goal that matters and executing against it really helps. The team did not get distracted by much. All through the period during which the acquisition happened, there was a lot of opportunity for distraction. This team was shipping every week. We were super focused on the goal. Let's make this thing work. Let's get these notes out there. And I think people saw that execution and were excited to support it.
A
It's working. Why would we mess with that? And it's important and it keeps us from having to hire tens of thousands of people to fact check.
B
An interesting thing about that is no one ever asked us or brought up or seemed to care about anything related to cost savings in this process. And I think that's an assumption people have outside the company, that this must have been a reason there was interest in it. But that was never a goal. It was not at all why the project was started. It was not why people were excited about the project. And I think that's also, for people outside who maybe don't see the conversations, kind of a heartening thing to know, that the focus was always on solving the problem. With the other approaches, even if you had 10,000 people doing it, the real issue is that they don't work that well, because they're not trusted or they don't scale or they're too slow. And so the goal was really always just, let's help people stay informed at scale. You know, let's build an Internet-scale solution to an Internet-scale problem, one that people like.
A
Something I heard about you, Keith, when I was asking people about how this worked and why this worked so well, is that they describe you as having a very low ego, and that allowed you to give up this whole team and power and influence, and even the name: forget it, whatever you want, we'll call it Community Notes, that's great. Is there anything in there you can share about how you think about that, and how important it is as a product leader to have a low ego?
B
You know, for me, I feel like I get to do community service with this project. I see my work as in service of the people in the community, and that's what motivates me. The only thing that I care about is delivering the outcome that the world finds helpful. And so in some ways the project has not been about ego or results. It's about truth seeking. Not truth in the sense of what information is true, but let's find out what's actually going to make this work. How does it need to be structured? What should it be called? Whatever is going to produce the best outcome is what we should do. And so I think I feel more attached to the product being helpful than to anything else. And so, to whatever degree it might seem like low ego, it's probably more a result of wanting to actually solve the problem.
A
And I think partly what I'm hearing is just, if you win and succeed, good things will happen. So focus on that.
B
Certainly satisfying things will happen. It's very satisfying to have people appreciate it. It's satisfying that people on the left and right love it. It's satisfying that even people who receive notes love notes and screenshot them and post them. That's amazing. It feels so good to have helped give people that. And yeah, it's very motivating. It's a great reason to wake up in the morning.
A
It's absurd that this has worked. But it's also like, of course this would work. Of course something like this should work. It's just so Internet.
B
It's of the Internet, you know, that's why it works.
A
Oh, man. Where's Community Notes going from here? What should people look for? What's happening? Where's it going? What's the future?
B
We're always working on basically more, better notes, faster. There's clearly opportunity to get more notes out there. We want them to stay as good as or better than they are. We want to get them there faster. So we're always working on core product changes to help deliver that. Recently, for example, we released an update to what we call the Community Notes bat signal, or the ability to request a Community Note. Anyone on X can say, hey, I think this post needs a Community Note. And now they can even add a source explaining why, so that when a prospective writer sees that, it's much easier for them to write a note. So we're always working on core things like that, and on core algorithm improvements. I think there are also new frontiers that show a lot of potential. AI and LLMs are one. It's easy to imagine a lot of ways that AI could assist people in this task of trying to get information out there quickly. And maybe Jay should talk about the Supernotes work that we've done with some folks outside the company.
C
Yeah, so, I mean, one cool thing about having public data and code is that external researchers can collaborate with you. And in this case, Supernotes is this idea that we can basically take existing notes as input: existing proposed notes that maybe have some problem, maybe they have only part of the story, maybe they're worded in a biased way. Basically take all these in, have an LLM generate a ton of different variants, and then make a simulated jury, basically a representative group of Community Notes contributors who would be rating the note, and try to predict, based on their past ratings, how they would rate these LLM-generated notes. And so this way, rather than just having an LLM write a note from scratch and hoping it's good, you can kind of simulate the entire Community Notes rating process and explicitly create notes that are likely to be rated helpful by people. So I think ideas like that are very promising for the future. And it's a nice way that LLMs and humans can work together. I think obviously agents can browse the web too. And one way that you could imagine agents assisting humans is maybe checking whether a note is actually supported by its source. Although then you get into things like, well, are people going to actually be as diligent? Right now I think raters are very diligent because they know just some Community Notes contributor wrote this, like, I better check this before I rate it helpful. But hopefully we can design things in a way such that people don't blindly trust the output and actually verify it themselves before issuing a helpful rating.
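[Editor's sketch] The simulated-jury idea described above can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the Supernotes implementation: the `slant` and `quality` scores, the viewpoint estimate, and the weights in `predict_helpful` are all hypothetical stand-ins for whatever a real system would learn from past ratings.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import exp


@dataclass(frozen=True)
class Candidate:
    text: str
    slant: float    # hypothetical score: -1.0 .. 1.0, where 0.0 is neutral wording
    quality: float  # hypothetical score: 0.0 .. 1.0 (sourcing, clarity)


@dataclass(frozen=True)
class Juror:
    # Past ratings as (slant of the rated note, 1 if rated helpful else 0).
    past_ratings: tuple[tuple[float, int], ...]


def estimate_viewpoint(juror: Juror) -> float:
    """Toy estimate: positive if the juror tends to favor positively slanted notes."""
    if not juror.past_ratings:
        return 0.0
    total = sum(slant * (2 * helpful - 1) for slant, helpful in juror.past_ratings)
    return total / len(juror.past_ratings)


def predict_helpful(juror: Juror, cand: Candidate) -> float:
    """Assumed logistic model of the chance this juror rates the candidate helpful."""
    score = 3.0 * cand.quality + 3.0 * estimate_viewpoint(juror) * cand.slant - 1.5
    return 1.0 / (1.0 + exp(-score))


def jury_score(cand: Candidate, jury: list[Juror]) -> float:
    """Score a candidate by its least convinced juror, so one-sided notes score low."""
    return min(predict_helpful(j, cand) for j in jury)


def pick_best(candidates: list[Candidate], jury: list[Juror]) -> Candidate:
    return max(candidates, key=lambda c: jury_score(c, jury))


# Two jurors whose past ratings point in opposite directions.
jury = [
    Juror(past_ratings=((-0.8, 1), (0.8, 0))),
    Juror(past_ratings=((0.8, 1), (-0.8, 0))),
]
variants = [
    Candidate("one-sided variant A", slant=0.9, quality=0.8),
    Candidate("one-sided variant B", slant=-0.9, quality=0.8),
    Candidate("neutral variant", slant=0.0, quality=0.7),
]
print(pick_best(variants, jury).text)
```

In this toy setup the neutral variant wins even though its assumed quality score is slightly lower, because both simulated jurors are predicted to find it helpful.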
A
Yeah, that is such an interesting area to explore, where you want to avoid AI-hallucinated slop but also make it easier and scale it even further. What an interesting challenge.
B
What's cool about this project, in addition to the AI element, is that it's being done outside the company. We talked earlier about the open source transparency. The key reason we made this all open source was so people could see how it worked. But the dream is actually not just that the contributions of notes and ratings are from the people, but that the product is built by the people. What if the scoring algorithm were significantly or entirely written by the public? That would be incredible. And Supernotes is probably the first very substantial potential change in the way the algorithm works that came from the outside and plausibly could be part of the core. So we'd love to see the product go in that direction as well.
A
Sweet. Go, Supernotes. Well, guys, the work you're doing is tremendous. I think this is every product person's dream, to work on something like this: small team, lots of support, lots of impact, just innately interesting. And so I think this is going to inspire a lot of people. So let me just ask you, is there anything else you wanted to share? Anything else you think might be helpful to leave folks with?
C
Sure. I guess one thing I thought was interesting over the course of working on this product is, in a similar way to how retweets originally were not something that Jack came up with, I think users just started doing it and then it became a core part of the product, there are already a lot of surprising things that people wanted to use Community Notes for that I don't think we really expected, and it's kind of cool to see those user desires emerge. One example: I guess we had always been imagining political types of misinformation, but for whatever reason, there are a lot of people who love debating whether Messi or Ronaldo got more goals. It's kind of a funny one. There's also a community moderation aspect, right? I think we also thought that this would be specifically for adding context to misleading or potentially misleading information. But what you can see is that there are some notes that go beyond that, toward calling out content that they think is spammy or something. So I think that's just another dimension in which Community Notes is a product that's driven by the people themselves.
A
That's so beautiful. Basically, they're trying to keep Twitter slash X healthy and they're just like, no, this should be taken down, this tweet is spam. Yeah, I love that. Is there an answer on the Messi versus, who was the other? Ronaldo. Okay. Is there a definitive fact there, or is that just unknowable?
C
Yeah, I guess that's an interesting one because it's a case where raters are actually very polarized. I guess it actually kind of fits into the core algorithm, where some people are just die-hard Messi fans or Ronaldo fans, just like they could be on politics. So we actually specifically model that topic, as well as some other topics, so we can estimate people's opinion on that particular debate. It's kind of funny that something like that would emerge.
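[Editor's sketch] To make the idea of requiring agreement across a polarized topic concrete, here is a toy check that a note is rated helpful by raters on every side of a debate. It is an illustration only and not the production scoring algorithm; the side labels, the `min_raters_per_side` count, and the `threshold` value are assumptions chosen for the example.

```python
from collections import defaultdict


def helpful_across_sides(ratings, stance, min_raters_per_side=2, threshold=0.7):
    """Return True only if every side of the topic mostly rated the note helpful.

    ratings: list of (rater_id, helpful) pairs for one note.
    stance:  dict mapping rater_id to an estimated side on the topic
             (hypothetical labels such as "messi" or "ronaldo").
    """
    votes_by_side = defaultdict(list)
    for rater_id, helpful in ratings:
        side = stance.get(rater_id)
        if side is not None:  # skip raters with no estimated stance
            votes_by_side[side].append(bool(helpful))

    sides = set(stance.values())
    if len(sides) < 2:
        return False  # no opposing sides, so no cross-side agreement to measure

    for side in sides:
        votes = votes_by_side.get(side, [])
        if len(votes) < min_raters_per_side:
            return False  # not enough evidence from this side
        if sum(votes) / len(votes) < threshold:
            return False  # this side did not find the note helpful
    return True


stance = {"m1": "messi", "m2": "messi", "r1": "ronaldo", "r2": "ronaldo"}
agreed = [("m1", True), ("m2", True), ("r1", True), ("r2", True)]
one_sided = [("m1", True), ("m2", True), ("r1", False), ("r2", False)]
print(helpful_across_sides(agreed, stance), helpful_across_sides(one_sided, stance))
```

A note that only one fan base likes fails the check, which mirrors the point made earlier in the conversation about looking for agreement from people who have disagreed in the past.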
A
But you're saying that's the most controversial topic on X? Ronaldo vs Messi.
C
That's a controversial one.
A
Oh, wow. Who knew? Okay. Keith, is there anything you wanted to add?
B
Yeah, you know, Community Notes is cool itself, but I think what it points to about society is actually even bigger. Society often feels really polarized. You hear people talk about it all the time, like no one can ever agree on anything. But actually, Community Notes shows you that people really can agree on quite a lot. Even on super controversial topics related to politics and everything, there's a lot of agreement. That's why notes work. And I think that's a really big reason for optimism about the world: while it might feel polarized, there's probably an 80% set of people that agree on quite a lot of things. And imagine if we could use the same kind of approaches we use with notes, but to find agreement on legislation or policies or things like that, that people want the government or the world to do. Possibly we could get a lot more momentum behind these ideas that the people really want, and everyone would be a lot happier. Maybe 10% of the people on the edges wouldn't be happy, but I bet there's a lot of agreement that we are not identifying, and if we did, we'd all be pretty happy. So I don't know. I think it's easy for people to feel pessimistic about the world, but I think this product is a good reason to be optimistic about the future.
A
What an incredible way to end it. I can also see, Keith, why people want to join you and work with you and work on this team.
B
Appreciate it. If you do want to join, we are hiring an ML engineer. You get to work on these amazing problems with us and have a lot of fun. So we're accepting applications at x.com/communitynotes.
A
Okay, great. I'm glad you gave the URL. Oh man, you're about to get flooded. Guys, thank you so much for doing this. Other than that place to go to join the team as an ML engineer, is there anywhere else you want to point people to, either your socials or anything else?
B
I'm @kcoleman on X. Please reach out if you have any feedback or want to help us out. Whether you want to work here or want to do something from the outside, I would love to talk.
C
Yeah, I'm @_jaybaxter_ on X. I think in particular, besides just using Community Notes, it would be great to get more substantial contributions, like pull requests, or collaboration on projects like Supernotes. I think that's the most exciting type of stuff, if people do want to contribute.
A
Ship some code, guys.
B
Yeah.
A
Amazing guys. Thank you so much for doing this.
B
Thanks for having us.
C
Thank you so much.
A
Bye everyone. Thank you so much for listening. If you found this valuable, you can subscribe to the show on Apple Podcasts, Spotify or your favorite podcast app. Also, please consider giving us a rating or leaving a review, as that really helps other listeners find the podcast. You can find all past episodes or learn more about the show at lennyspodcast.com. See you in the next episode.
Guests: Keith Coleman (VP of Product), Jay Baxter (ML Lead)
Host: Lenny Rachitsky
Date: February 27, 2025
This episode gives a deep dive into Community Notes, X’s (formerly Twitter) crowdsourced approach to combating misinformation online. Host Lenny Rachitsky interviews Keith Coleman and Jay Baxter—the key leaders behind the product—to uncover the product’s philosophy, algorithm, growth, operating principles, and its evolution under several company regimes. The discussion explores how Community Notes was conceived, how it actually works, the product and team’s survival through multiple leadership transitions, and tactical lessons for product leaders everywhere.
“We actually look for agreement from people who have disagreed in the past. And what we see is when people have that sort of surprising agreement, that’s what makes the notes so neutral, accurate, and well written.” — Jay Baxter [00:17]
“We want all of humanity to participate.... If we have all of humanity, we have the data to understand what notes will be helpful to actual humanity.” — Keith Coleman [00:35]
“Users should be able to make up their own minds. Here’s extra context. Take it or leave it.”—Jay Baxter [10:46]
“The notes just totally take the wind out of these stories... At 50 to 60% per generation, the virality quickly goes to zero.” — Keith Coleman [28:29]
“If I had stayed running a large PM team... I would have produced 16 more pages of OKRs. Building Community Notes has had way bigger impact.” — Keith Coleman [37:42]
“For me, this project...is community service... The only thing I care about is delivering the outcome the world finds helpful.” — Keith Coleman [96:19]
Community Notes stands as a testament to what small, principled, volunteer-driven teams can accomplish in combating misinformation at Internet scale. Its success rides on radical transparency, openness, belief in the wisdom of crowds, and relentless evidence-driven iteration. The approach is now influencing not just X, but the wider industry, and offers a new blueprint for teams who want to build trust, wield real impact, and create products the world needs—while loving the work they do.
“You don’t have to be a big, well-known person to shape the discourse and information flow in a way that’s helpful.” — Keith Coleman [24:49]
[Ad breaks, lengthy intro/outro, and non-content segments have been omitted.]