Mary Childs
This is Planet Money from NPR. Alexi Horowitz Ghazi. Mary Childs. Yes. You and I took a little trip up to scenic Montreal, one of the jewels of French Canada, for a little Planet Money mission.
Alexi Horowitz Ghazi
Yes, we did. And even though it's a little bit sad that that mission did not entail joining the maple harvest or, you know, like, infiltrating a poutine cartel (next time, dare I say, next time), it did have much bigger implications for anybody and everybody whose life is impacted by science, which I think is basically all of us.
Mary Childs
I think that's right.
Chishiya (Researcher)
Yeah.
Mary Childs
We were there to meet a guy named Abel Brodeur. Abel's this very energetic economics professor in his late 30s at the University of Ottawa. And we found him bounding around the halls of this modernist school building in downtown Montreal. He was getting ready to host an event. He's become sort of famous for something called the replication games.
Abel Brodeur
It's getting exciting now.
Alexi Horowitz Ghazi
How are you feeling?
Abel Brodeur
I'm feeling good. It's the beginning of the event. So this is the moment. I'm full of energy and full of enthusiasm. In seven hours from now, it's going to be a different conversation.
Alexi Horowitz Ghazi
Abel is going to be tired in seven hours because at a replication game, he is running around between 16 teams of three to five people in a kind of hackathon. People will work all day to replicate recently published social science papers, to reproduce the results and see if the findings hold up.
Mary Childs
Because ever since technology has made it easy to crunch data, we've been able to go back and check old research, and turns out it wasn't great. Rerunning an old study today, a lot of the time does not yield the same result. The research no longer proves its conclusion. And the same thing often happens when we reconduct whole experiments altogether. These problems have become known as the replication crisis.
Alexi Horowitz Ghazi
A lot of people across academia have been trying to fix this so we can trust research, so we can actually know what we know. And this event, the replication games, it's part of Abel's attempt to help solve this crisis.
Abel Brodeur
The idea is to change norms through monitoring. And just giving a small percentage, a small chance that we will monitor can massively change the behavior of everyone, you know, change the way they behave, change the way they code, change the way they do research. So that's the goal.
Mary Childs
After a few minutes, we head into a big lecture hall where Abel takes center stage.
Abel Brodeur
All right, folks, we're going to get started. Welcome to the Replication Games. Thanks for being here in Montreal with us. Let's get started. Today we have 16 papers that are being reproduced. Couple of small things around the room.
Alexi Horowitz Ghazi
Dozens of social scientists are gazing up at Abel, looking a little bit nervous. Most of them have come from across Canada, and most of them are first timers who now have to undergo this kind of awkward initiation.
Abel Brodeur
Right, I'm gonna put the music because I know you guys need, like, you know, a bit of motivation, but you need to do the body movement. Everybody has to do it. All right.
Alexi Horowitz Ghazi
Does it sound good?
Abel Brodeur
So we do it. I need you to do. It's pretty easy.
Mary Childs
Abel starts didactically clapping like an elder millennial camp counselor, and his audience joins in.
Abel Brodeur
Guys, thank you so much for being here. I hope you enjoy. This should be fun. And thanks, everyone.
Alexi Horowitz Ghazi
Hello, and welcome to Planet Money. I'm Alexi Horowitz Ghazi.
Mary Childs
And I'm Mary Childs. Over the past couple decades, the world of science has been stuck in an existential crisis over whether we know the things we think we know. It started in psychology, spread to medicine and economics. Now people across disciplines are trying to figure out how to solve it.
Alexi Horowitz Ghazi
Today on the show, the story of one economist. How he set out to learn what exactly has broken in the way social scientists create new knowledge. And how he came up with his own daring and kind of wacky way to help fix it: by building an internationally crowdsourced surveillance system to keep social scientists honest.
Alexi Horowitz Ghazi
Okay, so the replication crisis has been a pretty big deal for almost 20 years at this point. We've covered it on Planet Money before. The story of how economist Abel Brodeur first encountered the problem and why he set out to help fix it begins back in 2011.
Mary Childs
Abel was getting his master's in economics, and he was writing a paper on whether smoking bans in restaurants and workplaces actually made people smoke less. He collected this huge data set.
Abel Brodeur
I had, like, amazing data from the cdc, which is public. I had smoking prevalence at the county level.
Alexi Horowitz Ghazi
Abel says that all the established research at the time indicated that smoking bans were hugely effective, that they'd gotten lots of people to stop smoking.
Mary Childs
But when Abel crunched his numbers,
Abel Brodeur
I was finding absolutely no effect. None. There was, like, nobody to stop smoking. I've played with the data for six months, and I find nothing.
Alexi Horowitz Ghazi
And Abel was trying to make a name for himself in academia, which means getting his research published in an academic journal. And it's harder to get published if you find no effect, especially given that the existing literature did show an effect. So what Abel needed was something statistically significant.
Mary Childs
For the statistically uninitiated, significant means the result would be produced by chance less than 5% of the time. So the probability that the result is just random is 5% or less. That is the cutoff for whether your findings count or not.
Abel Brodeur
There's this 95%, 5% cutoff that really matters. We're obsessed with these thresholds.
Alexi Horowitz Ghazi
So Abel kept tinkering with his dataset, changing his computer code to contort the data one way and then another, until eventually, one day, he found a way to analyze one subset of his data that gave him what he'd been looking for. A result demonstrating that smoking bans had decreased smoking and a result that was significant.
Abel Brodeur
It was like, there you go. I was so happy I was in the library. I just yelled like, significant. I was so happy.
Mary Childs
Finding a significant result meant that if his paper was published, he would get to put a little asterisk or star next to his results. And the more statistically significant the result, the more stars you got to claim.
Alexi Horowitz Ghazi
But Abel's happiness did not last long, because the more he thought about how he'd gotten that significant result, the more it started to seem like it was working against the whole goal of social science, you know, to actually discover true new knowledge about human behavior.
Mary Childs
For example, policymakers need to know whether smoking bans work to make sound policy decisions. But here he was, torturing the data to match the preconceived hypothesis. He thought, this is so stupid.
Abel Brodeur
What am I doing? I'm writing a piece saying that smoking bans are decreasing smoking prevalence because I managed to find one that worked. I was like, this is dumb. I'm doing something wrong.
Alexi Horowitz Ghazi
Abel ultimately decided not to use his tortured results. He wrote up his paper showing that he'd found no effect, even if it meant his paper was less exciting. And at first he thought what he'd done to his data might have just been a one off mistake on his part.
Abel Brodeur
But then you start talking to other students and people were like, oh yeah, that's how you publish.
Alexi Horowitz Ghazi
Abel started to see that this was a problem of incentives. In order to advance their careers, academics have to publish papers in peer reviewed journals. And the journals want to publish work that's statistically significant and novel. These papers can win big prizes and define new research agendas for decades.
Mary Childs
But because of all that, people were doing what he had done, trimming and squeezing and coaxing the data towards significant results. And that can easily cross over into a kind of data manipulation called p-hacking. P as in probability. And Abel says it can happen almost subconsciously.
Abel Brodeur
Because the project took, like, three, four years of back and forth between co-authors, discussion. Then six months later you go back, you exclude again these other people, you do something different, and then over time, all these decisions, actually, when you look at it from the outside, it's like, this is crazy what you've done.
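The accumulation of small analysis decisions described here can be put in rough numbers. What follows is a minimal sketch, assuming each re-analysis is an independent test of an effect that is truly zero, judged at the 5% threshold. That independence is a simplification, since real re-analyses of one data set are correlated. The function name and the attempt counts are illustrative and do not come from the episode.

```python
def chance_of_false_positive(attempts: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `attempts` independent tests
    of a true null effect falls below the `alpha` threshold."""
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    return 1.0 - (1.0 - alpha) ** attempts


if __name__ == "__main__":
    # Hypothetical numbers of specifications a researcher might try.
    for attempts in (1, 5, 20, 60):
        share = chance_of_false_positive(attempts)
        print(f"{attempts:>3} analyses tried -> {share:.0%} chance of a spurious star")
```

Under these assumptions, a single analysis has a 5% chance of a spurious star, while twenty attempts push that chance to roughly 64%.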
Alexi Horowitz Ghazi
To figure out how widespread this problem might be, Abel decided to research the research. He and a couple of his colleagues scraped the significance data from a bunch of the top academic journals: the distribution of stars that published researchers had racked up. And when they looked at the distribution, they found a noticeable hump just above that 5% significance threshold. Now, some of this could be because some people whose research only hit 6% didn't bother submitting. But it could also be because some researchers were tweaking their data analysis to just barely get results that would be more likely to get published.
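One simple way to look for a hump like this is to compare how many test statistics land just below the threshold with how many land just above it, an approach sometimes called a caliper test. The sketch below is an illustration under stated assumptions, not the method of Abel's paper. It assumes two-sided z-statistics, where 1.96 corresponds to the 5% threshold, and it assumes that without bunching a narrow window on either side would hold roughly equal counts. The window width, the function names, and the sample values are made up.

```python
from math import comb


def caliper_counts(z_stats, threshold=1.96, width=0.2):
    """Count test statistics in equal-width windows just below
    and just above `threshold` (absolute values are used)."""
    below = sum(1 for z in z_stats if threshold - width <= abs(z) < threshold)
    above = sum(1 for z in z_stats if threshold <= abs(z) < threshold + width)
    return below, above


def binomial_upper_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


if __name__ == "__main__":
    # Made-up z-statistics, for illustration only.
    z_stats = [1.80, 1.90, 1.97, 2.00, 2.10, 2.15, 3.00]
    below, above = caliper_counts(z_stats)
    tail = binomial_upper_tail(above, below + above)
    print(f"just below: {below}, just above: {above}, one-sided tail: {tail:.3f}")
```

A small tail probability would suggest more results sit just above the threshold than an even split would produce. As the episode notes, that pattern alone cannot separate selective submission from tweaked analyses.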
Mary Childs
But when Abel and his colleagues started submitting the research for publication, they got a resounding series of nos. Academic publishing seemed hesitant to open up an empirical reckoning. After a few years, they did manage to publish their paper in 2016. They called it Star Wars: The Empirics Strike Back. Do you get it? Oh, you definitely get it. Thank you, Alexi. So Abel puts aside this whole idea of an empirical reckoning, and he moves on to other economic projects. He gets tenure, and eventually he learns that his little paper has become kind of a sleeper hit.
Abel Brodeur
It took a long time before I realized actually the paper was, like, well known, before people started talking to me at conferences, like, are you the Star Wars guy? That's a moment, like, I needed someone senior to tell me, like, no, this is really important, what you're doing.
Alexi Horowitz Ghazi
There had been efforts to solve parts of the replication crisis. Some of the top journals had started asking their contributors to release replication packages with their papers. That's basically the data and code they'd used to find their results. And researchers were also starting to preregister their hypotheses before actually doing the research, so that if the data didn't support it, they couldn't futz around and pretend like they'd been looking for something else all along.
Mary Childs
For his part, Abel wondered if there was anything he could do, like not just study the problem, but actually help fix it.
Abel Brodeur
How do I change the incentives? How do I potentially have an impact on the norms, how people do research? The second I think about the norms, I think about, oh, it needs to be large scale. Nobody's gonna change their behavior if it's a small scale thing. So it needs to be big.
Alexi Horowitz Ghazi
Journals do have peer review systems where they try to poke holes in research, but they didn't always totally get under the hood to scrutinize all the code and data. So researchers weren't necessarily worried that their stuff would get checked.
Abel Brodeur
A nice analogy, I think, is: imagine you go on a date. You might shave, might take care of your body, you might take care of yourself. A bit of deodorant, you know, perfume, maybe, if it's your thing. You're gonna make an effort to look prettier than you are usually. The other person fully understands that this is a nice version of you. We're fully aware of that, but I don't know by how much, and perhaps it's not. Or maybe you made a massive effort and usually you're a disaster. You never clean nothing. So when, you know, you go to the apartment, it's like, oh, my goodness, this is your apartment? So research is a bit like this.
Mary Childs
The published research is the cleaned up version.
Abel Brodeur
So when I see a published paper, I know it's been, you know, it's beautiful, it looks nice, but there's an information asymmetry. I don't know how dirty it is, actually.
Alexi Horowitz Ghazi
Abel thought one thing that might help this problem was to make researchers care as much about the cleanliness of their data analysis as the significance of their results. And to do that, he'd have to go full-on Room Raiders on people's published papers, to shine a fluorescent spotlight on the back rooms of their research. If you could take all of the data that somebody had gathered for a given paper and meticulously retrace their coding steps, you could see if it was possible to replicate their findings. You could make sure there weren't any errors, conscious or unconscious, in what they'd done.
Mary Childs
But first he'd have to get the code. People weren't in the habit then of publishing all their data and code. And when he emailed researchers asking, nobody responded. So he decided to create an official seeming institution.
Abel Brodeur
It needs to be a big institution with a website with tons of famous people on it. And when you send the email, people will be like, what the hell is this thing? I need to respond. It's legit.
Mary Childs
So in 2022, he creates a website for a thing he starts calling the Institute for Replication.
Abel Brodeur
A friend of mine, his wife did the logo for free, like a design, like, you know, I mean, like just bare bones.
Alexi Horowitz Ghazi
He recruits some serious famous economists for the board to put on his legit-looking website. And pretty soon he does start to get responses to some of his emails. He's able to get some data sets and coding packages. And he convinces some colleagues and junior researchers to start doing some replications one by one in exchange for a co-author credit on one big paper.
Mary Childs
So Abel can get the data and the code. But there's still a second problem, which was the question of scale. Replicating one paper at a time was not going to do much to change the system. What he needed was to create the sense within the academic community that anybody's work could be checked at any time.
Alexi Horowitz Ghazi
It's like an IRS for the ivory tower.
Abel Brodeur
So now I thought, okay, we need to mass reproduce journals. So then I was, okay, I need to get maybe a few hundred replications or reproductions per year. So now I'm thinking, how do you do that?
Alexi Horowitz Ghazi
The answer, Abel says, came to him kind of by accident. Around the time he got his Potemkin website up and running, he got an unrelated invitation to a conference in Oslo, to a couple of seminars. He was planning his trip about a month ahead of time, and he noticed that he had seminars on a Wednesday and on a Friday.
Abel Brodeur
And I thought, like, what the hell am I going to do on Thursday? Like, I've never been to Oslo. I'm sure it's pretty and nice, but a full day, like, I'm gonna walk around and then I'm gonna have like six, eight hours just to relax. So I just emailed the person who invited me and I said, could we just, like, do a small workshop?
Mary Childs
It would just be like 10, maybe 15 people. Abel posted about it on social media.
Abel Brodeur
You can come to Oslo. It should be fun. If you come, you're gonna get co authorship to a meta paper, we're gonna reproduce papers. Let's have fun. And then, I don't know, like 70, 80 people ended up registering really fast. I closed registration because I have no money, we don't have food. I didn't tell the guy it would be 80, I said it would be 10.
Alexi Horowitz Ghazi
So Abel is sitting there a couple months before the conference with this sudden, unexpected surge of interest and no plan.
Abel Brodeur
I have 80 people, some coming from Ireland, others coming from Sweden, others coming from France. What do I do with these people?
Mary Childs
He starts collecting papers that people could replicate. And he puts everyone into teams by their field. Health economics, development economics.
Abel Brodeur
The first time, I had no idea what was going on. I was super stressed.
Alexi Horowitz Ghazi
He had no idea what was going to happen, what they would find. Abel heads to Oslo and convenes the first ever replication game in October of 2022. And when he checks in on one of the first teams of replicators working on the first paper,
Abel Brodeur
I go talk to them and they're like, Abel, there's a problem. Like, there's tons of duplicates. I'm like, what? He's like, yeah, one of the data sets. Like, there's tons of people with the same age. And then I come back later on and it's like, okay, 75% of one data set, everybody's 62 years old, all women, all living in the same village, all doing the same thing. It's duplicates, and it's a paper about inequality. If everybody is the same, there is no inequality. And that was driving some of the mechanism.
Mary Childs
The underlying data upon which this entire paper rested had been merged improperly, like a big copy and paste error. To Abel, this was disconcerting.
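A duplication problem like the one described can be surfaced with a very small check. The sketch below is only an illustration: it assumes the data set can be read as a list of rows, and the columns (age, sex, village) and the values are made up to echo the story rather than taken from the paper. It reports the share of rows that exactly repeat an earlier row.

```python
from collections import Counter


def duplicate_share(rows):
    """Share of rows that are exact repeats of another row.

    The first copy of each distinct row is not counted as a repeat.
    """
    rows = [tuple(row) for row in rows]
    if not rows:
        return 0.0
    counts = Counter(rows)
    repeats = sum(n - 1 for n in counts.values())
    return repeats / len(rows)


if __name__ == "__main__":
    # Hypothetical rows: (age, sex, village).
    rows = [(62, "F", "A"), (62, "F", "A"), (62, "F", "A"), (40, "M", "B")]
    print(f"{duplicate_share(rows):.0%} of rows repeat an earlier row")
```

A high share is only a flag for a closer look. Whether repeated rows are a merge error or legitimate identical records still has to be judged against how the data were collected.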
Abel Brodeur
And I was like, oh boy, that's the first paper. That's the first game. What did I create? It's going to be like this all the time? People finding crazy mistakes? And did I just open a can of worms, that actually most papers are just, like, terrible, full of crazy coding errors?
Alexi Horowitz Ghazi
Abel was a little afraid he might be about to discover that all papers were full of worms and that science wasn't real.
Abel Brodeur
But luckily by the end of the day, like many teams had like good day, everything was clean and so on. And I was like, it's like not terrible.
Mary Childs
He could relax. It turns out most of the papers were not terrible. And even better, with that first event in Oslo, Abel had found a way to crowdsource this massive academic auditing project, essentially for free. If he could host enough replication games every year, he just might be able to scare the social sciences into acting right.
Alexi Horowitz Ghazi
But what actually happens on the ground during these things? After the break, we enter the 51st replication game.
Mary Childs
So we are at a replication game in real life in Montreal. Abel Brodeur says that the game part is a little bit of a branding exercise. There are no winners or prizes. It's more like an all-day hackathon.
Alexi Horowitz Ghazi
The teams are mostly economists, with a few groups of psychologists, and they've already chosen the papers they'll focus on. Using just what they have in the replication package, they will have seven hours to check the code, examine the decisions their papers' authors made, and see if the results reproduce. And then they'll report on whatever they find, so it'll be out there on the record, whether that's a nothing burger or a bombshell.
Mary Childs
After everyone claps their rendition of We Will Replicate You, the researchers start streaming out of the lecture hall, and we run after them.
Alexi Horowitz Ghazi
Jolene, could I talk to you for a sec? Hey, I'm Alexi.
Jolene Hunt
Oh, hi.
Alexi Horowitz Ghazi
Just set the scene for me, like.
Jolene Hunt
So we just finished clapping a cheesy opening song, and we're about to split up into rooms.
Mary Childs
The groups are scattering into classrooms across the building to start digging into their papers. Economics PhD student Jolene Hunt and her team are looking at a paper about education. They're all education economists, and so Jolene has sort of a pedagogical view of the day.
Jolene Hunt
In PhDs, we often don't get a chance to actually work together. We're usually just kind of on your own in your silo. And then, like, you talk to each other when you're having problems. But it'll be nice to actually work together and see if my friends are actually any good at their jobs.
Mary Childs
Rolling up their sleeves, getting down to the actual coding. Because they're only going to have seven hours, each group has a little list of the things they've decided they're going to try to get through today. There's one group led by a guy named Thibault Dupre, who is sitting alert and ready to unpack a paper about pensions in different countries.
Thibault Dupre
Essentially, the paper focuses on 10-something countries, but then the data set seems to have a few more countries in there. So why some countries were included, others were not? What if you drop a few countries out of the data set? Maybe there's something to be explored there.
Alexi Horowitz Ghazi
And we wanted to understand the stakes for the day, you know, why people would attend this event to do a full day of, like, manual economic labor for no dollars.
Mary Childs
So we asked them, what are you doing here today?
Abel Brodeur
Well, we're trying to see if we're.
Mary Childs
If we can replicate the results from
Alexi Horowitz Ghazi
a paper that took a look into
Abel Brodeur
the effects of negotiations.
Mary Childs
I've started with a group in the lecture hall huddled around their laptops. Frael La Sued is a researcher at the University of Saskatchewan, and she's in a group of economists focused on agriculture with Xi Jia Wu from the University of Ottawa. You want to find that the paper checks out? Yes.
Chishiya (Researcher)
You can think like that? Yeah.
Abel Brodeur
Okay.
Mary Childs
In terms of your personal incentives, would it be cooler to find like, oh, no, this paper's messed up.
Alexi Horowitz Ghazi
Frael starts laughing, seemingly at the premise of the question.
Mary Childs
You're laughing so hard.
Chishiya (Researcher)
Why?
Alexi Horowitz Ghazi
That's mean.
Mary Childs
I don't know, like, how to answer it.
Abel Brodeur
It would be bad for Diego and Juan here.
Alexi Horowitz Ghazi
Those are the authors of the paper.
Mary Childs
Do you know them? No. You just have Sympathy for them. Yeah, because we've all been in their shoes. Okay, fair. But we go up to another group and they're kind of like, duh, yeah,
Felix Fosu
we are trying to find something.
Mary Childs
That's Felix Fosu, a postdoc at Queen's University. His group is digging into a paper about cartels in Mexico. I tell him what the other researchers said, that maybe it isn't very nice to want to find something terribly wrong in someone else's research. But it seems like to Felix, I have now misunderstood things in the opposite direction.
Felix Fosu
No, we definitely want to find something.
Chishiya (Researcher)
Why?
Felix Fosu
I think replication is something that we have to take as very important in economics. We need to make sure that our results are indeed claiming what they claim to be. We need to know what works and what does not work.
Alexi Horowitz Ghazi
Now, regardless of their specific goals, the actual work of replication is divided into two main phases. Phase one is the same for everyone, pure and simple replication. They will all check the paper's code, the programmed instructions that take some raw data and put it into a bunch of tables that comprise the foundations for the paper's conclusions.
Mary Childs
So now each team takes the original code, copies and pastes it, and basically hits enter to see if it runs.
Alexi Horowitz Ghazi
And one type of mistake that they might find is if the code is really broken, they might find that when they push the button, the code just doesn't run. The computer just says error.
Mary Childs
Or another kind of mistake they might find: maybe the code runs great, but it spits out a different answer than what the authors wrote. Not so great. Or maybe the raw data is messed up in some way. Like cells merged or transposed or erased, or accidentally filled down a whole column.
Alexi Horowitz Ghazi
So we ask the agriculture team to show us exactly what they are doing.
Mary Childs
So I can't code. I don't know what I'm looking at. What am I looking at?
Chishiya (Researcher)
Well, actually it's kind of nothing here because I just started.
Alexi Horowitz Ghazi
This is Chishiya again. The paper her team picked by Diego and Juan Pablo is about the price of eggs at big firms versus small firms. How much pricing control they have.
Mary Childs
I look at her laptop over her shoulder.
Chishiya (Researcher)
So what you can see here is already the variables they have. We have the firms, we have the price. We have day, month and year.
Mary Childs
Now Chishiya pulls out her iPad to scroll through the published paper.
Chishiya (Researcher)
So we're going to firstly check whether we can perfectly reproduce all the numbers using the original data and codes. If I can run part of this, maybe you can see.
Mary Childs
Okay, she's pushing a little blue arrow, a little play Button.
Chishiya (Researcher)
So basically, if I run this code, you'll see the results.
Mary Childs
Oh, a little box appeared in a different window.
Chishiya (Researcher)
Yes. So if you check the numbers,
Mary Childs
Minus 18.11432. And I'm looking at the published version, it says minus 18.114, star, star, star.
Chishiya (Researcher)
So they're basically exactly the same.
Mary Childs
It's the same.
Chishiya (Researcher)
Yeah, it's the same. That's good. You know.
Mary Childs
So we have a win.
Chishiya (Researcher)
Yeah. Yes, one. And we have more to check.
Mary Childs
A lot more, but we got one. That's great.
Alexi Horowitz Ghazi
Chishiya will keep plugging in all the data and checking the results, though so far it looks like the paper is checking out.
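The comparison made by eye here, minus 18.11432 on screen against minus 18.114 with three stars in print, can be repeated across a whole table with a small helper. This is a sketch under assumptions: the published figure is taken as a string that may carry trailing significance stars, and the journal is assumed to round half up, which the episode does not state. The function name is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def matches_published(reproduced: float, published: str) -> bool:
    """True if `reproduced`, rounded to the number of decimals shown in
    `published`, equals the published figure. Trailing stars are ignored."""
    target = Decimal(published.strip().rstrip("*").strip())
    # quantize() rounds to the same number of decimal places as `target`.
    rounded = Decimal(str(reproduced)).quantize(target, rounding=ROUND_HALF_UP)
    return rounded == target


if __name__ == "__main__":
    # The two figures read out in the episode.
    print(matches_published(-18.11432, "-18.114***"))
```

Comparing at the printed precision avoids false alarms from digits the paper never reported, while a mismatch in any reported digit still fails the check.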
Mary Childs
And if the paper passes the whole first phase, if the code does spit out all the answers that the author said it would, then the replicators move on to phase two, robustness checks.
Chishiya (Researcher)
For robustness check, we kind of, like, change some parts of the model to see whether the original conclusion still kind of makes sense.
Mary Childs
This phase is less objective and requires more context and thought. It requires the economist to consider the questions that the paper authors didn't think of or didn't write about, the decisions the authors made and the decisions they could have made but didn't. It's like trying to see the negative space in and around the paper.
Alexi Horowitz Ghazi
The kind of things they might find in this phase: you know, did the authors say that this data set represents something it doesn't? Did they use an appropriate data set, and did they use that data in a way that made sense? Did they include or exclude certain specifications or factors in order to have a result that looked exciting?
Mary Childs
There are infinite potential choices that researchers make or don't make, and the replicators have such limited time, so they're not going to be able to consider and analyze everything. They're just going to get through as much as they can.
Alexi Horowitz Ghazi
And as the hours start to tick by, it becomes clear that most teams are not turning up major issues. Until, mid afternoon, we check in with this one group looking at a paper about government policies.
Simon Prevo
The basic premise is when people trust the government, do they tend to comply with policy more?
Alexi Horowitz Ghazi
This is Simon Prevo. He's an econ master's student and a public sector researcher. The paper found that when people trust in government, they comply with policies more readily. So those policies cost the government less money. And Simon and his teammates are now trying to unravel a mystery, because when they went to look at the raw data that underlies the paper's findings, it looked a little funny. This is Scott Morier, another econ master's student on the team.
Scott Morier
There was a folder called Raw for the raw data, but the files were all labeled clean. So we were a bit, like, confused. It was counterintuitive, right? So Florian downloaded the data straight from the source and followed the instructions to create the one data set.
Mary Childs
They recreated what should be the same data set. Following the instructions that the authors left, they ran the code.
Scott Morier
And then that's when we started getting the errors, because variables were missing. And then as we kept going through, we kept finding more variables that were being used in the regression but weren't necessarily included in the, in what is supposedly meant to be the raw data set.
Alexi Horowitz Ghazi
Some variables are missing from the raw data set. The authors seem to have used data in their analysis that they did not account for. Not good.
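The mismatch the team ran into, where the analysis code refers to variables that the rebuilt data set does not contain, comes down to a set comparison. The sketch below assumes the variable names used by the code and the column names of the rebuilt data set are already available as lists. How to extract them depends on the software the authors used, and every name shown is invented for illustration.

```python
def missing_variables(used_in_analysis, dataset_columns):
    """Variables the analysis refers to that the data set does not
    contain, returned in sorted order."""
    available = set(dataset_columns)
    return sorted(name for name in set(used_in_analysis) if name not in available)


if __name__ == "__main__":
    # Hypothetical variable names, not taken from the paper.
    used = ["trust_index", "compliance", "household_income"]
    columns = ["trust_index", "compliance", "region"]
    print(missing_variables(used, columns))
```

An empty list means every variable the code needs can be traced back to the rebuilt data. Anything else is a concrete question to raise with the authors.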
Mary Childs
And then we visited the group looking at that paper about cartel behavior in Mexico. That group has found something too.
Felix Fosu
So in this paper, they look at the presence of different cartels.
Alexi Horowitz Ghazi
They tell us the paper looks at 20 cartels and data about what types of crimes were happening and when to see if cartels changed the types of crime they did after the government ramped up a big war on drugs.
Felix Fosu
What we found so far is that if you exclude one of the cartels, then the results become insignificant.
Mary Childs
So it's just the one cartel making
Felix Fosu
the results with cut up, making the results.
Alexi Horowitz Ghazi
So if you remove only one, then the results collapse.
Abel Brodeur
Right.
Mary Childs
Oh, no. You found something.
Alexi Horowitz Ghazi
Yeah, they found something in the first test they tried.
Mary Childs
Is that luck? Would you call that luck?
Felix Fosu
No, I think it's something that we, we thought about it. That's why we placed it number one on the list. We thought it's a good place to search. So partly luck, but partly because we thought about it carefully.
Mary Childs
That sounds like not lucky. They're going to keep investigating. And depending on what they find, this paper is maybe not passing this phase, the robustness check phase. Can you draw a big sweeping conclusion about the effectiveness of a war on drugs from a change in just one cartel? They suspect this paper will not hold up.
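The check the cartel team describes is commonly called a leave-one-out robustness test: re-estimate the result with each group dropped in turn and see whether it survives. A minimal sketch in Python, using made-up numbers rather than the paper's data, a simple one-variable regression rather than the authors' specification, and |t| > 2 as a rough rule of thumb for significance. All names here are hypothetical.

```python
import math


def ols_slope_t(xs, ys):
    """Return (slope, t_statistic) for a simple regression of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / std_err


def leave_one_group_out(rows):
    """rows: list of (group, x, y). Re-estimate with each group dropped in turn."""
    results = {}
    for dropped in sorted({g for g, _, _ in rows}):
        kept = [(x, y) for g, x, y in rows if g != dropped]
        xs, ys = zip(*kept)
        results[dropped] = ols_slope_t(xs, ys)
    return results


# Toy data: x = 0 before a policy, x = 1 after. Only group "A" changes.
toy = {
    "A": ([1.0, 1.1, 0.9, 1.0], [5.0, 5.2, 4.8, 5.0]),
    "B": ([1.0, 1.2, 0.8, 1.0], [1.1, 0.9, 1.0, 1.0]),
    "C": ([1.1, 0.9, 1.0, 1.2], [0.9, 1.1, 1.2, 1.0]),
}
rows = []
for group, (before, after) in toy.items():
    rows += [(group, 0, y) for y in before] + [(group, 1, y) for y in after]

full_slope, full_t = ols_slope_t([x for _, x, _ in rows], [y for _, _, y in rows])
results = leave_one_group_out(rows)

print(f"all groups: slope={full_slope:.2f}, t={full_t:.2f}")
for dropped, (slope, t) in results.items():
    print(f"without {dropped}: slope={slope:.2f}, t={t:.2f}")
```

With these toy numbers, the full-sample estimate clears the threshold, but dropping group A leaves no effect at all, while dropping B or C does not change the conclusion. That is the pattern the team reports: one group carrying the whole result.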
Alexi Horowitz Ghazi
Over lunch, the cartel team starts puzzling through, like, how does this sort of thing even happen? You have to be honest for sure. When you do these kind of papers, you do these kind of things, right? You check whether, when you have these, you know, you do these type of robustness checks. David Benatia, a professor on the team, says this is a robustness check that he would have tried if he had been the author. At the end of the day, our researchers limp back into the auditorium to present what they'd all found.
Abel Brodeur
So the way we like to finish is to give each team about one minute to tell us how your day went, the different challenges you faced. Maybe we can start from the beginning, move around.
Mary Childs
We didn't find anything too major. There was a lot of missing variables and attrition.
Abel Brodeur
So we went, well, like, all the code ran, but everything ran fine. We tried to poke holes in it, but we couldn't really do it.
Alexi Horowitz Ghazi
For the 71 replicators in the Montreal game, 14 teams got to uphold science by double-checking some published work. They spent a day coding with their friends and peers, learned some new coding hacks and new ways to make choices in research, and they'll get a little authorship credit on a meta paper in a real journal.
Mary Childs
The other two teams, the group who discovered the missing numbers, the cartels group, they've gotten like a toxic golden ticket. Now they'll get to write their report, polite and formal, but nonetheless kind of a bombshell, saying just how flawed the research is.
Alexi Horowitz Ghazi
Maybe that makes a splash and everyone thinks they're brilliant. Or maybe it makes a splash and everyone hates them.
Mary Childs
Next, Abel will write an email to the authors, a somewhat standardized note, saying, hey, here's who we are and what we do. We found some mistakes in your paper. Would you like to respond? He does not assume nefarious intentions, and the authors get an opportunity to try to fix the problem and prepare their formal response before anything goes public.
Alexi Horowitz Ghazi
And because Abel handles it from his position at the Institute for Replication, it doesn't feel so personal and the replicators have a little bit of insulation.
Mary Childs
We asked Felix from the cartels group what this might mean for him as a more junior person, a person earlier in his career. It's kind of throwing rocks towards the top of the profession. He'd wanted to find something and now he has.
Felix Fosu
I think it's a good work that we are doing, but what the implications are, I don't know. Yeah, I will find out.
Alexi Horowitz Ghazi
So after a few months, Abel sends his neutral-toned, official email to the authors of the paper that Felix and his team had replicated in Montreal, saying that the code had worked, but that they found the results don't hold up.
Mary Childs
And for the authors of that paper, getting that email...
Giacomo Battiston
When we opened that email, we were actually happy because we actually read, "Your paper replicates."
Mary Childs
This is Giacomo Battiston, a researcher at the Rockwool Foundation in Berlin and one of the four co-authors of the paper. He says they were thrilled to have their coding results publicly validated.
Alexi Horowitz Ghazi
And when it came to the bigger problem, the fact that their results had fallen apart when their replicators removed that one cartel...
Giacomo Battiston
We were not particularly worried about the content because it was kind of self-evident that this was not really challenging...
Mary Childs
Not really challenging their findings, because they think the replicators misunderstood the basic hypothesis of their study. They say they started with this idea that there was this one big new cartel in Mexico, Los Zetas, and it had been doing a lot of crimes, generating a lot of data points. Here's another author, Marco Le Moglie, a researcher at Bocconi University in Milan.
Marco Le Moglie
When we start to think about this project, actually had in mind the specific character of Los Zetas.
Alexi Horowitz Ghazi
They say they set out to investigate if the cartel Los Zetas had changed the types of crimes they did after the war on drugs, and their paper succeeded at proving that.
Mary Childs
What the Montreal replicators did, in the opinion of the paper authors, was to remove the main part of the data set and then say the conclusion was broken. You can do that, but why would you?
Paolo Pinotti
To be blunt, it doesn't make any sense.
Alexi Horowitz Ghazi
That is Paolo Pinotti, a professor also at Bocconi University. He said it was like doing a study on the effect of spreadsheets on productivity and then saying, oh, but the results don't hold up if you exclude Microsoft Excel.
Mary Childs
We looked at their paper and, to be fair to the replicators, the original paper does not say explicitly, hey, it's just Los Zetas we're focusing on. The data from Los Zetas is lumped in with several other new cartels. So if the paper authors meant to study the behavior of just Los Zetas, that was never quite spelled out.
Alexi Horowitz Ghazi
Mary, when we first rocked up to the replication games back in May, I think we were both excited at the idea that we might watch some junior economists uncover some major problem with a published paper in real time. But Abel had a different take when we asked him about the problems that the teams there had uncovered, like the team, for example, that had found issues in the government trust paper. That seems like success.
Abel Brodeur
Success? It depends what you define as success.
Alexi Horowitz Ghazi
Well, the process working as it's supposed to.
Abel Brodeur
I mean, in a world in which science works, I think this should have been picked up before it's published, cited and disseminated. So I don't think it's a success.
Alexi Horowitz Ghazi
That's fair.
Mary Childs
These papers they're replicating have been published, meaning they got past journal referees, professional economists who are supposed to be gatekeeping the quality of what they publish. Some of the top journals do check that the code runs; they press play. But in the government trust case, the journal referees apparently didn't catch that numbers were missing, that when the paper said, oh, the documentation is in the replication package, it was pointing to nothing.
Alexi Horowitz Ghazi
The journal declined to comment, though they said they have a robust process to investigate concerns.
Abel Brodeur
To me, this is a failure of the system, which is fine. There's always going to be failures. I just think that the rate of failures is higher than what a lot of people think and it shouldn't happen that often.
Alexi Horowitz Ghazi
In every replication game so far, they have found something, though not yet any career-ending fraud. It's more like major data or coding errors or robustness fails.
Mary Childs
So the broader system is still broken, even after putting on more than 50 games and replicating about 300 papers.
Alexi Horowitz Ghazi
Still, there are signs that the games are having an effect. Several replication gamers told us their experience here will change how they do their research because they know that their papers too might someday end up under Abel's spotlight.
Mary Childs
Abel says the more games he can put on, the more the rest of the academic world will start to shift, because the evidence shows that people don't actually change their behavior based on the severity of the potential punishment, like losing their job or public shaming or whatever. They change behavior based on the odds of enforcement, the odds of actually getting caught. Just the idea that someone might walk through their apartment one day, that's enough of a threat to keep it clean. Hey listeners, what are you doing on the evening of Monday, April 6th? Are you free? Because if you are, I think you should come to the 92nd Street Y to hang out with me and some of my friends. It is the first stop on our 12-city book tour to celebrate the publication of our first-ever book, Planet Money: A Guide to the Economic Forces That Shape Your Life. Every stop on this tour will be unique, with different hosts and guests, and if you get a ticket, you can get a tour-exclusive tote bag with your purchase while supplies last. So at the 92nd Street Y on Monday, April 6, it'll be me, Amanda Aronczyk, Darian Woods, book author Alex Mayyasi, and the economist Emily Oster, who is most famous, I think, for letting pregnant women know that they can actually drink coffee. So please come and bring your very best economic questions for us. We can't wait to hang out. Find the show nearest you at the link in the show notes or go to planetmoneybook.com. And thank you.
Alexi Horowitz Ghazi
If you want to hear more about the replication crisis, we've done a few episodes about it and the efforts to fix it. We'll link to those in the show notes.
Mary Childs
If you want to support our work, you can donate at npr.org/donate. And thank you.
Alexi Horowitz Ghazi
This episode was produced by Emma Peaslee and James Sneed, with help from Willa Rubin. It was edited by Jess Jiang, fact-checked by Sam Yellowhorse Kessler, and engineered by Ko Takasugi-Czernowin. Alex Goldmark is our executive producer. I'm Alexi Horowitz Ghazi.
Mary Childs
And I'm Mary Childs. This is NPR. Thanks for listening.
Date: February 27, 2026
Hosts: Mary Childs, Alexi Horowitz Ghazi
Theme: Investigating the Replication Crisis in Social Science through the “Replication Games”
This Planet Money episode delves into the heart of the “replication crisis” — the troubling pattern where many scientific studies cannot be reproduced when their experiments or code are scrutinized. Mary and Alexi travel to Montreal to join economist Abel Brodeur at one of his “Replication Games,” a unique event rallying social scientists to audit and replicate published research papers. The episode explores how and why these problems persist, the efforts to confront them, and the potential of crowdsourced quality control to restore faith in published science.
Introduction to Abel Brodeur (00:56–01:16):
Why Replication is Necessary (01:47–02:13):
Abel’s Personal Experience (05:56–08:40):
Publishing Incentives and P-Hacking (08:45–09:40):
Empirical Evidence & Journal Resistance (09:40–10:54):
Making Replication Official (13:34–14:12):
Crowdsourcing Replication: The Games Begin (15:04–18:14):
How a Replication Game Unfolds (19:54–20:30):
Fieldwork Vignettes & Participant Perspectives
Replication in Practice (24:00–27:21):
Presenting Findings (30:45–31:14):
Handling Disagreement and Backlash
Abel’s Perspective on Success and Failure
Systemic Takeaway
In “Don’t Hate the Replicator, Hate the Game,” Planet Money chronicles how academic incentives and lax scrutiny have led to the replication crisis, with stories from inside the Replication Games, a hopeful, gamified approach to solving a systemic problem. Hosts Mary Childs and Alexi Horowitz Ghazi capture the cautious optimism and the very real tensions as social scientists confront uncomfortable truths about their field. Abel Brodeur’s initiative brings transparency, a bit of fun, and the subtle pressure academics need to be more rigorous, showing that while science may never be perfectly clean, it’s a lot cleaner when everyone knows that anyone, anywhere, could walk in and check for dust.