
Uri Simonsohn is a behavioral science professor who wants to improve standards in his field — so he’s made a sideline of investigating fraudulent academic research. He tells Steve Levitt, who's spent plenty of time rooting out cheaters in other fields, how he does it.
Advertiser
This episode is sponsored by Nordstrom. The Nordstrom Anniversary Sale is on now. It's a big deal, with new arrivals on sale for a limited time. Save up to 33% on brands like Ugg, AllSaints, Charlotte Tilbury, Steve Madden, Bobbi Brown and more. Plus, stock up on once-a-year beauty exclusives from winning brands. The best deals go fast in stores and at nordstrom.com. Prices go up August 4th.
It's over for dirty toilet brush bristles, because the Clorox ToiletWand makes cleaning your toilet so satisfying that it might become your favorite chore. The all-in-one toilet cleaning system is so surprisingly easy, it's no wonder they call it a wand. This thing cleans like magic. The Clorox ToiletWand comes with six scrubbing pads preloaded with disinfecting toilet cleaner. Just click, swish and toss for a surprisingly easy clean. It's time to simplify house cleaning. Visit Amazon to purchase your Clorox ToiletWand today. Use as directed.
Steve Levitt
As an academic, I did a lot of research trying to catch bad actors, everyone from cheating teachers to terrorists to sumo wrestlers who were throwing matches. What I didn't do much of, though, was try to catch cheating academics. My guest today, Uri Simonsohn, is a behavioral science professor who's transforming psychology by identifying shoddy and fraudulent research. Watching his exploits makes me wish I'd spent more time doing the same.
Uri Simonsohn
We talk about red flags versus smoking guns. So red flags, that gives you, like, probable cause, so to speak. But it's not enough to raise an accusation of fraud. For that, you need a smoking gun.
Steve Levitt
Welcome to People I Mostly Admire with Steve Levitt.
Uri Simonsohn and two other academics, Joe Simmons and Leif Nelson, run a blog called Data Colada, where they debunk fraud, call out cheaters, and identify misleading research practices. Uri, on his own, has been doing this work for over a decade. My Freakonomics friend and co-author Stephen Dubner spoke to the Data Colada team for a series about academic fraud that ran on Freakonomics Radio in 2024. But I admire Uri and his collaborators so much that I wanted the chance to talk to him myself. I started our conversation by asking about the research study that got him started in this direction, the study that he read and said to himself, my God, this is outrageous, I just can't take it anymore.
Uri Simonsohn
The first time I ever did any sort of commentary or criticism, I was asked to review a paper for a journal. What they were studying was the impact of your name on your big life decisions. The paper began with something like, it's been shown that people with the same initial are more likely to marry each other than expected by chance; in this paper, we check whether they have similar divorce rates, or something like that. The idea was, oh, maybe if you marry somebody just because they have your same initial, you're less compatible than if you follow more traditional motivations for marriage. And I thought, no way. And so I stopped reviewing and I went to the original paper. And it's not that I thought there's no way that your name impacts these decisions. I could imagine some mechanisms where, like, you're talking to somebody and they happen to share a name with you. That can be a nice icebreaker, and it could lead to a relationship. But what I thought is, how in the world do you study this credibly? And so that led me to an obsession. I went to the original and I thought, okay, clearly it's going to be some ethnic confound, for example, because different ethnicities have different distributions of last names, right? So very few South American people have a last name starting with a W, but many Asian people do. And so because Asians marry Asians and South Americans marry South Americans, that would explain it. But the original study was better than that. So it took me a while, and then I figured out what the problem was. The very first one I checked was with same-initial last names. So the idea is that if your last name starts with an S, you're more likely to marry somebody else whose name starts with an S. And what I found was that the effect was driven entirely by people who have the exact same last name. So I thought, why would people have the same last name and be more likely to marry? Like, how would that happen?
And a common mechanism is that there's a couple, they get married, she changes her last name to his, they divorce, and then they remarry each other. Now, this is rare. This is rare. But because you expect so few people to marry somebody else with the same last name, it's such a huge coincidence that even a small share of these people, they can generate an average effect that's sizable.
Steve Levitt
Oh, yeah, that's great. That's clever. So later, you and Joe Simmons and Leif Nelson published a paper in 2011 called False-Positive Psychology, and it turned out to be an incredibly influential paper. You and your co-authors highlight specific practices commonly done by researchers that can lead to drawing the wrong conclusions from a randomized experiment. The core of the paper is really made up of simulations that show quantitatively how various researcher choices lead to exaggerated levels of statistical significance, so that it appears in a published paper that a hypothesis is true even when it really isn't. What was the motivation behind writing that paper?
Uri Simonsohn
Me, Leif and Joe, we were going to conferences and we were not believing what we were seeing, and we were sticking with our priors. And so then what's the point? You should read a paper that surprises you and you should update. It doesn't mean you should believe with certainty it's true, but you should update. You should be more likely to believe it's true than you did before reading the paper. And we were not experiencing that.
It is one of the few academic papers that has caused me to actually laugh out loud, because as part of that paper, you describe in a very serious way an actual randomized experiment you yourself ran in which you find that listening to the song When I'm 64 by the Beatles actually causes time to reverse.
Will you still need me? Will you still feed me when I'm 64?
People who listen to that song, they're almost 1.5 years younger after they listen to the song than before. Obviously that makes no sense at all, which is the whole point. But you report the results in the same scientific tone that pervades all academic papers, and I found it to be hysterically funny. So let me start by giving my best attempt to describe the textbook version of a randomized experiment, the gold standard of scientific inquiry. So here's my attempt. The researcher starts by posing a clear hypothesis that he or she wants to test. So in your When I'm 64 paper, this hypothesis would be that listening to that song causes time to run in reverse, leaving people who listen to it younger after they listen to it than before. And then the researcher poses a second, alternative hypothesis called the null hypothesis, to which that first hypothesis is compared. In this case, the null hypothesis would probably be that listening to When I'm 64 does not cause time to reverse.
Right.
Then the researcher maps out a detailed experimental protocol to test these two competing hypotheses. And then, using very simple high-school-level statistics, you determine whether there are any statistically different changes in ages across the subjects in the two groups. And if I ran that experiment as described, you would be inclined to believe my results, whatever they were?
That's right.
Okay, so let's talk now about specifically how you used standard methods employed in the psychology literature at the time to prove that this Beatles song reverses time. The first common practice you talk about is measuring the outcome you care about in multiple ways, but only reporting results for the outcome variable that yields the best results.
So all you need to do is give yourself enough chances to get lucky. Think of p-values: even when there's no real effect, there's something like a 1-in-20 chance that any given test comes out significant, that you get lucky.
Right. So the p-value refers to the probability value, the likelihood of observing the effect you're studying just by chance. In academic publishing, for reasons I don't really fully understand, we've anointed the 5% level of statistical significance as some kind of magic number. So if, when your story's not really true, you'd only get data that looked like this less than 5% of the time, then that somehow magically leads people to say that your theory is true. If it's above 5%, then we tend to say, oh, you haven't proven yet that your theory is true.
So, like, suppose you have a friend who says, I can forecast basketball games, and they get it right for one game. You're like, well, it was 50-50 that you'd get it right, so I'm not impressed. Then they get two games in a row. Like, okay, that's more surprising; there's only a 25% chance that if you were just tossing a coin, you would get two basketball game guesses correct. But you're still not sufficiently convinced. But when they do five in a row, then you think, oh, maybe they can actually forecast basketball games, because it would be so unlikely to get five in a row just by chance that, I guess, the alternative becomes my candidate theory. So I guess you can predict basketball games. That's the logic of it. Like, at some point you're forced to get rid of just chance as the explanation.
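For readers who want to see the arithmetic behind the streak logic, here is a two-line sketch (our own illustration, not from the episode; the function name is made up):

```python
# Probability of guessing every game in a streak correctly
# when the "forecaster" is really just flipping a fair coin.
def chance_of_streak(n_games: int) -> float:
    return 0.5 ** n_games

print(chance_of_streak(1))  # 0.5
print(chance_of_streak(2))  # 0.25
print(chance_of_streak(5))  # 0.03125 -- already below the 5% convention
```

Five in a row is about a 3% coincidence, which is why at that point chance stops being a comfortable explanation.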
Okay, so in this When I'm 64 experiment, what you're saying is, it was like a friend asked you to predict the outcome of five basketball games in a row, but what you secretly did in the background is you actually predicted the outcome of five basketball games not once, but maybe a hundred times. You had a hundred different series of five basketball games, and there was one of them, out of those hundred, that actually gave you this crazy result. And then you reported that as what you got.
Yeah. Like, imagine your friend said, okay, I want to predict five basketball games, but they're also predicting five baseball games and five football games and five whatever games. And then whichever one worked is the one they tell you about, and the ones that didn't work, they don't tell you about. So if you have enough dice, even if it's a 20-sided die, with only a 1-in-20 chance, if you keep rolling that die, eventually it's going to work out. And so the way academics, or researchers in general, can throw multiple dice at the same time and get the significance they're looking for is they can run different analyses, which is what we did there. So we were comparing participants' ages across the two groups. So that's one die. But we actually had three groups. We had When I'm 64, we had a control song, which is a song that comes with Windows as a default in the background, and we also had the Hot Potato song, which is a kids' song, and so that could have had the opposite effect. So we had three. We could have compared Hot Potato to control, or we could have compared Hot Potato to When I'm 64, or When I'm 64 to control. So right there we have three dice that we were throwing. But we also could adjust our results. So the one we ended up publishing was controlling for how old the participants' fathers were. Okay? And the logic was, look, there's a lot of natural variation in how old people are, and so to be able to more easily detect differences induced by our manipulation, we want to control some of that noise, right? To take it into account. And so one way to take people's age into account indirectly is to ask how old their parents are, and so in a regression, statistically, we take that into account. And when we did that, the effect arose. And why? Because if you do that, now we have three more dice, right? We can do controlling for father's age, Hot Potato versus control; controlling for father's age, Hot Potato versus When I'm 64; and so on.
And so in the end, we had like many, many different ways we could cut the data. And the one that worked is the one we ended up writing about.
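The "many dice" effect is easy to simulate. This is a minimal sketch of our own, not the simulation code from the paper: under the null hypothesis every p-value is uniform on [0, 1], so trying k analyses and keeping the best one is like rolling k twenty-sided dice. (Treating the analyses as independent slightly overstates the effect, since real alternative analyses are correlated.)

```python
import random

random.seed(1)

def false_positive_rate(n_tests: int, trials: int = 100_000,
                        alpha: float = 0.05) -> float:
    """How often at least one of n_tests null results falls below alpha.
    Under the null, each p-value is an independent uniform draw."""
    hits = 0
    for _ in range(trials):
        if any(random.random() < alpha for _ in range(n_tests)):
            hits += 1
    return hits / trials

for k in (1, 3, 6):
    print(k, round(false_positive_rate(k), 2))  # about 0.05, 0.14, 0.26
```

With six dice (three comparisons, each with and without the father's-age covariate), a "finding" appears roughly one time in four even when nothing is there, close to the analytic value 1 - 0.95**k.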
The way you just described it, it's completely and totally obvious that you're cheating. If you test a bunch of outcomes and then you just choose not to tell people about the ones that don't work, and you focus all your attention on the one that does work, you're obviously misleading people into believing your hypothesis is more powerful than it really is. So how could the academics not realize this was bad science? Do you think they really didn't understand that this was cheating?
I do. I do, because I had many conversations where people were pushing back and saying there's nothing wrong with it. I think there's two ingredients to it. One of them is just not knowing the statistics. Most people who take statistics don't learn statistics for some reason. It's profoundly counterintuitive to humans. It's just not how we think. And the other reason is we're very good storytellers. And so what happens is, the moment you know what works, you immediately have a story for why it makes sense. I remember the first time I presented the When I'm 64 study, somebody in the audience asked a question, jokingly, about some decision we had made. My instinct was to immediately defend it. Like, we are just so trained. That's what you do. So I don't think people were cheating in the sense that they thought it was wrong. They just didn't know, and they didn't quite appreciate the consequences. I just want to say, it's not just psychology. This is very common in clinical research. If somebody's running an experiment, it can be in medicine, in economics, in biology. At the time, I was talking to scientists from all fields, and this is a very widespread problem.
Okay, so that's good to point out, because one could easily say, well, I'm not that worried if psychologists start messing around. But when medical researchers are messing around, now you're actually getting into things people really care about. Okay, let's talk about the second misleading research practice that you highlight, and this one's a lot more subtle than the one we just talked about. The researcher designs an experiment and carries it out, and then he or she looks at the data and sees that the results, oh, they're not quite statistically significant. Everything's going in the right direction, but it didn't quite reach this magic .05 threshold. So it seems sensible in that situation to say, well, look, I just didn't have enough power. That's what we call it in experimental design when you don't have enough research subjects to show that your true hypothesis is really true: you don't have enough power. And so maybe I'll just go and add another 15 or 20 observations and I'll see if it's significant. Oh, and maybe again it wasn't quite significant. I add 20 more. Boom, I'm over the threshold, and then I stop. Now, intuitively, this doesn't seem nearly as bad to me as not reporting all the outcome variables. But as you show, my intuition is wrong. This is actually a really bad practice. Can you try to explain why in a way people can understand?
Yeah, let's think of a sport. Let's say tennis, and let's say you're playing tennis with somebody who's similarly skilled as you are. And so beforehand, if you had to guess who's going to win, it's like a 50-50 chance. But suppose we change the rules, and Steve gets to say when the game ends. We don't play to three sets; we play to whenever you want to end it. Okay? And you're one of the two players.
Okay.
You can see that now you're much more likely to win the match.
Yeah, if I win the first point. Match over.
That's right. So if at any point during the game you are ahead, you win the game. And therefore now the probability is not that you win after three sets, but it's that you're ahead at any point in the game. And that necessarily has to be much more likely. And so similarly with an experiment, when we do the stats, what the math is doing in the background is saying, well, if you committed to 60, how likely is it that after 60 you will have an effect this strong? But what we should be asking is what is the likelihood that at any point up to 60 your hypothesis will be ahead by a lot? That's the question we should be asking. And that's necessarily much more likely.
Yeah. And the key is that you stop when you win and you keep on going when you're losing.
That's what introduces the bias. It's not a random decision. If you were to flip a coin, do I keep collecting subjects or not? Then there will be no problem. The problem is the way you said it, like, if you're losing, you keep playing, but if you're winning, then you end.
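The tennis analogy can be simulated directly. This is our own sketch, not anything from the paper: both groups are drawn from the same distribution (no true effect), and the simulated researcher peeks after every batch of ten subjects and stops the moment p < .05. For simplicity it uses a z-test that treats the standard deviation as known.

```python
import math
import random

random.seed(2)

def two_sided_p(xs, ys):
    """z-test for a difference in means, treating the sd as known (= 1)."""
    n = len(xs)
    z = (sum(xs) / n - sum(ys) / n) / math.sqrt(2 / n)
    return math.erfc(abs(z) / math.sqrt(2))

def optional_stopping_fpr(start=20, step=10, max_n=60, trials=10_000):
    """Both groups come from the same distribution (no true effect).
    Peek after every batch of `step` subjects; stop as soon as p < .05."""
    hits = 0
    for _ in range(trials):
        xs = [random.gauss(0, 1) for _ in range(start)]
        ys = [random.gauss(0, 1) for _ in range(start)]
        while True:
            if two_sided_p(xs, ys) < 0.05:
                hits += 1   # declared "significant" at some peek
                break
            if len(xs) >= max_n:
                break       # budget exhausted, give up
            xs += [random.gauss(0, 1) for _ in range(step)]
            ys += [random.gauss(0, 1) for _ in range(step)]
    return hits / trials

print(optional_stopping_fpr())  # noticeably above the nominal 0.05
```

Because the researcher wins if they are "ahead at any point" rather than only at the final score, the false-positive rate comes out roughly double the advertised 5%, exactly the asymmetry Uri describes.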
It's unlike the first point. The academics I would talk to about having multiple outcomes totally got why that wasn't legit. But I can still have conversations with experimentalists who will argue with me about this point and say I'm dead wrong: how can I not understand that this is a good research practice, not a bad research practice? But as you show in the paper, and as the intuition you just described explains, it's really a bad practice.
We're not good intuitive thinkers about statistics, especially about conditional probability, and this has that flavor. That's the source of the problem.
Steve Levitt
We'll be right back with more of my conversation with behavioral scientist Uri Simonsohn after this short break. People I Mostly Admire is sponsored by NetSuite. It's an interesting time for business. Tariff and trade policies are dynamic, supply chains squeezed, and cash flow tighter than ever. If your business can't adapt in real time, you're in a world of hurt. You need total visibility, from global shipments to tariff impacts to real-time cash flow. That's NetSuite by Oracle, your AI-powered business management suite, trusted by over 42,000 businesses. NetSuite is the number one cloud ERP, bringing accounting, financial management, inventory and HR into one suite. You have one source of truth, giving you the control you need to make quick decisions. With real-time forecasting, you're peering into the future with actionable data. And with AI embedded throughout, you can automate everyday tasks, letting your teams stay strategic. NetSuite helps you know what's stuck, what it's costing you, and how to pivot fast. If your revenues are at least in the seven figures, download the free ebook Navigating Global Trade: Three Insights for Leaders at netsuite.com/Admire. That's netsuite.com/Admire.
Steve Levitt
What I find so beautiful about this paper is that it is really so simple. It's so easy to understand. It's so obvious in a way, and yet a whole field of academics was totally blind to it until you pointed it out. And at least the way I've heard the story told, you and your co authors didn't think you'd even be able to publish the paper, much less imagine that it would emerge as one of the most important papers published in psychology in the last two decades.
We thought it was uncitable because we thought, how can you cite this paper, in what context? You'd have to say, well, we didn't do this weird thing that they talked about, citation. So we thought, okay, maybe it'll be influential, but it'll be hard to publish and it'll be uncitable. And it's been cited like crazy. We were super wrong.
The bottom line, which is really stunning, I think, to most people who read your paper, is that in your simulations, when you test a hypothesis that is not true by design, you've built these hypotheses not to be true, but you do all of these different tweaks together, then over 60% of the time you get statistically significant results for your hypothesis. Okay, these are 100% false hypotheses that over 60% of the time come out looking true. That's crazy. It surprised me. Did it surprise you how big that number was when you first ran the simulations?
We were floored. That's when we decided we definitely have to do the paper. Yeah.
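To see how the tweaks compound, here is a toy combination of the two practices just discussed. This is our own construction, not the simulation from the False-Positive Psychology paper: no true effect anywhere, the simulated researcher tracks two correlated outcomes plus their average, and keeps adding subjects until something clears .05, again using a simplified z-test with known variance.

```python
import math
import random

random.seed(4)

def p_from_z(z: float) -> float:
    """Two-sided p-value for a standard-normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def one_flexible_study(n0=20, step=10, max_n=50, r=0.5) -> bool:
    """No true effect. Each subject yields two correlated difference
    scores (correlation r, sd 1, mean 0). The 'researcher' tests
    outcome A, outcome B, and their average, and keeps adding batches
    of subjects until something is significant or max_n is reached."""
    a, b = [], []
    while True:
        for _ in range(step if a else n0):
            shared = random.gauss(0, math.sqrt(r))
            a.append(shared + random.gauss(0, math.sqrt(1 - r)))
            b.append(shared + random.gauss(0, math.sqrt(1 - r)))
        n = len(a)
        avg = [(x + y) / 2 for x, y in zip(a, b)]
        # (scores, variance of each score) for the three candidate analyses
        candidates = ((a, 1.0), (b, 1.0), (avg, (1 + r) / 2))
        ps = [p_from_z((sum(s) / n) / math.sqrt(v / n)) for s, v in candidates]
        if min(ps) < 0.05:
            return True    # "it worked" -- report this analysis and stop
        if n >= max_n:
            return False   # budget exhausted

trials = 10_000
rate = sum(one_flexible_study() for _ in range(trials)) / trials
print(rate)  # several times the nominal 5% false-positive rate
```

Even this mild flexibility (three correlated tests plus peeking) multiplies the false-positive rate; the paper's simulations, which stack more degrees of freedom, push it past 60%.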
So I've had a lot of psychologists as guests on this show, and they have reported some truly remarkable findings, and I suspect I should have been more skeptical of them than I was. But it's also odd to only believe research that confirms your beliefs. It's a hard line to walk, I guess. It's why we so desperately need credibility in research: when research is not credible, you just default to your own intuition. But if you're just defaulting to your own intuition, you're back to Socrates and Aristotle. You're no longer empirically driven.
One of the things we've been doing is advocating for pre-registration, which means people tell you how they will analyze the data before they actually run the study, closer to the way you were describing the ideal experiment. And there has been substantial uptake of this idea of pre-registration. So when you see the results, you can evaluate the evidence much closer to the face value of what the statistics tell you.
So in that initial paper, you laid out a simple set of rules for how to create a body of research that's more credible, and one of them is just pre-registration. Another really simple one is making your raw data available. I think this will amaze people who are outside of academia, but until recently, until after what you did, and in large part probably because of what you did, academics were not expected to let others see their raw data. And that has really been transformational, I think, don't you?
Yes, that's very important. It's easier to check for errors, it's easier to check for fraud if one is so inclined, and it's easier to even just look for robustness, the idea that, oh yeah, you get it with this particular model, but let me try something else, the way I usually analyze data, and see if I also get it that way. So that's become much more common. I wouldn't take too much credit for that. The Internet is probably a big source of why; it's just easier to upload and share than it used to be.
I'm going to say something controversial now. I have the sense that part of the reason that researchers 10 or 15 years ago were behaving so unscientifically, and still researchers are pretty unscientific, is that at some fundamental level nobody really cares whether the results are true or not. I get the sense that most social scientists see academic publishing as a game of sorts. The goal is to get publications and citations and tenure, and there's an enormous number of academic papers written each year and a nearly infinite supply of academic journals, so in the end very low-quality stuff ends up getting published somewhere. And except for a handful of papers, there's little or no impact on the broader world that comes out of this research. So it just isn't so important whether results are right. When I've bounced that hypothesis off other academics, they've gotten really mad at me, although I do believe there's a lot of truth to it. What do you think?
I think there's truth to it. There's definitely people who don't care whether it's true or not, because, maybe game is too strong a term, but their job is to publish papers. Their job is not to find new truths that other people can work with. At the same time, on our blog we've done 130 or so posts, and at least some of them are criticisms of papers. And we have a policy where we engage with the authors before we make anything public. We send them a draft and we ask them for feedback. Nobody likes it.
Nobody wants to hear from you. That's a disaster.
Nobody is in the mood of, I'm so excited to learn what the shortcomings of this paper are. But beyond that, they seem to really care. Now, it could be they just don't want to be shown to have been wrong, and there's some truth to that, but they do seem to really care about the phenomenon. I agree with you that there's a lot of people that don't care, but I think the higher up you go on the ladder, more influential researchers in more influential journals, I think they do care. In fact, if anything, I think they have an inflated sense of how important their work is, not the opposite. They think their work is really changing the world. They don't think of it as a game. They think their life is so important because they're really changing things. And part of my motivation is, I don't agree with that. I agree with you in the sense that I think most research is insufficiently valuable. Even most top research is insufficiently valuable. And maybe this is too naive of me, but I think if you make it harder to publish silly things and to publish sexy findings, the only hope then is to study something that's important. Even if it's not intrinsically interesting, it's going to be more important. And so that would move social science to be more of a force for greater welfare in society. I don't think we're there. I don't think social science is all that useful at the moment, but I do think it has the potential.
So we've been talking so far about how standard methodological practices can lead readers to falsely conclude that a hypothesis is true, because the way things are represented is misleading. Okay, but that only gets you so far in the abstract. What we really need is an actual tool that one can apply to a specific body of research to reliably judge whether the findings are credible. And damn you and Joe and Leif, you came up with that too. It's called the p-curve, and it is, again, simultaneously incredibly brilliant and incredibly simple. Can you give the intuition that underlies the p-curve?
Yep. So any one study is going to have a p-value. It's going to tell you how surprising the findings are. And remember, we're saying if something's below 0.05, it's significant; we're drawing this arbitrary line at 0.05, right? And so if you see one study ending at 0.04, that's okay, that's significant. If you see one study at 0.06, that's not significant. But the key insight, and it's related to stuff I had done with motivation and goals, is that if you're aiming for a goal, you're going to end pretty close to it. So, for example, if you're aiming to run a marathon in four hours, you're not going to run it in three and a half hours; you're going to run it at 3:58, 3:59, because that's your goal. The moment it's achievable, you stop going. And so the basic idea of p-curve is, if people are trying multiple things to get to 0.05, they're not going to go all the way to 0.01 or to 0.001. They're going to stop once they achieve it. So you start and your p-value is 0.12. You know you need a 0.05. And so you try something, you control for father's age, right? And then that gets you to 0.08. And then you drop the Hot Potato condition and you end up at 0.04. And then you stop. You don't keep going, because you've achieved your goal. So if I see a bunch of results and they're all 0.04s, then it becomes more likely that you are p-hacking. A lot of academics across all fields now use this term, p-hacking, which is about how you selectively report from all the analyses you did. If there's a true effect and the studies are significant, you expect very few 0.04s and you expect mostly 0.01s. If you read the literature, you don't see a lot of 0.01s. You see a lot of 0.04s and 0.03s, right? Which tells you something. So if you give me 20 studies and I look at the p-values and I see 0.04, 0.03, 0.04, 0.03, I should not believe those studies. And if I see 0.01, 0.01, 0.02, I should believe them. And so p-curve just formalizes that.
So it takes all the p-values, runs them through its magic sauce, and based on how many results are close to 0.05 versus far from 0.05, it tells you that you should believe it, that you should not believe it, or that you need more data to know.
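The shape argument behind p-curve can be illustrated with a quick simulation. This is our own sketch, not the actual p-curve tool: it bins the significant p-values from many simulated two-group studies (simplified z-test, known variance) and shows that a true effect piles the p-values up near 0.01, while no effect spreads them flat.

```python
import math
import random

random.seed(3)

def study_p(effect, n=30):
    """One two-group study of n per group; z-test with sd known (= 1)."""
    d = sum(random.gauss(effect, 1) for _ in range(n)) / n \
        - sum(random.gauss(0, 1) for _ in range(n)) / n
    z = d / math.sqrt(2 / n)
    return math.erfc(abs(z) / math.sqrt(2))

def p_curve(effect, studies=20_000):
    """Among significant studies, the share landing in each .01-wide bin
    from [0, .01) up to [.04, .05)."""
    ps = [p for _ in range(studies) if (p := study_p(effect)) < 0.05]
    bins = [0] * 5
    for p in ps:
        bins[int(p / 0.01)] += 1
    return [round(b / len(ps), 2) for b in bins]

print(p_curve(effect=0.5))  # true effect: right-skewed, most mass below .01
print(p_curve(effect=0.0))  # no effect: roughly flat across the five bins
```

A literature full of 0.04s and 0.03s looks like the flat (or, with p-hacking, left-skewed) curve, not the right-skewed curve a real effect produces, which is the signature p-curve formalizes.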
Okay, so given this idea of the p-curve, in practice, have you been able to debunk whole literatures by using this concept and show that things people believed, and where lots of papers were published, are probably not true at all?
I've only done one of that flavor, and it was controversial. I think now it's not so controversial. It was the power-posing literature. There was a very influential TED talk. The claim was that if you assume a powerful pose, meaning you expand your body, like imagine somebody raising both of their arms and standing up, then you become more confident, more successful; you would take more risk and things like that. And we applied p-curve to that literature, and we found that there was no evidence for it. There were maybe 30 published papers on it, and if you look at them, they provided no evidence.
So we've been talking so far mostly about mistakes possibly well-intentioned people make, researchers who seem to have truth in the back of their mind, even if they're not actually taking the steps that get them to the truth. But where I think what you've done gets a lot more fun and exciting is going after complete frauds: researchers who are actually outright making up data or faking experiments. So of all these fraud cases that you've been part of, the granddaddy of them all is the Francesca Gino case. Do you want to tell me about that?
Sure. A few years ago, maybe five years ago, four years ago, we were approached by young researchers with concerns, who wanted to remain anonymous. I had been in touch with them a few times about fraud. The first paper I had done detecting fraud was about 14 years ago, and so I had sort of moved on and I didn't want to do it anymore. And so they would approach me and I told them, look, unless the person is very famous or the paper is very influential, I don't want to get involved. It's very draining.
Okay, wait, so you had done some fraud research and then you swore off it? To an outsider it might seem like, wow, of all the things that might be really fun and exciting. As an academic, it would be revealing some horrible fraud who's, like, doing terrible things and ruining the profession. But you're saying you didn't actually enjoy that kind of work?
No, I hated it. I hated it. Because the first two days are fun and the next year are dreadful.
Because the first two days are the discovery process, where you're actually in the data and you uncover the patterns, and then the rest is the drudgery of getting to 99.999% sure. Because if you're wrong, you're really in trouble.
It's not enough to be right in your head with 100% certainty. You need to be certain that others will be certain.
And it's also a case where you have an adversary, right? Mostly when you do academic work, you write something and nobody really cares that much. But when you are saying someone else made up their data, you have created an enemy who will fight you to the death.
But it's also draining because you become, like, the world expert in an incredibly trivial, small piece of information. Like, as we'll talk about in a minute, with Francesca Gino we spent a lot of time on her Excel data spreadsheets. I'm, like, the world expert in study three that she ran, you know, 14 years ago. I know every row, and that's just really useless. We've talked about whether research is useful or not useful; it's debatable. But knowing how a particular spreadsheet was generated just feels so local. You're not learning anything. It's not fun. You get a lot of pushback.
You had done some research, you'd found fraudsters. And in part because you had a reputation for doing this, people would come to you with hot tips, with the idea that fraud was going on. So a stranger came to you with the belief that Francesca Gino was cheating on her research. And this is especially interesting because Francesca Gino is one of your own co authors in the past, which must have put you in a really interesting and complicated place.
Yep. So we have a paper together. I used to be at Wharton, I'm in Spain now. And we made her an offer when I was there. She ended up going to Harvard instead. So I knew her and we did have suspicions maybe 10 years ago. And we looked at the data that we had access to. We were subjectively convinced that something was wrong with it, but we didn't think we could prove it beyond reasonable doubt, and so we dropped it.
So then this young academic came to you, and she had better evidence. She convinced you that you could actually make a case out of it.
Yeah, it was two of them. And they sent me a report they had written, and I thought, this is promising. We talk about red flags versus smoking guns. A red flag is something that, in your experience, just doesn't seem right. That gives you probable cause, so to speak. But it's not enough to raise an accusation of fraud; that takes a smoking gun. I think that's where the report was at that stage. And then we said, can we get evidence that is sufficiently compelling that anybody looking at it would be immediately convinced?
Your Data Colada blog team is you, Joe, and Leif. How many hours do you think the three of you put in, on top of what these other two folks had done, to try to push it to that stage?
Hundreds.
Yeah. So a big, big investment. And it's not even so obvious why you're doing it. It probably did, in the end, further your academic career and bolster your reputation. But mostly this is a task that isn't rewarded much in academics.
It's funny: it definitely helped my policymaking career. I'm actively engaged in trying to change things in social science, and this definitely made it easier to do that. I don't know that it made it easier to publish my research that's not about fraud, because while people are happy that whistleblowers exist, nobody likes whistleblowers. It doesn't engender warmth. You know: I'm happy you exist, but I'm going to talk to somebody else during my lunch. But for my intentions to try to influence science and to have more credible research, this has been very good. We've received funding. We're about to launch a new platform that seeks to prevent fraud instead of detecting it. And that was only made possible by the attention that this case received.
Okay, so just to foreshadow what's going to happen: she is going to be fired from Harvard, her tenure revoked, as a result of this evidence you're collecting. So just to give listeners a flavor, of all the things you found in her data, what's the clearest evidence? What, to you, said, my God, there's no way this could happen except through outright fraud?
My favorite is, big picture: people were rating how much they liked an event that they participated in, 1 to 7. How much did you like it? Okay. And she needed people to like it less in one condition than the other, and in fact, that's what happened. And we proposed that the numbers had been changed: somebody had said they liked it a 7, but in the data it appears as a 1. This was a red flag that the junior researchers came to us with, because they were looking at the distribution of liking numbers, and they said, look, there's just this whole mass of people who are giving all sevens, and they entirely disappear. And that's true. It's surprising that you would move a bunch of sevens to ones and to twos. But what do I know? This is a weird task. I don't know those people. I don't know how people use this scale. So it was a red flag. It wasn't a smoking gun. And so what I told them was, look, if the numbers were changed, there may be a trail in the data in the other columns. In the columns that weren't changed, there should be a mismatch. What I was thinking when I said that was something like gender. So imagine that in general, women like this thing more than men, but for those people whose answers were changed, you don't see that effect for women, or something like that. That's what I was expecting. But we found a goldmine, which was a column where people had been asked to describe in words what the event was like. Okay? And so people used words like, that was great. I loved that. Best time of my life. Or, I hated it. I felt yuck, because it was a networking event. I felt disgusting selling myself in this event. And so the idea was, oh, maybe if the numbers were changed, there will be a disconnect between the words describing the very same event and the numbers summarizing it. Okay? And so we looked at those suspicious numbers, the ones that we suspected had originally been sevens. Ones, meaning they hated it.
And you look at the words, and the words said, best thing ever. Loved it. Okay? And then you looked at the other side, the people who gave sevens that we thought had been ones, and they said, I feel disgusted. Our hypothesis was that those values were changed. So what we tried to do, and this is why we reached out to Harvard originally, was say: look, if you go to Qualtrics, which is the platform where these studies are run, so the original data on the server, and you go to row 100 and to this column with the numbers, you will see that even though the data she posted has a 1, on the server the number is actually a 7. Here are the 20 rows you have to check. If we are right, those numbers are sevens. If we are wrong, those numbers are ones. And we thought Harvard could check it immediately, because they have access to the Qualtrics data. We didn't, and we thought maybe the following day we would know whether we were right or wrong. Because once you identify exactly how the data were modified and you have access to the original data, then you can check whether your hypothesis is correct or incorrect.
In the end, all of the original data which hadn't been available becomes available. And a third party firm was hired to analyze the original data and to compare it to the altered data. And this third party confirmed the conjectures you had made, and they also found other ways that she had altered the data that you hadn't found.
Yeah. A $25 million lawsuit later, the information was made public, and we realized, yeah, we were right.
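The consistency check Yuri describes, comparing each numeric rating against the free-text description of the same event, can be sketched in a few lines. This is a minimal illustration, not Data Colada's actual code: the keyword lists, row IDs, and example responses are all invented.

```python
import re

# Toy sentiment cues; a real analysis would use a proper sentiment model.
POSITIVE = {"great", "loved", "best", "amazing"}
NEGATIVE = {"hated", "yuck", "disgusting", "awful"}

def flag_mismatches(rows):
    """rows: (row_id, rating 1-7, free text). Return ids worth reviewing."""
    flagged = []
    for row_id, rating, text in rows:
        words = set(re.findall(r"[a-z]+", text.lower()))
        if rating <= 2 and words & POSITIVE and not words & NEGATIVE:
            flagged.append(row_id)  # "hated it" score next to glowing words
        elif rating >= 6 and words & NEGATIVE and not words & POSITIVE:
            flagged.append(row_id)  # "loved it" score next to disgusted words
    return flagged

rows = [
    (100, 1, "Best time of my life, loved it"),    # suspicious
    (101, 7, "I felt disgusting selling myself"),  # suspicious
    (102, 7, "That was great"),                    # consistent
]
print(flag_mismatches(rows))  # -> [100, 101]
```

Rows flagged this way are only red flags; as in the episode, the smoking gun comes from comparing them against the original copy on the server.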
Steve Levitt
You're listening to People I Mostly Admire. I'm Steve Levitt, and after this short break, I'll return with Yuri Simonson to talk about the $25 million lawsuit. In 2023, after accusing Francesca Gino of fraud, Yuri Simonson, his Data Colada colleagues, and Harvard University were sued by Gino for $25 million. I too have been frivolously sued by a disgruntled academic, and even when you know the facts are on your side, it still eats up a ton of time, energy, and money. I'm curious how Yuri felt while this legal threat was hanging over him.
Yuri Simonson
It was hard for a few weeks. I would say it was hard in part because funding wise, like, even if you're right, just defending yourself in the American legal system, it's very expensive.
And I heard that your university wasn't willing to pay for your defense, which I find infuriating. Is that really true?
No, no, they did end up paying for it. They did. The most generous one was my own school in Spain. But it was difficult, because this is unheard of in Spain. And it's also August; in August, nobody's working in Spain. Literally, the university is closed down. And I didn't know who to call when you get sued. No idea who that person is. So I found out the name of the person, I emailed them, and they said, is this really important? And I said, well, yes, we need to talk as soon as possible. But they were great. They were actually very generous. So we did what's called a motion to dismiss, which is where we tell the judge, this is ridiculous, and if the judge agrees, it's over. And that costs about $100,000. So the university said, we'll pay that. Now, if the judge disagrees and says, let's take it further, you go into what's called discovery, where both parties get each other's emails and documents and so on. That could be another few hundred thousand dollars. And none of the schools committed to funding us up to that point. So that was stressful, because we could be on the hook for something like a million dollars. And it's not that you'd owe up to a million dollars for something you did wrong, or a mistake, or an accident; you did nothing, and you have that liability. But then they did a GoFundMe for us.
So when the academic community heard about this, the GoFundMe project was started. It raised hundreds and hundreds of thousands of dollars in almost no time at all. And that made me feel really good, because it's a signal of how valuable the profession thinks your work is. It must have made you feel really good, right?
Yeah, it's the only time I've cried for professional reasons. It was an overwhelming feeling because you feel like it's you against the world and feeling like the community was supportive. That really was amazing for us.
Steve Levitt
The judge has thrown out all of her claims against you, although the lawsuit with Harvard is still ongoing. Now, the thing that's so crazy about this, which is almost like a bad Hollywood movie, is that as you were researching fraud by Francesca Gino, you ended up stumbling onto apparent fraud, in the same paper but a different part of it, by this leading behavioral scientist, Dan Ariely. How did that come to pass?
Yuri Simonson
We were talking with these younger researchers and we were looking for smoking guns. We were looking at this file, and they showed us very blunt evidence of fraud. This other study involved car insurance: people were self-reporting to the insurer how much they drove, and that influences the premium they pay on their policy. At the end of the year, you would write down the odometer reading in your car, and the insurer would compute how much you drove and then adjust the policy. And what this paper was showing is that you can get people to be more honest by having them sign before they enter the odometer reading instead of after doing so. The data was posted, and these younger researchers noticed and showed us: look, the distribution of miles driven is perfectly uniform between 0 and 50,000, meaning just as many people drove 1,000 miles as 3,000, 7,000, 50,000, etc. Every single number of miles is equally likely, equally frequent. But there's not a single person who drove 51,000 miles. Anybody who's looked at data will immediately know that's impossible. And a lot of people who have never looked at data but have just driven will realize that doesn't make any sense.
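The red flag Yuri describes, a mileage distribution that is too flat and stops dead at a cap, is easy to screen for. The sketch below is illustrative only: the data is simulated and the thresholds are arbitrary. It computes the share of drivers above the cap and a Pearson chi-square statistic of the histogram against a flat distribution; fabricated uniform data yields zero over-cap drivers and a suspiciously small statistic, while plausibly skewed data does neither.

```python
import random

def uniformity_red_flags(miles, cap=50_000, bins=10):
    """Return (count above cap, chi-square of the 0..cap histogram vs flat)."""
    over_cap = sum(m > cap for m in miles)
    counts = [0] * bins
    for m in miles:
        if 0 <= m <= cap:
            counts[min(int(m / cap * bins), bins - 1)] += 1
    expected = len(miles) / bins
    chi2 = sum((c - expected) ** 2 / expected for c in counts)
    return over_cap, chi2

random.seed(0)
# Fabricated data: perfectly uniform over 0..50,000, nothing beyond.
fabricated = [random.uniform(0, 50_000) for _ in range(1000)]
# Plausible data: right-skewed, with a tail of genuine high-mileage drivers.
plausible = [random.lognormvariate(9.4, 0.7) for _ in range(1000)]

print(uniformity_red_flags(fabricated))  # no one over the cap, chi2 small
print(uniformity_red_flags(plausible))   # some over the cap, chi2 large
```

A real forensic analysis would go further, but even this crude comparison shows why "perfectly uniform with nothing past 50,000" jumps out at anyone who has looked at data.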
Right, yeah. And so you probably presumed that Gino had cheated on that too, right? Because it's her paper. She was a co-author.
We did. We did. That would have been the first smoking gun in the Gino case. But I said, if my memory doesn't fail me, this feels too clumsy. It's not like the Gino stuff is brilliantly done, but this feels worse than that. It just doesn't feel like the other studies we were looking at. I was getting a flavor for the fingerprint, for what her data looks like: something funny happens in the extremes of the distribution. But this uniform business is just different. And so we said, well, let's see who created the file. And we saw Dan Ariely's name there, and that was the first time we ever really thought of Dan as possibly being involved in funny business. So we contacted them, and immediately Dan said, no, if anything went wrong, none of the other authors would be responsible for it; only I would be responsible. So he immediately took ownership of that. We had a blog post on that that drew a lot of attention. But then there were no other public data sets. And so our view at the time, and I don't think I've talked to people about this before, was: okay, who can get to the bottom of it? And we thought that with the insurance data, only an investigative reporter could. Somebody needed to go talk to the insurance company, was our thought, because they're.
The ones who had provided the original version of the data, which was later altered to no longer look like the original data. And you needed that comparison.
That's right. And we thought only an investigative journalist would get that. And that was actually true: a reporter for The New Yorker spent considerable time on it, and he was able to get the data, and he was able to find, in my mind, irrefutable proof that the data were altered after they were sent to Dan.
What I think is really strange and troubling is that the investigation of potentially crooked researchers falls on the institutions they work for, and those institutions have such strong incentives not to find them guilty. In stark contrast to the Gino case, where she's lost her tenure and Harvard's been very public about it, Duke has taken a very different stance. An investigation was done, but it was done extremely secretively. Duke hasn't talked about it at all, which is interesting to me, because it let Dan Ariely himself be the voice describing the outcome of the investigation. And he's no longer a named professor.
That's right.
Which is some form of punishment, but obviously a much less severe punishment than losing tenure. But I don't know, it seems to me like a failure of the institutions to police themselves.
Yeah, a couple of thoughts on that. One is that the most common outcome is secret: a secret resolution, an agreement. The university says, if you leave, we'll give you this much money, you stay quiet, and we're all happy. And they just say, we don't comment on labor issues or on the determinants of employment decisions at the university. But it's worth keeping in mind, comparing Dan to Francesca, that it's unlikely Duke was able to get data of the caliber that Harvard was able to get, just because of the nature of Dan's experiment versus Francesca Gino's. So I'm convinced, there's no room for doubt, that the insurance data is fraudulent. And I don't know of a plausible alternative explanation to Dan having done it, but it hasn't been proven. That's just my belief based on all the evidence that's available. So it's not just because it's a man or a woman, or more famous versus less famous, or Duke versus Harvard. The two cases aren't matched on the strength of the evidence of wrongdoing.
Do you think that these high-profile fraud cases will have, or have had, a big deterrent effect, scaring off others from cheating? If I were a cheater, I would be very afraid of you. But on the other hand, when the punishments that get handed out are so uneven, it really says, well, look, I might get caught, and it might be embarrassing, but it might not end my career, so I can do it. What's your feeling about the deterrent effect of what you're doing?
I don't know the facts, but I can tell you that, rationally, you shouldn't be less likely to commit fraud after this experience, because what you will learn is that there's no real punishment. If the worst thing that can happen to you is that you're fired, but without fraud you would have been fired anyway, it's still a win-win for somebody like that to commit fraud. There's no real disincentive, because the worst that can happen is that they don't get to do it anymore. And so that's why I think the rest of us have to take action to prevent it; to not be complicit in making it so easy for them.
Steve Levitt
You mentioned that you'd received funding for a platform that prevents fraud. Could you tell me more about that?
Yuri Simonson
So it's called AsCollected. It's a spin-off, so to speak, of our website for pre-registration, which is called AsPredicted. And the idea is some version of a data receipt. If you go to a conference and you buy lunch somewhere and you want to be reimbursed, in most cases you don't just tell the university, I spent $7; you need to show them a receipt. But if you tell them, I collected data, they don't ask you for a receipt. So the idea is to provide a written record of where the results come from, and that's a combination of how the data were obtained and how the data were analyzed. So the first question would be: is your data public or private? You would say it's private. And it would ask, can you name the company that gave you the data? You say yes or no. If you say no, it asks, do you have a contract that prevents you from disclosing who they are? And if you say yes, it asks who in your institution signed the contract. And then it asks, how did you receive the data? And you say something like, I received an email on such-and-such date with the spreadsheet that I analyzed. You indicate who received the data, who cleaned it, and who analyzed it. And the final output is two tables: a table with the when, what, and how, and a table with the who. We have experience with about 15 different cases of fraud; all of these would have been so much harder to pull off if you had to answer these simple questions, because now you have nowhere to hide. So the deliverable is a URL. You have a unique URL that has those two tables, and the idea is that journals, hopefully, will just require you at submission to include that URL. We think our customers here are deans, journals, granting agencies. They want to take care of fraud, but they want somebody else to take care of fraud. And so we're telling them: look, all you have to do is ask for the URL, and you've done your part.
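The question flow Yuri describes ends in a simple structured record. As a rough sketch of what such a "data receipt" might contain (all field names here are invented; the real platform's schema may differ):

```python
import json

def make_data_receipt(provenance, people):
    """Bundle a when/what/how table and a who table into one record."""
    required = {"source_type", "received_on", "received_as"}
    missing = required - provenance.keys()
    if missing:
        raise ValueError(f"incomplete receipt, missing: {sorted(missing)}")
    return json.dumps({"when_what_how": provenance, "who": people}, indent=2)

receipt = make_data_receipt(
    provenance={
        "source_type": "private",              # public vs. private data
        "company_named": False,                # an NDA prevents disclosure
        "nda_signed_by": "university counsel", # who signed the contract
        "received_on": "2014-03-02",
        "received_as": "spreadsheet attached to an email",
    },
    people={"received": "author A", "cleaned": "author A", "analyzed": "author B"},
)
print(receipt)
```

In the scheme described above, a record like this would live at a unique URL that journals could require at submission; the point is not the format but that there is nowhere to hide.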
So the forces pushing people to cheat, either in small ways or in really outright ways, are related to the really strong incentives that exist within academia and the high hurdle for tenure. Do you think the academic tenure system is broken, or do you think it's a pretty good system with strong incentives? Strong incentives have benefits, which are that they make people work hard and try to be creative, and costs, which are that in extreme cases people respond to incentives too strongly and in the wrong ways.
Many people blame the incentives for fraud, for p-hacking, and for all of these practices people engage in that lead to bad-quality outcomes. I don't so much blame the incentives. I think it makes sense for us to prefer researchers who publish over those who don't, and those who come out with interesting results over those who don't. That's a bit of a minority view; I'm okay with rewarding the person who's successful over the one who's not. But the part of the incentives that I think is broken is the lack of a penalty for low-quality inputs. And part of the reason for that is that it's so hard for reviewers to really evaluate the work. One way to think about the whole movement toward transparency is that it makes it easier to align the incentives. Given that we reward good outcomes, it's very important to make sure the inputs are legit. And the only way to get people who are doing this voluntarily to do that is to make it easy for them. It needs to be easy for them to know if there was p-hacking. It needs to be easy for them to know if you made a mistake. And that requires transparency.
Steve Levitt
Despite losing tenure from Harvard, Francesca Gino maintains her innocence. After Duke conducted its investigation into Dan Ariely, Ariely wrote a response that Duke approved. In it, he said that, quote, "Duke's administration looked thoroughly at my work and found no evidence to support claims that I falsified the data or knowingly used falsified data."
In general, I've been a long-time advocate for making data analysis a core topic in K-12 education. My goal isn't to turn everyone into a data scientist; it's to equip the next generation to be thoughtful consumers of data analysis. Yuri Simonson is providing an incredibly valuable service debunking individual studies and developing strategies and tools for rooting out bad data analysis. But there's only one Yuri. We would need thousands of Yuris to keep up with the flow of misleading studies. Everyone needs to be able to protect themselves by knowing enough about data analysis to be an informed consumer and citizen. Meanwhile, if you'd like to hear more about the problem of fraud in academic research and the steps some people are taking to fight it, check out episodes 572 and 573 of Freakonomics Radio, which you can find in your podcast app.
Steve Levitt
This is the point in the show where I welcome on my producer Morgan to tackle a listener question.
Morgan
Hi Steve, so in our last new episode we had an interview with climate scientist Kate Marvel. And at the end of the episode you polled our listeners. You wanted to know whether people were optimistic or pessimistic about our future climate, 50 years in the future. So 50 years from today, you wanted to know if A, they were optimistic or pessimistic, B, their age, and C, their country. And you have tallied up the responses?
Steve Levitt
I have. So, as usual, we got a very enthusiastic response from our listeners. So let me start with the most basic question, Morgan. What share of respondents would you say were optimistic?
Morgan
I'm going to go against my gut, which is never a good idea, but I'm going to say 65% were optimistic.
Steve Levitt
All right.
So the answer was 42.6%, which, when you read the responses, you just realize what a terrible question it was, because nobody really knew how to answer it. And there were a fair number of wafflers; I left those out of that calculation. About 15% of people clearly waffled. They didn't want to take a stance. But what was interesting, and it really was what prompted the question in the first place, is that the kinds of logic and arguments and data that people sent were pretty similar for the optimistic and the pessimistic. It's just a really hard forecasting problem, and I think for a lot of people, trying to make this black-and-white comparison between pessimistic and optimistic was just a really hard challenge.
Morgan
So do you mean that people who are optimistic or pessimistic were pointing to the same information and then just coming away with different opinions about it?
Steve Levitt
Yeah, I would say the responses were remarkably thoughtful. And the people who were pessimistic gave really good arguments about why they should be pessimistic. And I think they were the kind of arguments that optimists wouldn't disagree with. And the same with the optimists. At some basic level, it's probably just not that clear whether you should be optimistic or pessimistic.
Okay, so that's the one piece of data that we collected that is really legitimate. Now, what we're going to do next is data analysis the way psychologists did it 20 years ago. It's exactly Yuri Simonson's point in that paper about "When I'm 64." I built in lots of degrees of freedom into my survey, because I know how old people are, I know what country they're from, and I can deduce their gender based on their name. But then there are also a lot of subtle dimensions, like did they respond within the first 24 or 48 hours? Did they respond in the morning or at night? So in this kind of setting, you have almost infinite possibilities to try to create something interesting when there's nothing interesting.
And I really want to highlight that, because if you're a passive listener, even after this episode, which just emphasized how people with data can kind of trick you, I think there's a good chance I could have tricked you by talking about what we're going to do next like it's science, when really there's nothing scientific about it at all. It's just a way to try to have fun with data.
Morgan
Okay, so what was the first lever?
Steve Levitt
The first lever is age.
And actually, as I suspected when I did the survey, the data about the demographics turned out to be more interesting to me than the answers about optimism and pessimism. So what do you think the median age was among our respondents?
Morgan
46.
Steve Levitt
Not bad. 42.
I was expecting younger. Okay, so then let's tackle the question. If you divide our sample into the people who are younger than the median age, so younger than 42, versus older than 42, which group do you think came back as more optimistic about the future of the climate?
Morgan
I think the younger people were more optimistic and older were more pessimistic.
Steve Levitt
So that is true.
47% of the younger people were optimistic versus 39% of the older people. Now, that is not statistically significant. None of the things I'm about to tell you related to pessimism or optimism turns out to be significant.
This was actually one of those rare cases where, even when I tried to cut the data in a bunch of different ways, I could not find a single one that was statistically significant.
So there was really a whole lot of nothing going on in this data. I couldn't even do gender, because, as usual, we have this incredible gender skew: this time, 84% of the respondents were men, based on my analysis of their names. Given that, it turns out men were slightly more optimistic, but again, not statistically significant at all. Okay, so geography is the last one I want to talk about. What share of respondents do you think are from the United States?
Morgan
75%.
Steve Levitt
Yeah.
So I would have expected two-thirds, because two-thirds of our downloads are in the United States. But only 49% of the respondents were American. In particular, the thing that was completely and totally crazy is this: Canada represents about 7.5% of our downloads, and over 20% of our responses were from Canada, which is just really interesting. Just to put it in perspective, 40% of the women who responded were Canadian.
Morgan
Wow.
Steve Levitt
If it weren't for the Canadian women, we would have hardly had any women at all. Now, the Canadians didn't break as either particularly pessimistic or optimistic, but it's really interesting that they were so engaged. The Australians were the same thing: the Australians were about three times as likely to respond as their share of downloads would predict.
Morgan
That's not surprising. We have a very active Australian listener base.
Steve Levitt
Yeah, that's absolutely true. So those are the only two things for which I found statistical significance in the entire data set: the Canadians and the Australians were very fervent responders.
Morgan
Was there another lever you pulled?
Steve Levitt
Well, I did what Yuri talked about, which is I looked at all of the cross tabs. Okay, what about foreign women, or older Americans? And none of those showed anything at all. Honestly, I kind of ran out of steam after a little while trying to look at all of the different levers, because the stakes were low. If I had actually done this as an experiment, invested lots of time, and I were a psychologist 20 years ago, I probably would have put a lot more effort into cutting the data all the different ways, because I'd be looking at an important publication. Whereas here, I'm just trying to fill a couple of minutes on a podcast.
Morgan
Listeners, if you have a question for us, if you have a problem that could use an economic solution, or if you have a guest idea for us, our email is pima@freakonomics.com. That's P-I-M-A at freakonomics.com. We read every email that's sent, and we look forward to reading yours.
Steve Levitt
In two weeks, we're back with a brand-new episode featuring Nobel Prize-winning astrophysicist Adam Riess. His research is challenging the most basic ideas we have about the universe.
As always, thanks for listening, and we'll see you back soon.
Steve Levitt
People I Mostly Admire is part of the Freakonomics Radio Network, which also includes Freakonomics Radio and The Economics of Everyday Things. All our shows are produced by Stitcher and Renbud Radio. This episode was produced by Morgan Levey and mixed by Jasmin Klinger. We had research assistance from Daniel Moritz-Rabson. Our theme music was composed by Luis Guerra. We can be reached at pima@freakonomics.com. That's P-I-M-A at freakonomics.com. Thanks for listening.
Yuri Simonson
It's funny, I forget how old I am.
Steve Levitt
The Freakonomics Radio Network: the hidden side of everything.
H
Stitcher.
Podcast Summary: People I (Mostly) Admire – Episode 163. The Data Sleuth Taking on Shoddy Science
Title: The Data Sleuth Taking on Shoddy Science
Host: Steve Levitt
Guest: Yuri Simonson
Release Date: August 2, 2025
Podcast Network: Freakonomics Radio + Stitcher
Episode Focus: Combating fraudulent and misleading research in academia through data analysis and investigative techniques.
Steve Levitt welcomes Yuri Simonson, a behavioral science professor and member of the Data Colada team, which includes Joe Simmons and Leif Nelson. Data Colada is renowned for its efforts to debunk fraudulent research, call out cheaters, and identify misleading research practices in academia.
Quote [01:19-01:21]:
Levitt: "My guest today, Yuri Simonson, is a behavioral science professor who's transforming psychology by identifying shoddy and fraudulent research."
Quote [01:33-02:10]:
Simonson: "I did some commentary or criticism, I was asked to review a paper for a journal... I went to the original paper... It took me a while, and then I figured out what the problem was."
Yuri recounts his initial foray into identifying flawed research, sparked by a seemingly outrageous study on the impact of initials on life decisions. His meticulous analysis revealed that the study's conclusions were driven by improbable coincidences, such as couples remarrying with the same last name post-divorce.
In 2011, Simonson, along with Simmons and Nelson, published the influential paper "False Positive Psychology." The paper highlighted how common research practices, such as selective reporting and multiple testing, can lead to exaggerated statistical significance, making false hypotheses appear true.
Motivation Behind the Paper:
Quote [05:33-05:36]:
Levitt: "What was the motivation behind writing that paper?"
Quote [05:36-05:55]:
Simonson: "We were going to conferences and we were not believing what we were seeing... There's nothing alarming enough to make us update our priors."
Simonson humorously discusses a fabricated experiment from their paper where listening to "When I'm 64" by The Beatles supposedly causes time to reverse, illustrating how scientific tone can mask absurd results.
P-hacking involves manipulating data or analyses until statistically significant results are achieved. Simonson explains that practices like multiple comparisons and selective reporting inflate the probability of false positives.
Francesca Gino Case:
Simonson details the investigation into Francesca Gino's research at Harvard, where anomalies in the data led to accusations of fraud. The team uncovered inconsistencies between survey responses and their accompanying free-text descriptions; the accusations later prompted Gino to file a $25 million lawsuit against the team and Harvard.
Dan Ariely Case:
Similarly, Simonson discusses suspicions of data manipulation in Dan Ariely's research on car insurance, where a uniform distribution of miles driven was impossible, indicating potential fraud.
Facing accusations of fraud leads to significant personal and professional strain. Simonson recounts the $25 million lawsuit filed against him, his team, and Harvard by Francesca Gino. Despite initial resistance, support from the academic community through a successful GoFundMe campaign alleviated some pressures.
Simonson critiques how academic institutions handle misconduct cases, noting inconsistencies in punitive measures. While Harvard took definitive action against Gino, Duke University handled Dan Ariely's case more discreetly, highlighting a systemic failure to uniformly police academic fraud.
Simonson emphasizes that current deterrents are insufficient, as the repercussions for fraud are often minimal compared to the incentives to cheat. He advocates for proactive measures to prevent fraud rather than merely detecting it post-factum.
Tools Developed:
P-Curve Analysis: Simonsohn explains the p-curve, a tool for assessing the evidential value of a body of findings by analyzing the distribution of significant p-values across published studies.
AsPredicted Platform: A pre-registration platform that has researchers document their hypotheses, data collection plan, and intended analyses up front, making it harder to p-hack or manipulate data without detection.
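The p-curve intuition above can be sketched in a few lines of Python. This is a deliberately simplified diagnostic, not the full method from Simonsohn, Nelson, and Simmons (which fits the whole distribution), and the sample p-values are invented: among significant results, genuine effects pile up very small p-values (a right-skewed curve), while p-hacked null effects cluster just under .05.

```python
def p_curve_share_below_025(p_values):
    """Crude p-curve diagnostic: among significant results (p < .05),
    what share falls below .025? Well above 0.5 suggests right skew
    (evidential value); near 0 suggests a flat or left-skewed curve."""
    sig = [p for p in p_values if p < 0.05]
    if not sig:
        return None               # nothing significant to diagnose
    return sum(1 for p in sig if p < 0.025) / len(sig)

# Hypothetical literatures (made-up p-values for illustration):
evidential = [0.001, 0.004, 0.009, 0.012, 0.020, 0.031, 0.046]
p_hacked   = [0.041, 0.048, 0.037, 0.044, 0.026, 0.049, 0.033]

print(p_curve_share_below_025(evidential))  # 5/7, about 0.71: right-skewed
print(p_curve_share_below_025(p_hacked))    # 0.0: everything hugs .05
```

As Simonsohn puts it in the quote below, the p-curve "takes all the P values... tells you you should believe it, you should not believe it, or you need more data to know."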
Simonsohn argues that the academic tenure system, while it rewards creativity and hard work, also creates strong incentives to favor quantity over quality in research output. The near-total absence of penalties for low-quality or fraudulent work exacerbates the problem.
To illustrate how data can be sliced in search of a desired outcome, Levitt runs his own experiment, surveying listeners on their optimism or pessimism about the future of the climate. Despite multiple attempts at slicing the data, none of the cuts reach statistical significance, a reminder that p-hacking is easy to attempt even when a particular fishing expedition happens not to "succeed."
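Levitt's slicing exercise can be mimicked in Python. Everything here is synthetic: the 500-respondent sample, the noise-only optimism score, and the 12 demographic splits are all invented for illustration. With pure-noise data and a dozen cuts, roughly 46% of such fishing expeditions (1 − 0.95^12) would land at least one p < .05, though, as in Levitt's survey, any single run may come up empty.

```python
import math
import random

random.seed(42)

def two_sample_p(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    z = (ma - mb) / math.sqrt(va / len(a) + vb / len(b))
    return math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))

# 500 respondents whose "climate optimism" score is pure noise,
# plus 12 made-up binary demographics that are also pure noise.
n = 500
optimism = [random.gauss(0, 1) for _ in range(n)]
splits = [[random.random() < 0.5 for _ in range(n)] for _ in range(12)]

# Slice the same null data 12 different ways and keep every p-value.
p_values = []
for flags in splits:
    group_a = [o for o, f in zip(optimism, flags) if f]
    group_b = [o for o, f in zip(optimism, flags) if not f]
    p_values.append(two_sample_p(group_a, group_b))

print(f"smallest p across 12 slices: {min(p_values):.3f}")
```

Reporting only that smallest p-value, without disclosing the other eleven slices, is exactly the selective reporting that pre-registration is designed to expose.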
Uri Simonsohn underscores the need for systemic change in academic research practices, advocating greater transparency, pre-registration of studies, and open data to improve research credibility and prevent fraud.
Upcoming Episodes and Additional Resources:
Levitt encourages listeners to explore further episodes on academic fraud, specifically episodes 572 and 573 of Freakonomics Radio, for more in-depth discussions.
Key Takeaways:
Academic Fraud is Pervasive: Common research practices can inadvertently or intentionally lead to false positives, undermining the credibility of scientific findings.
Data Transparency is Crucial: Tools like the p-curve and pre-registration platforms like AsPredicted are essential to promoting honest and reproducible research.
Institutional Failures: Academic institutions often fail to uniformly address and penalize fraudulent research, perpetuating a culture where the quantity of publications is valued over quality.
Systemic Incentives Need Reform: To curb academic fraud, the incentive structures within academia must shift towards rewarding quality, transparency, and integrity in research.
Notable Quotes:
Uri Simonsohn [04:56]:
"All you need to do is give yourself enough chances to get lucky."
Simonsohn [24:22]:
"P-curve just formalizes that. It takes all the P values... tells you you should believe it, you should not believe it, or you need more data to know."
Simonsohn [47:47]:
"One of the parts of the incentives that is broken is the lack of a penalty for low quality inputs."
This episode sheds light on the critical issue of research integrity in academia, highlighting the role careful data analysis and transparency play in maintaining the credibility of scientific inquiry. Uri Simonsohn's work with Data Colada exemplifies the ongoing battle against fraudulent research practices, and his advocacy for systemic change points toward more reliable and trustworthy academic findings.