
Gregory McNiff
Welcome to the New Books Network. I'm your host, Gregory McNiff, and I'm excited to be joined by Alan Downey, the author of Probably Overthinking It: How to Use Data to Answer Questions, Avoid Statistical Traps, and Make Better Decisions. The book was published by University of Chicago Press in the United States in December of 2023. Alan Downey is a principal data scientist at PyMC Labs and professor emeritus of computer science at Olin College. He is the author of Think Python, Think Bayes, and Think Stats, among other books. He writes about statistics and related topics on his blog, Probably Overthinking It. I selected Probably Overthinking It because Alan shows how important and interesting statistics are to our everyday lives, even if you're not a mathematician. After reading this book, you walk away with a much greater appreciation for how to interpret data, as well as the ability to do it. Hello, Alan, thank you for joining me today to discuss your book.
Alan Downey
Thank you for having me. I'm looking forward to it.
Gregory McNiff
Perfect. Alan, why did you write Probably Overthinking It, and who is the target reader?
Alan Downey
Sure. I started writing in part because I had a lot of ideas and stories that I had collected. I think the thing that came up every time is that there's often a statistical idea and then a real-world implication. There was always an example where I think people sometimes misinterpret the world because they're seeing it through a statistical lens, and they don't always see the ways that lens can distort the real world. But the flip side is I really want a positive message that says we really can use data to understand the world better. I don't want it all to be fear and, oh, you're going to get it wrong, anything like that. I think we really can understand the world. Data is important. So that's the idea I wanted to get across.
Gregory McNiff
Nice. I don't think anyone would disagree. The importance of data seems to grow every single day. It almost feels like we're inundated with it. Alan, before we get into some of these real-world examples you discuss in the book, can you define a few terms? Specifically, I'm thinking of the Gaussian curve, the central limit theorem, and the cumulative distribution function.
Alan Downey
The Gaussian curve: I think a lot of people are familiar with this, although they might not know it by that name. It is the bell curve, also called a normal distribution, and it is the shape of a lot of measurements that we take in the world. And the second part of what you asked about, the central limit theorem, is partly the reason that we see so many Gaussian distributions, which is that if you have a lot of random factors and you add them up, the sum tends to follow that bell-shaped curve. And so we see it if you measure people, and one of the examples in the book is a data set from the military where they measure all kinds of dimensions. For most people, most animals, a lot of things in the world, if you measure something like the length of a forearm across 100 individuals, you're going to see something that looks like a bell curve. And you can imagine what some of those random factors are. There are genetic things, there are environmental things that contribute to all of these measurements. So when you add them up, you tend to see these bell curves.
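Editor's note
A minimal sketch of the central limit theorem idea described above, assuming each measurement is the sum of 20 independent uniform random factors. The factor count, sample size, and function names are illustrative assumptions, not taken from the book. If the sums are roughly bell-shaped, their skewness should be close to 0 and about 68% of them should fall within one standard deviation of the mean.

```python
import random
import statistics


def simulate_sums(n_people=10_000, n_factors=20, seed=1):
    """Each simulated measurement is the sum of many small random factors."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(n_factors)) for _ in range(n_people)]


def skewness(values):
    """Sample skewness: close to 0 for a symmetric, bell-shaped distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])


def fraction_within_one_sd(values):
    """Share of values within one standard deviation of the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(abs(v - mean) <= sd for v in values) / len(values)


sums = simulate_sums()
print("skewness:", round(skewness(sums), 3))
print("within one sd:", round(fraction_within_one_sd(sums), 3))
```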
Gregory McNiff
Nice. And that's a great segue into my next question: these military tests. You talk about the ANSUR data set as well as personality tests. And you conclude that, with enough measurements, being weird is almost normal and maybe there is no real average. Could you talk about that?
Alan Downey
Yeah, this came originally from a study that the Air Force did. They were measuring pilots in order to design a cockpit that would fit everybody. And they figured out that there were about 10 different measurements that mattered, because the seat has to fit well and you have to be able to reach all of the controls. So they took these 10 measurements and they wanted to find a pilot who was pretty close to average for all 10. And what they found is that out of a thousand pilots, there were none of them that would fit into a cockpit if they had to be average at everything. Thinking about forearms again, your forearm might be a little bit bigger than average. Your foot size might be a little bit smaller than average. If you take 10 measurements, it's unusual to find anybody who's average on all 10. And then in this later data set, they took 96 measurements. So that was, for me, an interesting way to follow up the original study with 10. And now, looking at 96, you can start to tally up all the ways that you can be weird. So, for example, ear protrusion is one of the measurements that they took: how far your ears stick out from the side of your head. And if it's unusually large, that's one of the things that people might notice; they'll say, yeah, that guy's ears stick out a little bit. But if there are now 96 ways to be weird, the chances are that everybody is going to be far from the mean on a few of them. And what happens as you get more and more measurements is that everybody's weird in about the same number of ways. And so that's the title of the chapter: we are all equally weird. Now, it's not strictly true. I'm exaggerating a little bit, but there's a mathematical idea there. In a multidimensional probability distribution, most of the probability density lies in a thin shell that surrounds the origin. So it's not like a bullseye where all of the density is right in the middle. It's sort of like a target that is a big, thin sphere in a multidimensional space.
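Editor's note
A rough sketch of the two claims above, under a strong simplifying assumption: every measurement is an independent standard normal score, which real body measurements are not. The 0.3-standard-deviation definition of "average", the sample size, and the function names are hypothetical choices for illustration. The first function counts people who are near average on every measurement; the second looks at how far each person sits from the all-average point.

```python
import math
import random
import statistics


def simulate_people(n_people, n_measurements, seed=2):
    """Each person is a list of independent standardized measurements (z-scores)."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(n_measurements)] for _ in range(n_people)]


def fraction_average_on_all(people, threshold=0.3):
    """Share of people within `threshold` standard deviations of the mean on every measurement."""
    hits = sum(all(abs(z) <= threshold for z in person) for person in people)
    return hits / len(people)


def distances_from_origin(people):
    """Distance of each person from the point that is exactly average on everything."""
    return [math.sqrt(sum(z * z for z in person)) for person in people]


def relative_spread(values):
    """Standard deviation divided by the mean: small values mean a thin shell."""
    return statistics.pstdev(values) / statistics.fmean(values)


people_10 = simulate_people(1000, 10)
people_96 = simulate_people(1000, 96)

print("average on all 10:", fraction_average_on_all(people_10))
print("relative spread of distance, 10 measurements:",
      round(relative_spread(distances_from_origin(people_10)), 3))
print("relative spread of distance, 96 measurements:",
      round(relative_spread(distances_from_origin(people_96)), 3))
```

With more measurements, nobody sits near the origin, and the distances cluster in a narrower band relative to their size, which is the "thin shell" picture.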
Gregory McNiff
That's a nice analogy. I want to move on to another idea of average versus outlier. Specifically, you ask a great question. Why do most people have fewer friends than their friends? Could you unpack that and discuss the role of the inspection paradox in explaining that dynamic?
Alan Downey
Yeah, that's the friendship paradox, which is that if you choose one of your friends at random, the chances are that that person is more popular than you are in the sense of having more friends. And it's called a paradox because it's counterintuitive when you first realize it. But then when you think about it, there's this mechanism, and it's the inspection paradox, which is that you see different measurements depending on how you select the thing that you measure. So if you pick a totally random person, maybe they have a lot of friends, maybe they have a few friends, but you're going to see an unbiased distribution of friends. But now if you say, okay, I'm going to pick a random person and then I'm going to pick one of their friends, somebody who has a lot of friends is more likely to get selected. And in fact, if you have, let's say, X friends, you are oversampled by a factor of X. And so people with a lot of friends get oversampled, and people with few friends get undersampled. You can see this in the extreme, which is that if there's someone who has zero friends at all, they have zero chance of being selected. And so you're getting this subtly biased view of what's going on in the world. And you don't realize it unless you think about that selection process.
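Editor's note
A small sketch of the friendship paradox on a made-up five-person network. The names and friendships are hypothetical. It compares the average number of friends of a randomly chosen person with the average number of friends of a friend, using the two-step selection described above: pick a person, then pick one of their friends.

```python
from statistics import fmean

# Hypothetical friendship network: each person maps to their list of friends.
friends = {
    "Ana": ["Ben", "Cai", "Dee", "Eli"],
    "Ben": ["Ana", "Cai"],
    "Cai": ["Ana", "Ben"],
    "Dee": ["Ana"],
    "Eli": ["Ana"],
}


def friend_count(person):
    return len(friends[person])


def mean_friend_count():
    """Average number of friends for a person chosen uniformly at random."""
    return fmean([friend_count(p) for p in friends])


def mean_friend_count_of_a_friend():
    """Pick a random person, then one of their friends at random; average that friend's count."""
    per_person = [fmean([friend_count(f) for f in friends[p]]) for p in friends]
    return fmean(per_person)


def people_with_fewer_friends_than_their_friends():
    """People whose own friend count is below the average count among their friends."""
    return [p for p in friends
            if friend_count(p) < fmean([friend_count(f) for f in friends[p]])]


print(mean_friend_count())              # 2.0
print(mean_friend_count_of_a_friend())  # about 3.1
print(people_with_fewer_friends_than_their_friends())
```

In this toy network, four of the five people have fewer friends than their friends do on average, because the one well-connected person appears in everyone else's friend list.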
Gregory McNiff
And what's interesting is you actually apply the same concept to, I don't want to say more serious, but significant societal decisions. And here I'm talking about the legal system and the rate of recidivism. Could you briefly talk about that? Because that obviously has real-world consequences for people.
Alan Downey
So that selection process that we talked about with friends comes up over and over. And once you become aware of it, you start to see it everywhere. One example that I talked about in the chapter is the rate of recidivism. So if someone is convicted of a crime and they serve a sentence in prison and then they're released, there's some chance that they're going to go straight and never be convicted of another crime, or maybe they do commit more crimes. Recidivism means that they go on and commit more crimes. There are two ways to measure that. One of them is to observe people when they are convicted for the first time and then check whether they are convicted of another crime in the future. If you do that, you get an unbiased measurement of the recidivism rate. But that's not how people usually do it. What they usually do is, when someone is released from prison, they follow them for some period of time, I mean not literally follow them around, but statistically measure them, and check whether they are convicted of another crime, let's say within a two-year period. Now here's the problem. If you select people on the day that they are released from prison, you are oversampling people who commit multiple crimes, because someone who serves only one sentence in prison only gets released once. Somebody who serves 10 sentences gets released 10 times. And so if you sample by, in some sense, watching people go out the prison gates, you're oversampling recidivists, and you will overestimate the rate of recidivism. And people use this number all the time when they talk about criminal justice. If the recidivism rate is very high, they might say, well, we need to keep people in prison longer because it's not working. And that's misleading, because you are overestimating the actual rate of recidivism.
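Editor's note
A sketch of the two sampling schemes with made-up counts. The population below is hypothetical and is not the data used in the book. The individual-based rate counts each person once. The event-based rate counts each release from prison once, so a person with k sentences is counted k times.

```python
# Hypothetical population: total sentences served -> number of people.
people_by_sentences = {1: 72, 2: 15, 3: 8, 5: 5}


def individual_based_rate(counts):
    """Share of people who served more than one sentence."""
    total_people = sum(counts.values())
    recidivists = sum(n for k, n in counts.items() if k >= 2)
    return recidivists / total_people


def event_based_rate(counts):
    """Share of releases that belong to people who served more than one sentence.

    Sampling at the prison gate counts a person once per release,
    so someone with k sentences is oversampled by a factor of k.
    """
    total_releases = sum(k * n for k, n in counts.items())
    recidivist_releases = sum(k * n for k, n in counts.items() if k >= 2)
    return recidivist_releases / total_releases


print(round(individual_based_rate(people_by_sentences), 3))  # 0.28
print(round(event_based_rate(people_by_sentences), 3))       # 79 / 151, about 0.523
```

The same people produce a much higher apparent rate when the sample is drawn from releases rather than from individuals.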
Gregory McNiff
Yeah, that was the second most alarming stat or conclusion you drew, and I want to get to number one down the road. But I just want to read what you say in the book: in the individual-based sample, most prisoners serve one sentence and only 28% are recidivists, but when you go to an event-based sample, the rate rises to 49%. So, man, that's a dramatic difference. And that obviously could have implications for programs and for sentencing and, again, real societal impact. So I just thought it was fascinating the way you teased that out. And you also give other examples, I think around Covid super spreaders and road rage as well, where this inspection paradox plays a role in how we perceive the rate or the frequency of an event. I want to move to what is probably the most counterintuitive chapter of your book, something I believe is known as Preston's paradox, in which a population can remain stable or even grow despite women having fewer children than their mothers. Could you talk about how that's possible? Because I had to read that chapter twice. I did not understand how we can grow if each successive generation of women is having fewer children than their mothers.
Alan Downey
This is counterintuitive, and I agree, it took me a while to get my head around it. This comes from a demographer, Samuel Preston, who wrote a paper describing the same selection process that we were just talking about, applied to families. So, for example, if you survey women and ask how many children they've had, you will get an unbiased view of the fertility rate. But if you survey a child and ask about their mother, you will oversample large families. Because if someone has four children, there are four people out there who might appear in your random sample. If they have zero children, they have no chance of appearing in that sample. So it's the same pattern. You're oversampling large families. The example that I gave in the book is one that Preston talks about in his paper, which is that you could imagine a policy, and this is a little bit similar to China's one-child policy, but it's a little bit different. The idea is, what if you made a policy that says every woman must have one child fewer than her mother had? You would think that family sizes would just keep getting smaller and smaller until nobody has any children at all. And here's the counterintuitive part. Let's say there are three families and they have one child, two children, and three children. The only child, if they have one fewer than their mother, would have no children. If you grew up in a family of two, you would only have one child each, but there are two of you. And if you grew up in a family of three, you'll only have two children each, but there are three of you. So you actually have two forces pulling in opposite directions. One of them is the one-child-fewer rule, but going in the opposite direction is that big families have a lot of children. And so if big families replicate themselves, then that tends to grow over time. So you've got a shrinking force and a growing force, and it turns out that the growing force wins.
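Editor's note
The three-family arithmetic above can be checked with a short sketch. It assumes, for simplicity, that every child grows up to have exactly one child fewer than their mother had, and it follows only a single generation step. The function name is hypothetical.

```python
def next_generation(family_sizes):
    """Family sizes in the next generation under the 'one child fewer' rule.

    A family with `size` children produces `size` future parents,
    and each of them goes on to have size - 1 children.
    """
    following = []
    for size in family_sizes:
        following.extend([size - 1] * size)
    return following


families = [1, 2, 3]                       # children per family in this generation
children_now = sum(families)               # 6 children
next_families = next_generation(families)  # [0, 1, 1, 2, 2, 2]
children_next = sum(next_families)         # 8 children

print(children_now, next_families, children_next)
```

Even though every individual has one child fewer than their mother, the number of children rises from 6 to 8 in this step, because the largest family contributes the most parents.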
Gregory McNiff
Yeah, you do a really nice job, as you did right here, explaining that in the book. But again, it's just counterintuitive, and I think it goes to your overall thesis that understanding the data and having these frameworks or concepts to evaluate these types of numerical or data-driven situations is very important. I think China is learning that. You end the chapter by saying the pendulum has really swung the other way: they're trying to accelerate or increase the birth rate, but it's not having quite the effect they were hoping. So, interesting. Another dramatic real-world example. We talked earlier about the Gaussian curve, and as you mentioned, it's the bell curve that we're all familiar with, with height sort of being the obvious example. But you actually look at other human traits, particularly weight and speed, and you conclude that in fact the Gaussian curve or bell curve isn't the right framework per se, and, correct me if I'm wrong, it should be more of a lognormal approach. What does that mean, and what are the implications for how we look at certain qualities that require lognormal distributions?
Alan Downey
The lognormal distribution comes up a lot, and partly for the same reason that the Gaussian does. I mentioned earlier that when you add up a bunch of random factors, the sum tends to look like a bell curve. There are other scenarios where instead of adding things up, they tend to multiply together. And when you have a bunch of random factors and you multiply them, it turns out that the central limit theorem applies, but it applies on a logarithmic scale. And so what you get is a bell curve on a log scale. Maybe a simpler way to say that is, if you take something like adult weights and you try to fit a bell curve to them, it doesn't fit very well. If you take the logarithms of people's weights, it turns out that that does fit a bell curve. And if the logarithms follow a bell curve, then the weights themselves follow a lognormal distribution. And this is another one of those things where once you start looking for it, you start to see it everywhere. There are a lot of medical measurements, like when you do blood tests and you measure concentrations of different chemicals in your blood, that tend to follow lognormal distributions. You mentioned weight, which does. Running speed, it turns out, does. This is part of the reason that Usain Bolt is so much faster than you are. If running speed followed a bell curve, the fastest people just wouldn't be that fast. The tail of a bell curve just isn't that far out there. But with a lot of sports and a lot of achievements, almost anything that requires continued development and achievement over the course of your life will follow this lognormal curve that has a long tail that goes out toward high performance. And the consequence is that the best performers are not a little bit better than average. They are out there in their own world.
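Editor's note
A minimal sketch of the multiplicative version of the central limit theorem, assuming each outcome is the product of 20 independent random factors drawn uniformly between 0.5 and 1.5. The factor range, counts, and function names are illustrative assumptions. If the products are roughly lognormal, the raw values should be strongly right-skewed while their logarithms are close to symmetric.

```python
import math
import random
import statistics


def simulate_products(n_people=10_000, n_factors=20, seed=3):
    """Each simulated outcome is the product of many random multiplicative factors."""
    rng = random.Random(seed)
    return [math.prod(rng.uniform(0.5, 1.5) for _ in range(n_factors))
            for _ in range(n_people)]


def skewness(values):
    """Sample skewness: near 0 for a bell curve, large and positive for a long right tail."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])


products = simulate_products()
logs = [math.log(p) for p in products]

print("skewness of raw values:", round(skewness(products), 2))
print("skewness of logarithms:", round(skewness(logs), 2))
print("best / median:", round(max(products) / statistics.median(products), 1))
```

The last line gives a sense of the long tail: the top simulated value is many times the median, rather than a little above it.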
Gregory McNiff
You nailed that chapter very well, Alan. And what I like is you had two nice takeaways on the weight discussion, and maybe you could unpack this just a little more. I think you've partly answered it, but you said we're born Gaussian and we grow up lognormal. What exactly do you mean there with respect to weight?
Alan Downey
There's a literal sense where that's true, which is that birth weight, your weight at the moment that you're born, follows a bell curve, and adult weight follows a lognormal. And I speculate in the book that the process is something multiplicative, where over the course of your life your weight gain, during the years that you're growing and then later as an adult if you're continuing to gain weight, tends to be proportional. The amount that you gain each year, for example, is going to be proportional to your weight at the beginning of the year. And so instead of adding up a bunch of factors, you are multiplying these percentage increases. So those are multiplicative factors. And so I conjecture that that's where the shape comes from for weights. There are other things where I think there's a different mechanism going on, which I call the weakest link mechanism. I think running is a good example, which is that in order to be as fast as Usain Bolt, you have to have all of the different factors: the genetic things, the environmental things, the opportunity to pursue that as a sport. And if all of those things are in place, you can be a very fast runner. And if you are below average at any one of those things, you will not be a world-class runner. So it's a little bit like multiplying all these factors, because if any one of them is small, then the product is small. And so I think that's where the shape of that curve potentially comes from.
Gregory McNiff
No, that's funny, I literally had a similar insight, nowhere near as quantitative as you just articulated. A few years ago we ranked all our associates on competency, communication, and the ability to anticipate. We originally used a 1-to-10 ranking and added the numbers. And very similar to what you said about the GOATs, that they're outliers of outliers, it was a little frustrating because the difference between the associates ranked 1, 2, and 3 was much higher than the additive points would suggest. We actually came to the conclusion that we needed to multiply, such that if you had a 6, a 6, and a 6 versus a 7, a 7, and a 4, if I'm doing my numbers right, with 18 being the additive sum in both cases, the associate who scored better multiplicatively was much more valuable, we concluded. Because if you're dropping any one of those requirements, like you just said, if you're not communicating at all, or your work isn't competent, or you don't anticipate at all such that you have to be micromanaged, it takes away dramatically from the other areas that you are qualified in. Anyway, it's a little bit of a detour. But I found that chapter very interesting and just want to hit you with one final question there. On this notion of GOATs and outliers of outliers, you reference Gladwell's claim, and I think a lot of us are familiar with this, that it takes roughly 10,000 hours of practice to be an expert. How do you evaluate that within this discussion of the lognormal versus Gaussian approach to the GOATs, or to the outliers of the outliers?
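Editor's note
The arithmetic in the 6-6-6 versus 7-7-4 comparison can be written out directly. The function names are hypothetical. Both sets of ratings add up to 18, but multiplying them penalizes the one weak rating.

```python
import math


def additive_score(ratings):
    """Score by adding the ratings together."""
    return sum(ratings)


def multiplicative_score(ratings):
    """Score by multiplying the ratings, so one low rating drags down the whole."""
    return math.prod(ratings)


balanced = (6, 6, 6)
uneven = (7, 7, 4)

print(additive_score(balanced), additive_score(uneven))              # 18 18
print(multiplicative_score(balanced), multiplicative_score(uneven))  # 216 196
```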
Alan Downey
You know, it's an appealing idea that if you spend 10,000 hours at something, you can be a world-class violinist or a chess player, or you could be a world-competitive runner. The number that Gladwell cited comes from research on experts. And what they found is that of the people who are really at the top of their field, and violinists were one of the groups that they looked at, very few of them had practiced for fewer than 10,000 hours in their lives. And what that means is that nobody gets to that level on talent alone. You're not born with the ability to be a world-class musician. It takes 10,000 hours of practice. I think that's true. The problem is that what people took away from that is that 10,000 hours is sufficient. And what Gladwell was saying about the research is that it is necessary. Nobody gets to that level without 10,000 hours. But you also don't get to that level unless you have natural talent, unless you have the opportunity to pursue your natural talent, and probably several other attributes. Personality: you need persistence, you need focus. So even if you have natural talent, if you don't have the opportunity, you won't get there. If you have talent but not the tenacity, you won't get there. It's all of those things. So that's why I was using this multiplicative model, similar to your associates. If any of those things are deficient, you're probably not going to get to that level.
Gregory McNiff
Yeah, no, that was a great discussion. And it sort of segues, I think, nicely into my next question for you. When we are deciding what we're really good at and what direction we should go in, you know, career being the obvious example, you cite a framework called the significance persistence contingency framework. How can someone use that to decide what job they should take?
Alan Downey
Right. This comes from some of the people who are behind 80,000 Hours, which is a website and a book encouraging people to think about their working career, which is roughly 80,000 hours for most people. If you're going to spend that time, spend it on something that has positive impact in the world. And they suggest this framework of significance, persistence, and contingency as one way to think about this. I'll start backwards, from contingency. That is, if you are going to take on a job, you could think about what positive impact you can have and what would happen if you didn't do it. That's the contingent part. Because you could say, look, I want to spend my time on preschool education. And for me, the contingency for that is not very good. I'm probably not a good preschool teacher, and there are other people who are much better than me. So on contingency, it's actually better if somebody else does it than if I do. But let's say that I decide that preschool education is the most important thing in the world. The next thing I could think about is, what's the significance? So what's the positive impact? And you could think about all the downstream life outcomes that come from good preschool education. And also persistence. So let's say that we do it, we improve preschool education. How long will that effect last out into the future? And the suggestion is that all three of those factors are important. If any of them is low, the product is going to be low. So you really want to think about all three when you're making these career decisions.
Gregory McNiff
Yeah, no, that's a great approach for, again, some pretty big decisions. Another area you address in the book is what I call survival outcomes, or in layman's terms, just how long things or people last. And here again, you come to some interesting and, I would say, counterintuitive insights. On the one hand, you use the term NBUE to talk about light bulbs, which have a pretty much determinate life, where the new beats the old. And then on the other hand, with something like a certain kind of cancer that you cite, the longer you survive, the longer you could possibly survive. For that latter one, you use the term NWUE. So could you talk about those two approaches, explaining those two terms and how to think about survival or lasting outcomes?
Alan Downey
The nomenclature here is maybe not great. So NBUE is "new better than used in expectation." And the expectation means on average. And that's what most things are like. If you buy a car or a light bulb or anything mechanical that wears out over time, you would rather have a new one than an old one, because the remaining lifespan of the new thing is probably longer. There are a lot of circumstances, though, where that's not the case, where used is actually better than new, because once something has survived for a period of time, it has demonstrated its longevity. And maybe you have used up some of the longevity during that observation period, but you've also learned that, oh, this is probably a good case. This is probably going to survive longer. I learned this when I started riding a motorcycle, which is that when someone learns to ride a motorcycle for the first time, their life expectancy on the day they get on the bike is about as short as it's ever going to be, because they're still not very good at it. If they are still around five years later, their life expectancy is longer. And that's partly because we've learned two things during that time. That person has gained skills, so they're probably better. But we can also infer that they're probably not super reckless. So because they've survived for five years, our new estimate is longer. And that is often true for some cancers, but not all. When someone is diagnosed, their life expectancy at that point might be quite short because we don't know yet whether that cancer will respond to treatment, for example. After a year or two years, if someone has survived, that tells us something. We've learned that it was probably responsive, or maybe it was not a severe case, or maybe this person was in generally good health. And so all of those things, because they have survived for one year or two, mean that their life expectancy is now longer than it was on the day that they were diagnosed.
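Editor's note
One way to see how "used" can beat "new" is a mixture model. The sketch below assumes a hypothetical population split evenly between a high-risk group and a low-risk group, each with a constant yearly hazard rate. The shares, rates, and function name are made up for illustration. Surviving longer makes it more likely that a case belongs to the low-risk group, so the expected remaining lifetime goes up.

```python
import math

# Hypothetical mixture: (share of cases, constant yearly hazard rate) for each group.
GROUPS = [(0.5, 1.0), (0.5, 0.1)]


def expected_remaining_years(years_survived, groups=GROUPS):
    """Expected remaining lifetime, given survival for `years_survived` years."""
    # Relative weight of each group among those still alive at this point.
    weights = [share * math.exp(-rate * years_survived) for share, rate in groups]
    total = sum(weights)
    # With a constant hazard, a group's expected remaining lifetime is 1 / rate.
    return sum(w / total * (1 / rate) for w, (_, rate) in zip(weights, groups))


for years in (0, 1, 2, 5):
    print(years, round(expected_remaining_years(years), 2))
```

At time zero the expectation is 5.5 years in this made-up example, and it climbs toward the low-risk group's 10 years as the survival time grows.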
Gregory McNiff
Yeah. Again, another great chapter, and I think we're all familiar with that Stephen Jay Gould essay, "The Median Isn't the Message," where I believe he was diagnosed with a certain form of cancer that only had an 8-month average survival, and he went on to live another 20 years. So I felt like the takeaway there is to understand the data and what you've been diagnosed with, and maybe reach out to someone like yourself to understand exactly what the potential outcomes could be. Again, another interesting law, and candidly, I know you just reference it and say we don't fully understand it, is something called Gompertz's law. I just found that kind of cool. Could you talk briefly about what that law is and how it's exhibited in our data patterns?
Alan Downey
Yeah, it's an empirical law, which means that we see it in the data, but it's not easy to say why the curve should have this particular shape rather than something else. It comes up if you look at mortality rates as a function of age, which follow what's called a bathtub curve. If you look at newborns and, you know, babies a couple of months old, the mortality rate is still somewhat high. If you survive to be one or two years old, then over the next few years, when you are between, let's say, 5 and 15, your mortality rate is about as low as it ever will be. That's the very low mortality section. Then it starts to increase in young adulthood, and from about age 30 on up, it grows exponentially. And that's Gompertz's observation, which is that if you plot this bathtub curve on a log scale, that exponential growth is a straight line. And so from about age 30 on up, your risk of dying in any given year is growing exponentially.
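Editor's note
A small sketch of why exponential growth in mortality shows up as a straight line on a log scale. The base rate and growth rate below are hypothetical placeholders, not figures from the book. The check is that the logarithm of the rate rises by the same amount for every additional decade of age.

```python
import math

# Hypothetical parameters, chosen only for illustration.
BASE_RATE_AT_30 = 0.001  # yearly mortality rate at age 30
GROWTH_PER_YEAR = 0.085  # exponential growth rate per year of age


def mortality_rate(age):
    """Gompertz-style rate: exponential growth in age from 30 onward."""
    return BASE_RATE_AT_30 * math.exp(GROWTH_PER_YEAR * (age - 30))


ages = list(range(30, 91, 10))
log_rates = [math.log(mortality_rate(age)) for age in ages]

# On a log scale, each decade adds the same amount: a straight line.
steps = [later - earlier for earlier, later in zip(log_rates, log_rates[1:])]
doubling_time = math.log(2) / GROWTH_PER_YEAR

print([round(step, 3) for step in steps])
print("years for the rate to double:", round(doubling_time, 1))
```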
Gregory McNiff
Yeah, and I think you even talk about how it's not specific to just humans; there have been other areas, either biological or man-made systems, where we've seen that law exhibited as well. Okay, now I really want to move to a big part of the book. It's this idea of medical studies and appropriately interpreting medical data and medical surveys, and front and center is this paradox called Berkson's paradox. Could you talk about that and why it's so key to keep it in mind when interpreting any type of survey outcome, group study, or white paper?
Alan Downey
Berkson's paradox comes up almost anytime that you select a sample because its members have a certain attribute that you are interested in. It came up at first in a hospital. If you observe patients when they arrive at a hospital and you study the different diseases that they might have, there's a pattern that comes up over and over, which is that almost any two diseases will appear to be anticorrelated. If you have this, then you are less likely to have that. And it's misleading because that only appears in your hospital sample. It's not true in the general population. Berkson is the person who observed this. He was looking at two particular diseases and again saw this anticorrelated pattern. And your first thought is that, well, maybe having this disease is good because it prevents that other disease. So it's drawing the arrows of causation and trying to figure out, is it because one disease prevents the other, or is there something else going on that is driving both of those things? And in this case, it's the way you selected the data. You selected people who came to the hospital because they had a disease. So there's a reason that they got there. And let's pretend that there are only two diseases in the world. If it's not one of them, it's more likely to be the other one. And if you didn't have either of them, you wouldn't be in the hospital in the first place. The fact that you're in the hospital is already a selection process, and it can be very misleading.
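Editor's note
A sketch of the two-disease thought experiment, assuming two independent diseases that each affect 10% of a simulated population, and a hospital that sees everyone with at least one of them. The prevalence values, population size, and function names are hypothetical. In the full population the diseases are unrelated, but inside the hospital sample, not having one guarantees having the other.

```python
import random


def simulate_population(n=100_000, p_a=0.1, p_b=0.1, seed=4):
    """Each person is a pair (has_a, has_b); the two diseases are independent."""
    rng = random.Random(seed)
    return [(rng.random() < p_a, rng.random() < p_b) for _ in range(n)]


def rate_of_b(people, has_a):
    """Share of people with disease B among those with (or without) disease A."""
    group = [b for a, b in people if a == has_a]
    return sum(group) / len(group)


population = simulate_population()
# Selection step: only people with at least one disease show up at the hospital.
hospital = [(a, b) for a, b in population if a or b]

print("population, B given A:    ", round(rate_of_b(population, True), 3))
print("population, B given not A:", round(rate_of_b(population, False), 3))
print("hospital,   B given A:    ", round(rate_of_b(hospital, True), 3))
print("hospital,   B given not A:", round(rate_of_b(hospital, False), 3))
```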
Gregory McNiff
In fact, you open the book by talking about the correlation between low birth weight and mothers who smoke, and a specific paper that I think many concluded actually set back our understanding of the harmful effects of smoking while pregnant. Talk about a real-world example. Could you briefly talk about how it took us, I think, even decades to understand that this wasn't a causal relationship, and that, again, you really had to dig into the data to understand why babies born to mothers who smoke, please correct me if I'm wrong, showed fewer birth defects than babies born to mothers who didn't smoke, which is very counterintuitive?
Alan Downey
I started the book exactly for the reason that you said: this is one of the cases where we got it wrong, and it literally took decades to figure out what went wrong. It came from a paper in the 1970s. It was a researcher at Berkeley who was studying low birth weight babies and the effect of maternal smoking. And he found at first all of the usual patterns that he expected to find, which is that if the mother smokes during pregnancy, the baby is more likely to be born underweight. For low birth weight, I think the threshold is about 2,000 grams. So maternal smoking seems to cause low birth weight, and low birth weight is associated with higher mortality. So all of that was as expected. But here's the part where things went wrong. If you select low birth weight babies and you ask whether the mother smoked or not, it turned out that the low birth weight babies of smokers had a better chance of survival, a lower mortality rate, than if the mother did not smoke. And the conclusion that he came to is that maybe for a low birth weight baby, maternal smoking is actually good, it's actually preventative. And he said, well, it's not true in general; for babies who are not low birth weight, maternal smoking is bad, but maybe for low birth weight babies, maternal smoking is good. And he sent a letter to the US Senate. I think he corresponded with lawmakers in the UK as well. His article got a lot of coverage in the press, where the headline would say, hey, maybe maternal smoking is not so bad. In a retrospective in 2014, several authors who looked back at this history said that that single paper and the publicity that came from it probably set back by a decade our efforts to reduce maternal smoking. And the whole thing is completely wrong. It's just a straight-up statistical error.
Gregory McNiff
Yeah, I mean, that was really amazing reading, and how you unpack that. Again, just like you said, we just read the data wrong. And the implications there, I'm sure, candidly, were pretty serious. I mean, as you note, someone actually said it set back our whole understanding of the impact of smoking by a decade. I do want to move to the next section, which I actually did find the most alarming. And this is long-tail distributions, particularly as they relate to the probability of disasters. Why do we have trouble modeling these long-tail distributions? And I have a follow-up for you about your model versus the U.S. Geological Survey's.
Alan Downey
So long-tail distributions come up in a lot of domains. In particular, there are many things in the natural world where the most common case is something very small. Occasionally something comes along that is, let's say, 10 times bigger. And then sometimes you get 100 times bigger, sometimes you get a thousand times bigger, and you see this pattern where every time you go out by another factor of 10, the probability goes down by a factor of 10. So you have these very rare events, but they're very, very big. And I think earthquakes are probably a good example because people are familiar with the Richter scale. There are very small earthquakes: a 2 or a 3 on that scale happens all the time; they're barely perceptible. In California, they happen literally every day. Fours and fives, those are earthquakes that you can feel; they are less common. And then things from seven on up are major earthquakes. They tend to cause a lot of damage, and fortunately they're quite rare. But in places that are earthquake active, a seven or higher is possibly one time in a decade, maybe one time in 50 years. And then on up there are eights. And I think the biggest ever might have been a nine. I'm not positive about that, but that's kind of what that scale is like. So now the challenge is, can we predict how often those very large, very rare events will happen? And the answer is sometimes we can and sometimes we can't. The fundamental problem is that they are so rare that we just don't have the data. If something happens once every thousand years, you know, it probably has not happened during the time that we have had seismometers and other records of the data. So we're always extrapolating from the data that we have out into very rare, very large events. In the case of earthquakes, we have pretty good models. There are other examples that I mention in the book, and I think asteroids is one of them, where we really don't. And with solar flares, we really don't have enough data to make a reliable prediction about the possibility of a very large solar flare, for example.
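The "ten times bigger, ten times rarer" pattern can be sketched with a Pareto-type tail. This is an illustration only, not the model used in the book or by the USGS; the function name, the exponent `alpha=1`, and the 100 years of records are assumptions chosen to match the verbal description.

```python
def tail_probability(size, smallest=1.0, alpha=1.0):
    """P(X > size) for a Pareto-type tail; alpha=1 means 10x bigger is 10x rarer."""
    if size <= smallest:
        return 1.0
    return (smallest / size) ** alpha


for size in (10, 100, 1000):
    print(f"P(event more than {size}x the smallest size) = {tail_probability(size):.4f}")

# Why the data is scarce: chance of seeing zero once-per-1000-years events
# in a hypothetical 100 years of instrument records, treating years as independent.
annual_rate = 1 / 1000
years_of_records = 100
p_no_event_observed = (1 - annual_rate) ** years_of_records
print(f"P(no such event in {years_of_records} years) = {p_no_event_observed:.3f}")
```

Under these assumptions there is roughly a 90% chance that a once-in-a-thousand-years event never shows up in a century of records, which is why estimates for the far tail rest on extrapolation.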
Gregory McNiff
Now I want to ask you a follow-up, because in your earthquake analysis, you put forward the log-t model to compare with the U.S. Geological Survey's model. And as you note, for earthquakes greater than 6 on the Richter scale, your model is not only more accurate in predicting, but it actually estimates the probability of these events to be much higher than the U.S. Geological Survey does. And as you note in the book, their model was developed by leading scientific experts from the fields of seismology, geology, earthquake physics, and earthquake engineering. And you're, as you describe yourself, a lone statistics or data science professional. Should we be concerned that, A, you have a more accurate model and, B, it puts the probability of these tail events, Richter magnitude 6 or above, much higher than their model does?
Alan Downey
No, I'm a crackpot. So we should not take that part of the chapter too seriously. Partly because I did just one example where I took one data set and I fit a curve to it, and the USGS has a different model that they use when they make these predictions, and they actually know what they're doing. And I am just applying exploratory data methods. And in this particular case, as you said, my model is a little bit more accurate than theirs, and my model predicts that the probability of large earthquakes is a little bit higher than what their model says. But honestly, if you need to make decisions and you're trying to decide whether you should listen to me or the USGS, you should listen to them. However, I would suggest that the model that I used in the book is worth exploring a little bit more. I would love it if someone who is in this area takes it up and starts to look at more data sets, not just the one that I looked at, and finds out: is this a generalizable result? Is it generally true that the model that I used fits the data better than the conventional models that are used in that field? And if so, maybe they should start thinking about it. You know, I'm not imagining that they're going to suddenly replace everything that they've done with this technique that I used in one chapter of one book. But I would be delighted if someone looked at this a little bit and saw whether there's any merit to it at all. And, you know, the answer might be no. In fact, I think it's very likely that the answer is that they actually know what they're doing and they don't need any help from me.
Gregory McNiff
That is interesting, because as you point out in the book, you do use the data before 2015, but your model actually predicts, again, these sort of tail events much more accurately than theirs. But I will defer to you on your own model. Hopefully somebody will at least reach out to you. Moving on. Again, I know the medical field seems to be so ripe for demystifying any type of test. And there I think we've all heard the terms sensitivity and specificity. Could you describe them and talk about what the base rate fallacy teaches us?
Alan Downey
This is tricky because this comes from almost any kind of medical test where there are just two outcomes. Usually, you know, a positive test means that you have the condition that they were testing for, and when you get a positive test, there are these two numbers that you really should know. The sensitivity is the probability that you will correctly get a positive test if you do in fact have the condition. And specificity is the probability that you will correctly get a negative test if you do not have the condition. And that second one, specificity, is really important because specificity determines how often that test will produce a false positive. And that's a worry anytime you take a medical test. If you get a false positive, first of all, you're going to be worried. You're going to have to do maybe more testing, maybe medical interventions. It tends to be expensive, it tends to have side effects. So false positives can be very costly, and it is very hard to get specificity to be as high as it needs to be, particularly in the case of anything that's like a screening test. And here's where I think probably the most important thing for people to know is the difference between a diagnostic test and a screening test. So diagnostic means you have presented to a doctor, you have a symptom, and they are trying to figure out what you have. And based on the symptom, they think, oh, there's a pretty good chance that you have, let's take COVID as our example, because this is the example in the book. You have a fever, you have a cough, you have all the symptoms of COVID. So you take a COVID test. Chances are, if that test is positive, that it is correct; it is a true positive. But the flip side of that is if you have no symptoms at all and you haven't been exposed and you're in an area that has a low rate of COVID infection, the probability that you have COVID could be one in a thousand or less.
So now if you take a test, that's like a screening test, and we were doing a lot of this during COVID. We were testing a lot of people who were completely asymptomatic. And that is a somewhat risky thing to do because the chances of false positives become much higher. So let's think about that example. Let's suppose that the specificity of the test is 99.9%, and it actually wasn't quite that good, but let's be generous. What that means is that there's a one in a thousand chance of getting a false positive. If you're negative, you'll get a positive test one time out of 1,000. But the flip side is the chance that you actually have the condition, in this case, completely asymptomatic, haven't been exposed. Let's say that that's one in a thousand. So there's only a one in a thousand chance of getting a true positive. So now, out of a thousand people, there will be about one true positive and about one false positive. So now if you get the positive test, what that means is that there's a 50/50 chance that it is true or false. And that 50/50 is a much lower probability than what people imagine.
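The arithmetic in this example is Bayes's rule, and it can be checked with a short sketch. The 99.9% specificity and the one-in-a-thousand prevalence come from the conversation; the perfect sensitivity and the 50% prior for a symptomatic patient are assumptions added here for illustration, and the function name is hypothetical.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a positive test is a true positive, by Bayes's rule."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


# Screening: asymptomatic, unexposed person; prevalence of 1 in 1,000.
# Sensitivity of 1.0 is an assumption; the conversation does not give a value.
screening = positive_predictive_value(prevalence=0.001, sensitivity=1.0, specificity=0.999)

# Diagnostic: symptomatic patient; the 50% prior is a hypothetical value.
diagnostic = positive_predictive_value(prevalence=0.5, sensitivity=1.0, specificity=0.999)

print(f"screening:  P(infected | positive) = {screening:.3f}")
print(f"diagnostic: P(infected | positive) = {diagnostic:.3f}")
```

With the same test, the screening case lands at about 50% while the diagnostic case is close to certain. Only the base rate changed.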
Gregory McNiff
Yeah, that goes back to, I mean, sort of the whole theme of your book here: understand the data you're being presented with and how it was analyzed. You talked about COVID. Any discussion of COVID invites conversation around vaccines. I thought this was another important area that you delved into, particularly as it relates to Simpson's paradox. Could you talk a little bit about that in terms of how we determine the efficacy of vaccines?
Alan Downey
This was a difficult example, and I'm still not sure how I feel about it. So I'll outline it and then we can talk about it. There was a journalist who went on Joe Rogan's show, and he showed statistics that came from the United Kingdom at the time. And what he showed is that of all the people who had died of COVID during a particular period in time, 70% of them had been vaccinated and only 30% of them had not been vaccinated. And his conclusion was that the vaccine was actually causing harm, that the 70 to 30 ratio there means that the vaccine was increasing the probability of mortality. And unfortunately, that's the idea that got out there. Now, within 24 hours, there were three epidemiologists who published blog posts, and they all explained why this was wrong. And they did a really nice job of explaining it. It is a version of Simpson's paradox. The issue was that at that time in the UK, they were vaccinating old people. So most of the people who were vaccinated were old, and being old was a much higher risk factor for mortality than being vaccinated. And it turns out you can take the same numbers that this guy reported, and if you analyze them correctly, you will see that the vaccine actually reduced mortality. I estimated that at that time, it was probably saving 7,000 lives per month, just in the UK, which is not that big of a country. And the line that I ended that chapter on is, if you ever have the opportunity to save 7,000 lives in a month, you should probably take that opportunity. Now, unfortunately, when you read the stats wrong and you go on a popular podcast and get the stats wrong, you're misleading people. They're just not understanding the world. And when we misunderstand the world, we do not make good decisions.
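A small sketch of how this reversal can arise. The counts below are invented for illustration and are not the UK figures or the numbers from the book; they only assume that most old people are vaccinated, most young people are not, and that the vaccine lowers the death rate within each age group.

```python
# Hypothetical counts, invented for illustration: (age group, vaccinated) -> (people, deaths)
counts = {
    ("old", True): (900_000, 900),
    ("old", False): (100_000, 380),
    ("young", True): (200_000, 2),
    ("young", False): (800_000, 40),
}


def death_rate(vaccinated, group=None):
    """Deaths per person for a vaccination status, optionally within one age group."""
    rows = [v for (g, vax), v in counts.items()
            if vax == vaccinated and (group is None or g == group)]
    people = sum(p for p, _ in rows)
    deaths = sum(d for _, d in rows)
    return deaths / people


vaccinated_deaths = sum(d for (_, vax), (_, d) in counts.items() if vax)
total_deaths = sum(d for _, d in counts.values())
share_vaccinated = vaccinated_deaths / total_deaths

print(f"share of deaths among the vaccinated: {share_vaccinated:.0%}")
print(f"all ages: vaccinated {death_rate(True):.5f}, unvaccinated {death_rate(False):.5f}")
for group in ("old", "young"):
    print(f"{group}: vaccinated {death_rate(True, group):.5f}, "
          f"unvaccinated {death_rate(False, group):.5f}")
```

In this made-up population, most deaths are among the vaccinated and the pooled death rate is higher for the vaccinated, yet within each age group the vaccinated die at a lower rate. Age is driving the pooled comparison.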
Gregory McNiff
Yeah, you nailed it. I mean, I think it's, again, another area where a really important decision is being based on interpreting data, and getting it right is key. So, wonderful example there. You end the book with a concept called Overton's window, which is the range of socially acceptable ideas. Could you talk about how that shifts over time? And then I have a follow-up question for you on just how we interpret that shift.
Alan Downey
This came when I was looking at examples of Simpson's paradox. And one of the ones that comes up all the time is how attitudes and beliefs around social questions change over time. And one of the examples is just thinking in terms of the conservative versus liberal worldview on social issues and economic issues. I looked mostly at social issues because I was working with data from the General Social Survey, and I identified 15 questions where conservatives and liberals tend to give different answers. This gave me the ability to estimate how conservative each of the survey respondents is and to see how that changes over time. And what I found was a confusing pattern that I called Overton's paradox, because as you said, Overton's window is the set of ideas that are politically in the mainstream at any point in time. And when you get outside of the Overton window, you get to ideas that are considered fringe. And if you go even farther out, there are ideas that are just considered unacceptable. And that window shifts over time. And just to give an example, in the General Social Survey, when they started in 1974, one of the questions that they asked people was whether they thought that interracial marriage was acceptable. And in 1974, there was a substantial number of people, I don't think it was a majority, who were opposed to interracial marriage. And that number is now down in the small single digits. The vast majority of people think there's absolutely nothing wrong with interracial marriage. So opposition to interracial marriage is an example of something that was mainstream in 1974 that is somewhere between fringe and unacceptable now. So here's what happens: if you ask people whether they consider themselves to be conservative, older people are more likely to say yes. And so it's tempting to think that people get more conservative as they get older, but they actually don't.
When you look at these 15 questions, you can track what happens to a group over time. So let's say you follow people who were born in the 1940s, and you follow them from when they entered the survey as young adults, and now I guess they're in their 80s. So you can track them over time. And what you find is that it's not a big change for that particular group. For people born in the 40s, it went up a little bit and then down a little bit. And they are a little bit more liberal now than they were when they were surveyed in the 70s. But they say that they're more conservative. And the reason is because the Overton window moves. So somebody who was a little bit left of center in 1970 is going to find themselves substantially right of center now, even if their views have not changed at all. And I use this time machine example. If you go to 1970 with your time machine and pick the average liberal and you bring them to the year 2000, that same person will be just about middle of the road. And if you bring them to 2020, they will be significantly right of center.
Gregory McNiff
And Alan, is the implication there that society as a whole is moving, I guess I don't want to say left, but becoming increasingly liberal, shifting to a liberal stance over time? Does that explain some of that paradox?
Alan Downey
Yes, that is mostly right, and there are two things happening there. One is that people change their minds, and the other is generational replacement, which is, you know, over time old people tend to die off and younger people enter the survey as young adults, and that young person who replaces an old person is very likely to hold more liberal views. So that moves the average over time. And I'll mention again, I was mostly looking at social issues; with economic issues, things are a little bit different. But one example is the attitude toward homosexuality. In the 1970s, one of the questions was, do you think that same-sex sexual relationships are wrong or not? And in the 1970s, something like 75% of the population said that it was wrong. Now, in the most recent round of the survey, from 2024, I think it is in the 20s, 20-something percent that say it's wrong. So roughly 75% have no problem with homosexuality. So that has almost completely flipped, from 75% wrong to 75% no problem. As social change goes, that's about as fast as things ever go. But that's an example where somebody in 1970 who was opposed to homosexuality was very much in the mainstream. They were the majority, and now they would find themselves right of center, and they are in the minority even if they never changed their minds.
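Generational replacement on its own can be sketched with a toy model. Everything here is hypothetical: a made-up "conservatism score" that is fixed for life and drops by half a point with each later birth decade, equal-sized cohorts, and a survey that covers people aged 20 to 80. It leaves out the other effect, people changing their minds, and is not the analysis from the book.

```python
def cohort_score(birth_decade):
    """Hypothetical conservatism score, fixed for life: each later decade is 0.5 lower."""
    return 10 - (birth_decade - 1900) / 20


def survey_mean(year):
    """Mean score of equal-sized cohorts aged 20 to 80 in the given survey year."""
    decades = [d for d in range(1880, 2010, 10) if 20 <= year - d <= 80]
    return sum(cohort_score(d) for d in decades) / len(decades)


person = cohort_score(1940)  # someone born in the 1940s whose views never change

for year in (1970, 2020):
    mean = survey_mean(year)
    print(f"{year}: population mean = {mean:.2f}, "
          f"person relative to mean = {person - mean:+.2f}")
```

In this toy setup the person's score never moves, yet they go from below the population mean in 1970 to well above it in 2020, purely because older cohorts left the survey and younger ones entered.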
Gregory McNiff
And you in your book attribute this shift to both what you call the period effect and the cohort effect, the period effect being effectively that people are changing their minds, and the cohort effect being this generational replacement. And I found it interesting that you actually suggest the cohort effect carries five times more weight than the period effect. So I interpret that as, you know, you can't fight demographics at some point. Maybe that's the wrong conclusion. But time marches on, and with it what we deem socially acceptable changes. And that carries far more weight than just trying to change individual minds. But I want to give you a chance to respond to that.
Alan Downey
That is often true. But none of these things are like physical laws; they can always be different. There are a few examples. So what you said is right: generational replacement is a powerful force, and it's pretty hard for things to go in the opposite direction. But you can speed it up or slow it down if minds are changing in either direction. The example that I gave of attitudes toward homosexuality, that's a case where both of them were going the same direction, so they added up. But there are exceptions, and there are also sometimes reversals. We're seeing some of them now. Young people historically have generally been more liberal than older generations, and that certainly has been true in the General Social Survey. Lately, we're seeing some unusual patterns. We're seeing young people adopting conservative attitudes on certain questions, certainly not all. And that's a reminder that social science does not follow physical laws, and things change.
Gregory McNiff
Yeah, no, it's a fascinating discussion. Throughout this interview we've talked about examples where interpreting the data properly is key: legal settings, recidivism, any type of medical diagnosis, a number of examples you've given. How do you suggest the public adopt a more data-based approach to these types of decisions? I mean, if you could implement one or two policy decisions, or even just create a course at a university, what would you do to better equip the public to interpret data?
Alan Downey
I think this can happen in a lot of places. You mentioned universities. I think they could do more when they teach statistics. There is a very common university-level course that is mostly an introduction to mathematical statistics and hypothesis testing. I think that framework could be replaced with something that looks more like data literacy at the college level. I think at the K-12 level this is already happening. Several states are adopting requirements in statistics. I would love to see more of that. I think there's a lot of the math curriculum, especially calculus, that is not the most important thing for people to learn. And replacing especially calculus with data literacy, I think, would go a long way. The other really positive force that I see is data journalism. If you look at the kinds of analysis that journalists are doing now, compared to a decade ago or, you know, longer ago, it's totally different. They're not just reporting what researchers do; they are conducting their own surveys, they're doing their own analysis. They're publishing data visualizations, especially interactive visualizations online, that are presenting data to the public in a way that's unlike what we've seen before. And I think in a sneaky way, they are actually making people more data literate over time. Just by consuming any kind of data journalism, you're learning to interpret graphs, you're learning to read and understand data better. So that's my hope. I think that's a positive force.
Gregory McNiff
Yeah, no, that's a great answer. You're absolutely right. It does feel like journalism is trying to slowly educate us, at least recently relative to past decades, on data consumption and analysis. But great answer and a great way to end the interview. Again, the book is Probably Overthinking It: How to Use Data to Answer Questions, Avoid Statistical Traps, and Make Better Decisions by Alan Downey. Alan, thank you so much for your time and thanks for writing such a thought-provoking and, candidly, needed book.
Alan Downey
Thank you. It's been a pleasure talking with you.
Podcast: New Books Network
Host: Gregory McNiff
Guest: Allen B. Downey
Book Discussed: Probably Overthinking It: How to Use Data to Answer Questions, Avoid Statistical Traps, and Make Better Decisions (University of Chicago Press, 2023)
Date: October 10, 2025
This episode features a deep dive into Allen B. Downey's Probably Overthinking It, a book that explores how data and statistics shape our understanding of the world. Downey and host Gregory McNiff discuss practical frameworks for interpreting data, common statistical pitfalls, and real-world implications for fields ranging from medicine to criminal justice. The conversation is wide-ranging, filled with vivid examples, and ultimately serves as a guide for making better decisions through data literacy.
On the friendship paradox:
“If you choose one of your friends at random, chances are that person is more popular than you are... it's called a paradox because it's counterintuitive.” – Allen Downey ([08:04])
On modeling disasters:
“The probability of large earthquakes is a little bit higher than what their model says... But honestly, if you need to make decisions, you should listen to [the USGS].” – Allen Downey ([42:10])
On base rate fallacy in medical testing:
“If you get the positive test, what that means is that there's a 50/50 chance that it is true or false. And that is much lower than what people imagine.” – Allen Downey ([46:41])
On Simpson’s paradox and vaccines:
“Most of the people who were vaccinated were old, and being old was a much higher risk factor for mortality than being vaccinated... [analyzing correctly] the vaccine actually reduced mortality.” – Allen Downey ([50:37])
On Overton's window and social change:
“Somebody who was a little bit left of center in 1970 is going to find themselves substantially right of center now, even if their views have not changed at all.” – Allen Downey ([53:19])
Downey emphasizes the need for greater data literacy across society. He advocates for educational reforms, including replacing parts of the math curriculum, especially calculus, with data literacy, and praises the rise of data journalism for helping the public learn to read and interpret data.
"There’s a lot of the math curriculum, especially calculus, that is not the most important thing for people to learn. And replacing especially calculus with data literacy, I think would go a long way." – Allen Downey ([61:10])
The episode is a rich resource for understanding why and how to read between the numbers—and why doing so is critical for better personal, professional, and policy decisions.