
A
Welcome to the Analytics Power Hour. Analytics topics covered conversationally and sometimes with explicit language. Hey, everybody, welcome.
B
It's the Analytics Power Hour. This is episode 285. You know, for some reason, I have always been Bayesian when it comes to statistics. I didn't arrive there on purpose. It was just sort of intuitive. I also wasn't paying super close attention to my priors along the way. So who knows really how I ended up this way. And on this episode, we're going to do something incredibly brave and slightly reckless. We're going to talk about Bayesian statistics without scaring anyone who doesn't have a PhD. And if you do have a PhD, keep a stress ball handy. It might get turbulent. But before we get into it, let me introduce my co-hosts. Mo Kiss, how are you?
C
I'm good, thanks for asking.
B
Nice. I'm hoping you'll be my buddy on this episode. And Tim Wilson. How you doing?
D
I'm going great.
B
I've never come right out and asked you, Tim, but are you or have you ever been a frequentist?
D
Beats the hell out of me. I live my life for P values that are less than 0.05.
B
So that sounds like something a frequentist would say. Anyways, I don't know.
A
Okay.
B
I'm Michael Helbling. Well, we needed a guest, someone whose work we've been appreciating for years now in the area of media mix modeling. Michael Kaminsky is the co-CEO of Recast. He's also one of the organizers of the Locally Optimistic Slack community, and he was previously a guest on episode 232, and now he is back once again. Welcome back to the show, Michael.
A
So glad to be back. Thanks for having me. Really excited for this one.
B
So you heard Tim say it. He is a huge fan of P values and hacking them to get the answers he wants. Why? Is he wrong? No, I'm just kidding. We're going to talk about Bayes. So maybe just start in on sort of a little bit of an explainer into Bayesian statistics.
A
Yeah, so this is a great question. There are a lot of different directions that we can take the conversation. I'll try to do a little bit of an overview of how I think about Bayesian statistics and how this fits into the sort of universe of different types of analytical strategies that people might take. So maybe we'll start with a little bit of history. People, I think, today hear Bayesian statistics, and if you came up through college and maybe graduate degrees like I did, Bayesian statistics sounds like a new thing. You maybe didn't learn about this in university. When I took statistics classes and econometrics classes in university, we never talked about Bayes, never really talked about Bayes' theorem. And so when I first started hearing about it, I was like, oh, this is some new thing that people invented. But that's not really true. If we think back about the history of probability and statistics, Bayesian statistics is sort of the original type of statistics. It's a very simple mathematical approach to thinking about how do we calculate probabilities, how do we estimate the probability of something happening in the future. And really, it's the frequentist approach, largely spearheaded by this guy, R.A. Fisher, around the turn of the 20th century, that was sort of the new statistics at the time. And because this frequentist approach was very convenient for a lot of important questions that were relevant at the time, it really gained a whole ton of popularity. But Bayesian statistics has been around for a very, very long time. It's the original type of statistics, if we want to call it that, but it has been resurging in popularity in recent years for some very specific technical reasons, which we might get into later. So what is the idea behind Bayesian statistics? The way that I like to think about it is that Bayesians tend to think in terms of simulations.
What we want to do in general, as Bayesians, is build a model that describes the world. And here, model, you can generally think about it as simulation. I want to build a simulation that I think describes the world. What are the rules of the natural universe of the thing that I am trying to model? And then I want to compare the implications of that simulated world with actual data that we observe, and then try to learn about some parameters of interest from comparing the simulated world that I code up, generally in some programming language, with the data. Again, a lot of people, when they're thinking about statistics, are thinking about an A/B test, they're thinking about regressions. But there's all kinds of statistical models that we can imagine that are way far outside of that. If we think back to the COVID pandemic, a lot of people did really interesting work trying to model the spread of disease. You might have some biological model where you're trying to think about, okay, there's some coefficient of spread and there's some amount of people being treated. And then what we want to do is we want to see, well, how well does this model fit the data? What would the coefficient of spread have to be in order to explain the pattern of data that we see? That is a very natural Bayesian approach to statistics. We're going to start with some model informed by science, informed by our understanding of the world, and we're going to compare the implications of that model with data that we actually observe and then use that to infer things about some parameter of interest. I'm going to pause there. Hopefully that was a reasonable summary.
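As a rough illustration of the simulate-and-compare idea described in this turn, here is a minimal Python sketch. It is not from the episode: the daily case counts, the toy growth model, the flat prior range of 0.5 to 2.0, the tolerance, and the function names `simulate_cases` and `infer_spread` are all assumptions made up for this example. It draws candidate values for a "coefficient of spread", simulates a world for each one, and keeps the values whose simulated data land close to the observed data.

```python
import random


def simulate_cases(spread, days, initial, rng):
    """Toy world: each day's cases are roughly the previous day's times `spread`."""
    cases = [float(initial)]
    for _ in range(days - 1):
        expected = cases[-1] * spread
        # Noise keeps the simulated world from being perfectly deterministic.
        noisy = rng.gauss(expected, 0.05 * expected + 0.5)
        cases.append(max(0.0, noisy))
    return cases


def infer_spread(observed, n_draws=20_000, tolerance=0.10, seed=0):
    """Keep the spread values whose simulated worlds resemble the observed data."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_draws):
        # Assumed prior: every value in this range is equally plausible.
        spread = rng.uniform(0.5, 2.0)
        simulated = simulate_cases(spread, len(observed), observed[0], rng)
        # Mean relative gap between the simulated and observed series.
        gap = sum(abs(s - o) / o for s, o in zip(simulated, observed)) / len(observed)
        if gap < tolerance:
            kept.append(spread)
    return kept


observed_cases = [10, 13, 16, 21, 27, 35]  # made-up daily counts
plausible = infer_spread(observed_cases)
print(f"kept {len(plausible)} of 20000 simulated worlds")
print(f"plausible spread is around {sum(plausible) / len(plausible):.2f}")
```

The kept values form a rough picture of which spread coefficients are consistent with both the assumed model and the data. Real epidemiological models are far richer than this, and practical Bayesian software uses much more efficient sampling than plain accept-or-reject.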
C
Can I just jump in there? Because I am going to talk multiple times probably about the wonderful Cassie Kozyrkov, because I feel like she has this way of explaining things to me like I'm a fifth grader, which is perfect. And one of the things she said, in an effort to explain it, was, it's just a best guess. And like, how does that sit with you? She's like, when it comes to Bayesian stats, it's about us making a best guess. So it's like, if that is your best possible guess, why would you use any other guess other than your best guess? Even if, like, there's the possibility it could be wrong, it's still the best guess you've got. And there was something about that that kind of sat with me well, so I think.
A
I think I probably get where Cassie is coming from. I don't think that I would explain it exactly that way, because I think that undersells Bayesian statistics a little bit, because it's like, oh, it's guesswork. And I think a lot of people have this association of Bayesian statistics specifically with the idea of priors. And so they're like, oh, you're just guessing and you're just putting your own beliefs out there and you're not really doing any analysis. And so I would shy away from the idea of a best guess. What I would sort of describe is, what we want to do as scientists, I'll just say, as scientists in general, is we want to combine all of the information that we have about some phenomenon out in the world, combine that information in the most rigorous way possible, and then call that our best guess. Again, there's a scientific process here that I think is really important. Again, I'm talking about scientists, and this includes biologists. This includes people who are estimating populations of fish in the ocean. There's all kinds of different people who might take this approach. What we're often doing in the process of doing science is trying to collect evidence from out in the world and then combine it into some theory that can explain how the world really works. And I think the Bayesian process of building a theoretical model and then comparing that with all of the different evidence that we have and bringing that all together to have some best guess or some understanding of the world, that is a good description of the Bayesian approach to science, really. So I think there is a part where it's a best guess, but it's not just a best guess, right? It's a strategy, and a really principled approach to combining information that yields the best guess based on all of the information that we have.
D
So on the contrast between frequentist and Bayesian, I feel like you started to hit a couple of things, and I will throw in a plug, because my first run at trying to understand Bayesian was reading a book called The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy. And what you said was kind of the one main thing I took away from it: this is not new. It's named after Bayes' theorem, from Thomas Bayes, from however long ago. But you started to say that what Fisher came up with, which is kind of, I guess, the frequentist world, is computationally simpler. But for frequentist statistics to work, the shortcuts or the less computationally intensive aspects of that approach do rely on some underlying assumptions being true. In the past, when you go that route, you say, well, I don't have a strong enough reason to think they're not true, so I'm going to accept them as true, and then kind of calculate. I mean, I think it is worth talking about what the frequentist world misses. And part of what you were leaning towards was something technological or computational: one is more computationally intensive than the other. Yeah, I know it didn't end with a question, but I'm kind of confused.
A
Let me try to restate some of this and then I'll expand on it a little bit. So R.A. Fisher developed a lot of these really powerful statistical methods to solve some of the problems that were floating around at the time that he was working, again, in the early 1900s. And a lot of the statistical problems that people were facing were in the field of agriculture. And so it was either like, hey, we have this field full of, I don't know, watermelons, and we treated each of these different plots differently. And we want to be able to take a sample of these watermelons and then try to determine, of plot A versus plot B versus plot C, which one is yielding the bigger melons. That is sort of the type of analysis they were doing. Or classically, you get a shipment of grain to the Guinness factory, and you want to be able to take a sample of that grain and then use that to guess about the overall quality of the full shipment, obviously without having to analyze every grain of whatever wheat is coming in that shipment. And so they developed a lot of techniques for analyzing small samples and then generalizing to a population. And the frequentist body of work pretty much always takes that as the philosophical basis and even the mathematical basis of what they're trying to do. They're saying, okay, there's this population that is very big that we can't measure all of. And so we're going to take samples from that, and then we're going to try to generalize from those samples to the population. Frequentist statistics work really, really well when you are in that scenario, when you are taking random samples from a large population. And again, I use frequentist statistics a lot. I've published papers using frequentist statistics. I am not a zealot for Bayes in any sense. These methods work especially well in this context.
With A/B testing, you're testing feature A versus feature B. You're going to randomly draw from your sample of visitors to your website. Some are going to get A, some are going to get B. That is perfectly aligned with the frequentist approach and the frequentist model, and so those statistics work really well. There's a lot of baked-in assumptions there, right? You're getting a random sample, there is this population. All of those questions tend to line up really nicely. But there's a lot of other questions that we might ask that don't really fit naturally into that framework, where we aren't necessarily drawing samples. Maybe we actually observe all of the data. There is no sample. It's just like we have all of the data. I work a lot in marketing analytics. We have all of the data on marketing that we have. There's no sample, there's no other universe. Frequentist statistics starts to get weird, and you often run into this. People are like, okay, help me interpret this P value. If you ran this analysis a thousand times, what would the distribution of results be? And you're like, wait, but I have the code for the analysis. The data doesn't change because I have the universe. What do you mean, run it a thousand times? It doesn't really make sense. And so there's lots of questions that we might ask as a business that don't really naturally fall into this idea of drawing a sample from a population. Maybe we want to predict how much revenue this business is going to do next quarter. What's the population and what's the sample there that we would be analyzing? All of a sudden it's like, okay, well, what are we actually trying to do here?
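For the A/B case mentioned in this turn, where visitors really are randomly assigned, the textbook frequentist calculation is a two-proportion z-test. The following is a minimal Python sketch rather than anything shown in the episode: the conversion counts are made up, and `two_proportion_z_test` is a hypothetical helper name.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates between A and B."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Under the null hypothesis both groups share one underlying rate.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of a standard normal beyond |z|.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up example: 100 of 1,000 visitors convert on A, 130 of 1,000 on B.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

The P value here answers the "if you reran this experiment many times" question, which is meaningful because a fresh random sample of visitors could in principle be drawn. That is the piece that stops making sense when the data set is the whole universe.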
D
Can I ask? I remember my brain melting a little bit when that whole sample versus population, specifically when it comes to time, when you can only sample the past to the present, if you're determining your population is all time or the next year, isn't that definitionally you don't get a representative sample or it's not a, you're getting a sample that's from one set of times. You can't go and pull some of your sample data from the future. Right.
C
So wait, sorry. The sample bit is definitely something that is turning around in my mind too because so have I got it correct that in a frequentist world you basically would need to be confident that you have a fully representative sample to then make a decision about the action you're going to change versus like, because there's a very high likelihood that you, like, you have a shitty sample. Right. And so is that part.
A
A lot of the frequentist math is about determining what sample size you need to believe that it is representative, right? That's what a lot of the machinery behind the statistics is actually doing for you. The P value has baked into it this idea of, well, how big is the sample? There's an n minus one divisor somewhere in the formula, and that's controlling for, okay, how big do we think the population is? How big is the sample size? And so it's trying to do some of that machinery for you. But you're right, that does matter. And often a core assumption of a lot of frequentist methods is that you have this random sample from the true population. Often it is the case that we don't have that. And then, again, people who are arch-frequentists or have PhDs in statistics are going to quibble with this, but often I feel like the frequentist paradigm starts to really break down or get very confusing once you get into this world where it's like, okay, we have a non-random sample, now what do we do with it? That's a place where, all of a sudden, all of these tools that we have no longer really apply, all of the frequentist rules we have no longer really apply. But it is a place where you can start doing Bayesian modeling very well. You need to account for that non-randomness. You need to have some story that you can program into this simulated world about how your sample came to be non-random, but you can still make forward progress there very easily. You just simulate what would happen: okay, here's what it would look like, here's the implications of that, and you can sort of go on your merry way. And that, I think, is one of the really powerful tools of the Bayesian approach: you can start just from what is my model of the world and then work from there. You don't have to be like, okay, do I have a random sample from a population?
And then what do I do with that? And so it opens up the aperture on the types of analyses you can do and the types of questions that you can answer.
D
So can you maybe come up with, I don't know, like a baseball analogy to explain? So this is a, it's like such a good example if you follow baseball and.
A
Sorry, yeah, so Tim is teeing this up because I sent over some notes, too many notes, via email, and this was one of the examples. This was one of the examples that resonated with me very early on in my journey, actually. I'll give you all some insight into my biography. When I first heard about Bayesian statistics, I was incredibly skeptical. I was like, what is this prior stuff? This all sounds like BS. It sounds like you're just putting your thumb on the scale. I trust my statistical significance. I don't want to hear about all of your priors. And so I went on a long journey to start to really understand and appreciate the Bayesian approach. And this example that someone shared with me, I don't remember the original source, really resonated with me. And so the story is: imagine that you're at a baseball game. It's opening day. So again, apologies to everyone who doesn't follow baseball. Opening day is the first game of the season, and you watch some batter come up to bat. Batters generally bat about four times in one game. And you see this batter get a hit four out of four times, right? So they are batting an average of 1.000 in this opening game. And then I say to you, okay, well, what do you think? Tim knows baseball. So, Tim, what do you think this batter's average is going to be at the end of the season, given that we just saw this batter go four for four, batting 1.000 on opening day?
B
So first, is this MLB or Savannah bananas?
A
Let's call it. Let's say it MLB for now.
B
Yeah. Okay. MLB. Okay.
D
Well, now. So here's. I don't even know whether I'm winding up giving a intuitive or Bayesian or a frequentist answer, but I would say. Well, that sample size is so small, I'm going to say that I think it's probably 300 or 320.
A
Okay, 300. How'd you get to 300?
D
Because 300 would be a good, solid, respectable from this early viewing, it was way better than that. But it's only four at bats, and it's a long season, and I. There will be some regression to.
A
Yeah, but why 300? Like, where did 300 come from? Why does that number jump to your mind?
D
That's like, to me, not being a super, super close follower of baseball, that's like a number that I know is, like, kind of normal. Maybe .250 or .300 is standard. It's the average.
B
Yeah.
A
Okay. So just to sort of talk through this logic, the idea here is we just saw this batter do really well. Four hits in a game is really, really good. So we think this batter is probably a good batter. But, you know, we've only seen one game, and based on our other knowledge of baseball, batting .300 for a season or for a career is really, really good. That's Hall of Fame. So .250 in general. Almost every batter is going to end the season between .200 and .350 as an average overall; this is 0.2 to 0.35. Again, apologies to those who don't follow baseball. So what Tim did here is Tim used his Bayesian reasoning to get this idea. We have a little bit of data, and we want to combine that with our prior knowledge of how baseball players normally perform. We know that pretty much everybody's going to land between .200 and .250 in the season. This player seems pretty good, so maybe at the top end of that range: .250, .300, maybe a little bit better, but that's probably where he's going to end up. That reasoning is a Bayesian approach. We have prior information about, in general, where people land. We observe a little bit of data. We combine that together in our mind, and we come up with some range that we think is reasonable based on that combination of evidence that we have. So the Bayesian approach to statistics is very intuitive for humans. Humans do Bayesian reasoning constantly. You have prior expectations about what's going to happen, and then you observe evidence, and then you update that in your mind. And your mind is constantly doing this all of the time. It's not necessarily applying Bayes' theorem exactly, but it's doing this process of combining evidence together to come to some conclusion. And then the apparatus of Bayesian statistics that we have gives us a very formal, very precise way of doing that reasonably efficiently. Sorry, go ahead.
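One formal version of the reasoning in this turn is the standard Beta-Binomial update, sketched below in Python. This is an illustration, not something walked through in the episode. The prior is an assumption chosen for the example: it is worth 300 imaginary at bats at a .270 average (81 hits, 219 outs), and the function names are hypothetical.

```python
def update_batting_belief(prior_hits, prior_outs, hits, at_bats):
    """Beta-Binomial update: add observed hits and outs to the prior counts."""
    post_hits = prior_hits + hits
    post_outs = prior_outs + (at_bats - hits)
    return post_hits, post_outs


def expected_average(hit_count, out_count):
    """Mean of a Beta(hit_count, out_count) belief about the true batting average."""
    return hit_count / (hit_count + out_count)


# Assumed prior: a typical batter, worth 300 at bats of evidence at .270.
prior_hits, prior_outs = 81, 219

# Opening day: four hits in four at bats.
post_hits, post_outs = update_batting_belief(prior_hits, prior_outs, hits=4, at_bats=4)

print(f"before the game: {expected_average(prior_hits, prior_outs):.3f}")
print(f"after going 4 for 4: {expected_average(post_hits, post_outs):.3f}")
```

With these assumed numbers the estimate moves from .270 to roughly .280: four at bats nudge the belief upward but cannot overwhelm 300 at bats' worth of prior knowledge. A weaker prior (fewer imaginary at bats) would move more.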
C
It is. So I'm always boiling things down to a TL;DR, aren't I? It's updating your beliefs, right? That is the most simple way I try and rationalize this.
A
That's exactly right.
C
Coming to the business. So. Which I feel like I'm always trying to do. So you're working with your stakeholders. You feel like you've done a really good job of being like, these are the, this is the way that I have come up. Like the assumptions that I have made. These are the beliefs that I have. And then over time they adjust. What does that process look like with your business stakeholders? Because I presume so we talked about this a little bit earlier about the sample, but like, the quality of the assumptions you make also are based on your knowledge of that particular area. I know shit about baseball. So listening to that, I was like, I'd probably make my assumptions would be pretty shit versus, like someone who I don't know works in data and baseball, like theirs would be much better. So the same would be true if you're working with a business stakeholder on a problem. Like they might have business expertise that could help make better estimates to start. So how involved or not involved do you think the business stakeholders should be if you're using a Bayesian approach to solving a problem?
A
I think ideally very involved. Again, I talk about how do we do science? And if you're a scientist, you should start from all of the theory, right? You're going to go read all of the papers. You're going to go learn deeply about how other people have studied this particular phenomenon. And that is going to give you the best starting point for learning something new about the world. And in business, we want to do the same thing as analysts, as scientists. We want to start by talking to the domain experts. We want to learn: how do you think about how this system works? What are your expectations? How do you know that? Then once they share that, you can use that information. Again, sometimes you can't improve on it. You're like, you clearly know what you're talking about. I can't help here. Good luck. But sometimes we can take that information and combine it with other data in smart ways to generate new insights. But I would say as a scientist, again, we want to be talking to the people who actually know how the system works. That's what's going to allow us to build better structural models, this idea of a simulation. And it's going to allow us to better incorporate their knowledge explicitly and directly into this model. If this person says, look, there's no way that this effect that you're talking about is larger than X, well, great, I can put a bound on my model. And that really helps us to get more precision on the other estimates. If we know that it can't be X, it might save us a bunch of time. Again, this Bayesian process, even for, quote, unquote, frequentists, often happens very naturally. Many of you all might have been in this situation where you've been tasked with doing some analysis. Maybe you're running a regression to try to get some answer. And you run your regression and you get a number and you look at the number and you're like, that can't possibly be true. It just doesn't make any sense. So you throw it out.
And then you keep tweaking the model, you tweak the model, you tweak the model, you throw a bunch out. Finally you get to a result like, okay, yeah, this sort of makes sense. That is a Bayesian process. You have priors about what you expect the results are going to be. You're throwing out a huge number of potential analyses, but you're just doing it based on your judgment. You're actually doing this in a bad way. A frequentist looks at that process and is going to say like, oh, no, that P value you have doesn't mean what you think it means because there's actually a ton of other comparisons that you just did. And so that process, which again, all analysts do, because we look at the results and we think about them, we say, does this match our expectations about what these results should look like? You throw out all the results that don't, and you're only left with the ones that are. The Bayesian system is just a more formalized, honestly more rigorous approach to that same process.
B
It seems like to be a good poker player, you need to be a Bayesian.
A
Poker players tend to be good Bayesians. They tend to think this way. Because, again, you think about, like, a poker hand, right? Or again, at least, what is the popular type of poker that people play right now?
B
Texas Hold'em.
A
Texas Hold'em, right. You have some information about what cards you have. You see, you know, cards come out one at a time, and as the cards come out, you have to sort of update your belief about, is my hand a winner or not? Or is my opponent's hand a winner or not? It's, again, this process of updating our beliefs about what is likely to happen in future states of the world as new information comes in. And that, again, is a very Bayesian approach to just thinking about what's going to happen in the world.
B
But there's a big difference in priors between a professional and a.
A
You know, that's exactly right. Professionals will have a lot. They'll have a much better model and a lot more confidence about. They'll just know the numbers a lot better. It's actually like, yeah, that's actually a less, maybe not as good of an example on priors because there's sort of like a mathematically correct answer. In the case of poker, there's a.
B
Bunch of probabilities they're doing as well.
A
Yeah, but like, the sports analogy is maybe a good one or like a really good scout or like a recruiter for some sport. And again, this applies to soccer or football or whatever. Aussie, Aussie, rugby, something. Yeah, that, like, those scouts might have much better expectations on how well this player is going to perform because they've seen so many others, they've studied it so much, they're going to have a much better prior on the future career trajectory of some player than I will, because I don't know as much about that sport and I'm not a trained scout. And so experts can have a lot better guesses about future outcomes and a lot stronger beliefs because of all of that expertise that they have.
D
So when you say so, I think there's the priors in taking and updating your beliefs. But you also keep saying simulations. And I keep thinking, okay, if I'm going to go back to the baseball example, if I have knowledge that, look, between .200 and .350, that's going to be where batters wind up being, I'm going to build a model where I have batters and take four at bats, and, as I range them with their overall batting average between .200 and .350, see how often I come up with four hits in a row. Is there literally simulations like you're picking?
A
That is literally how I would do it. What you just described is literally how I would solve this problem. I am not very good at traditional equations-based math. There probably is a shortcut to coming up with this answer that someone who is really good at math and probability could write down. I would just simulate it. I'd do exactly what you said. I'd have a population of hitters that would have a range of overall batting averages. And then I'd run the simulation and see how often we observe a person getting four hits in one game, and then see how that relates to the range of batting averages they end up with. It is an analytical way to generate an answer to this question that is in some ways very computationally intensive. Like, running that simulation, it's going to be a pain. It's going to take a few hours to code it up, it's going to take a few hours to run on the computer. But it will give you the right answer every single time. And you don't really need to know any statistics to get the right answer. You just simulate. And that is. Go ahead, Mo. No? Oh, I was just going to say, that is what I like about Bayesian statistics: you don't need to memorize all of the different rules, right? You don't need to memorize, oh, what are the assumptions of the linear regression, and oh, am I going to run this check and that check and this check to make sure. It's just like, we're just going to simulate. We're going to simulate from what we believe based on the structure of the problem that we're analyzing. We're going to compare it to the data that we observed, and we're going to count. And so it's a simulation and it's a counting problem. Again, this goes back to, why is this becoming more popular now? It's becoming more popular now because we have more powerful computers that can do more and bigger simulations, faster.
If you wanted to answer one of these fairly complex problems 20 years ago, you just couldn't do enough simulations. It would take way too long. The computers weren't powerful enough. The algorithms for doing these simulations efficiently weren't powerful enough. Now they're getting a lot more powerful in terms of computers and in terms of the algorithms that we can do to explore different parameter spaces. And that allows us to answer these problems again, where we don't have to learn any rules, we just simulate and then count. And we put way more effort on the computer and way less effort on the theory and the assumptions that go into these shortcuts that can give you really nice answers in the frequentist paradigm, but aren't often as flexible as just the straight simulation approach.
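The simulate-and-count approach described for the baseball question can be sketched in a few lines of Python. This is a toy version under stated assumptions, not the speaker's code: true batting averages are drawn uniformly between .200 and .350 (the range mentioned in the conversation), every at bat is treated as an independent coin flip, and `simulate_four_for_four` is a hypothetical name.

```python
import random


def simulate_four_for_four(n_batters=200_000, seed=0):
    """Simulate one four-at-bat game per batter; return the true averages
    of the batters who happened to go four for four."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_batters):
        # Assumed prior: true averages spread evenly from .200 to .350.
        true_avg = rng.uniform(0.200, 0.350)
        hits = sum(rng.random() < true_avg for _ in range(4))
        if hits == 4:
            kept.append(true_avg)
    return kept


perfect_openers = simulate_four_for_four()
share = len(perfect_openers) / 200_000
typical = sum(perfect_openers) / len(perfect_openers)
print(f"share of batters going 4 for 4: {share:.4f}")
print(f"their average true batting average: {typical:.3f}")
```

Under these assumptions the batters who go four for four have a true average of about .300, compared with .275 for the simulated population as a whole. The counting does the work that Bayes' theorem would otherwise do on paper. A toy like this runs in seconds; the heavy compute mentioned in the conversation shows up in models with many parameters.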
C
I want to talk about the Enigma machine because it's something that I'm mildly fascinated about. I personally am going to be very open that, like, this is. I'm making more of an effort to learn more about stats. This is something that I want to do over the next period of time so I can attempt to explain it and we'll probably fuck it up. Or if someone else feels more confident in trying to explain it, I am also happy with that. But this is me live learning. So you guys let me know whether we're going to go with option A.
B
Or option B. I say take it away, Mo.
A
Yeah, dude, go for it, Mo. I'm ready.
B
Because I'm pretty sure I couldn't.
C
So, the Enigma machine, used in World War II by the Germans. They thought it was unbreakable because there were basically so many different variations. Of course, very famously, at Bletchley Park, they broke the code. There were some great men involved, and also many, many, many amazing women, which is why I know lots about it. Like I said, I'm going to do my best to try and explain it. My understanding is that, in the machine, A could never equal A, right? So, like, A can't equal A, B can't equal B, et cetera. So you have a series of beliefs, rules, that is your starting point. And then they had cribs. I don't remember exactly the cribs, but basically they would go in and they would make a guess using what was like a common word from the message. So like "dear" or like "colonel", or, I'm trying to think, if it was a weather report, it might be like "rain". And so they know what the output is they're looking for. So they would try and make a best guess: I think that if I tried to put B here and we're trying to spell the word rain, then the next letter, which is A, should be this. And it was basically them testing out a series of these different things. And if they got one letter, then they would go down that. I think they called them cribs. And that was basically how they ended up breaking the code. But they would have to do that again every single day because they changed the codes every day. This is one of those things that they constantly come back to as an example of a Bayesian approach.
A
So it sounds Bayesian to me. Right. So let me again try to just redescribe in my words what you're describing. I am not an expert on this, but the intuition behind this makes sense. We're English, right? We have some model of how German works, of the frequency of different words, the frequency of different letter pairings, certain common words, certain common patterns that we're going to look for. And then what we're going to do is we're going to throw a ton of computation at the problem and we're just going to say, look, we're going to run all of the different combinations as fast as we can, and we're going to see, based on the expected distribution of certain, again, letter combinations under a bunch of different assumptions. These are the parameters in the model. How often do these expected combinations come up under these different parameters? And that lets us back into the idea of, okay, well, the patterns work out such that this is probably the cipher. And so, again, I don't even actually really know exactly how you would say, oh, this is Bayes' theorem applied here. But this idea of we're just going to look for patterns and we're going to run all of the compute and count the things up, and that's going to give us some indicator of what's most likely. That, again, has a very Bayesian feel to the type of analysis that we're talking about.
C
Okay, so everyone has homework. But my understanding is they did start with people doing this and then the Enigma machine ended up taking over.
D
Yeah, I mean, that's in the title of the book. I don't remember the details of the one that I was reading, that somehow Bayes' rule helped crack the Enigma code. I don't remember how they said that that worked.
A
There's probably something to it where it's like, because the total number of combinations is too big, you have to smartly guess what is the next combination that you're going to check, what is most likely based on the information on the things that we have done so far. And so my guess is that it's thinking about what is the next combination of cipher wheels that we're going to try that is most likely, given the information that we have. And then based on the results of that, we'll have some new best guess about what the next best guess to try is. And then we're going to do that. And that allows you to not have to run all, whatever, 7 trillion versions there are. Like, you can't run all 7 trillion. And so you say, okay, we have to do the ones where we have the best guess first. And Bayes' theorem would give you a principled way to think about what's the next best guess based on everything that we've observed so far.
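The "test the most likely guess, then update" loop described here can be sketched in a few lines of Python. This is only a toy illustration, not how Bletchley Park actually worked: the four settings and their prior probabilities are made up, and the only evidence used is whether a tested setting turned out to be the key.

```python
def update_after_failure(beliefs, tested):
    """Apply Bayes' rule after learning that `tested` is not the key."""
    # Likelihood of a failed test: 0 if `tested` were the key, 1 otherwise.
    unnormalized = {s: (0.0 if s == tested else p) for s, p in beliefs.items()}
    total = sum(unnormalized.values())
    return {s: p / total for s, p in unnormalized.items()}


def search(beliefs, true_setting):
    """Always test the currently most probable setting; count the tries."""
    tries = 0
    while True:
        guess = max(beliefs, key=beliefs.get)
        tries += 1
        if guess == true_setting:
            return tries, guess
        beliefs = update_after_failure(beliefs, guess)


# Made-up prior beliefs over four hypothetical settings.
prior = {"setting_A": 0.05, "setting_B": 0.60, "setting_C": 0.25, "setting_D": 0.10}
tries, found = search(prior, true_setting="setting_C")
print(tries, found)
```

After the first guess (setting_B) fails, the remaining probabilities are rescaled so they sum to one again, and the search moves to what is now the most likely setting.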
C
Okay, so now to go down another complete rabbit hole, because that's what I like to do, is like, what are all the questions that I have on this topic? Let's talk a little bit more about bias, because I feel like, is that where everyone gets into, like, tiffs about this? Where, like, the Bayesians think frequentists have, like, more bias issues when it comes to selection, and then the frequentists think the bias is to do with the beliefs that the Bayesians are making? Like, is that the crux of the bias debate? I feel like you're going to explain it in a much smarter way than I probably have.
A
I'll try to jump in with at least what I have seen. And this is also part of my personal journey. I talked about sort of, I had this similar belief about Bayesians when I was first learning about it, where I was like, oh, you can just use priors to make the model say whatever you want. And I was like, that feels wrong. Right? That's not science. You're just using your priors to make the model say whatever you want. And what I learned is that Bayesians, the response to that is they say, yes, that is true. I can use my priors to make the model say anything, just the same way that you can run the regression a thousand times and just select the version that you want to use. The problem is that there is no way that we can design an analysis procedure that will prevent an analyst from shaping the results. The analyst has the power to shape the results always. There is no procedure that eliminates the possibility of analyst bias. If we think that we are working with an analyst who is going to try to trick us, there's basically nothing we can do, Right? The P value is not going to prevent us from the analyst tricking us. Bayesian approach not going to prevent us from the analyst tricking us. There is no way to get around that. The analyst can always bias the results. And again, those of you who are data scientists, who have worked in academia or in industry, you probably feel this in your bones, right? You have run analyses different ways, and you just know that I can always run an analysis and make the data say anything I want. It's always possible. I'm very good at it. I can do it so that again, there is no procedure that will prevent that. And so if you're in a world where you're like, we have to figure out the procedure that's going to prevent that, Throw it out, Never going to happen. Once we're in a world where we're like, okay, we're interested in finding the truth together, then we want to start thinking about, well, what's the best way to do that? 
And it's totally correct that yes, an analyst could bias the results with the priors, but a part of analyzing a Bayesian analysis is also looking at the priors. The priors shouldn't ever be hidden. When I am reading the results of a scientific paper, whether it's using Bayesian analysis or a frequentist approach, what I'm going to be looking for is, well, what are the assumptions that are being made, exactly? What is the structure of the model being used here? How is that combining with the data that they have and generating these results? And the priors should always be a part of that. If you read an academic paper where people are taking a Bayesian approach, they're always going to share their priors. That is just part of understanding a Bayesian analysis. And what I tell people is that an analysis is an argument. It's not truth. And an argument is up to the reader or the person sort of hearing it to make judgments. Is it a good argument or is it a bad argument? And if I see someone doing an analysis and they're taking a Bayesian approach and I don't agree with their priors, I'm like, those priors don't make any sense. It's going to be a not very compelling argument. I'm going to say, then I'm not going to buy your results. The results aren't going to be convincing to me because I don't think your priors make any sense. But good analysts or good Bayesians, they're going to try to justify their priors. They're going to say, look, these are the assumptions that I'm making. Here's what it's based on. It's based on these other experiments and these other papers and this theory. Combine that with the data that we have, which is going to be limited in some way, and this is the implication on the other side once you go through that process. And maybe that's very compelling, maybe it's not, depending on exactly what assumptions are made. But it's the totality taken together that makes it a compelling argument or not, not whether it has two or three asterisks in table number two.
C
What I don't get is, can't frequentists just solve this by also being more open about, like, potential misinterpretations or reasons that, I don't know, they might have run the same test 20 times but only taken one result? The fact that they took the results early? Any of the ways that, like, you're typically P hacking. Like, couldn't they just have the same thing where they're like, these are the explicit ways we could have made an error in this analysis? And then wouldn't that just be, like, great? Now both methods are great and we use one depending on which one's better for the problem we're trying to solve.
A
I think the really good scientists do that, and that's what we see. The good scientists do exactly that, and it's great. I love that. Again, I am not a zealot on one side or the other. I think the good scientists do that. They admit where there is risk, where certain assumptions are potentially biasing the results. Really good scientists, again, in both frequentist and Bayesian analyses, they'll run sensitivities and they'll say, if this assumption that I made was different, if the priors were different, how much does that affect the results? If this assumption is different, how much does that affect the results? That's the sort of stuff that you really want to see, to understand, and that's what makes for a very compelling argument. The problem comes when people think that a P value of less than 0.05 means that something is truth. And again, there's growing awareness that that's not true. People listening to this podcast probably know that that's not true. So there's growing awareness of that. But if we're taking sort of like an honest approach to science, we need to recognize that all of these different assumptions that we make are impacting the final results. We need to be transparent about what those are. We need to, in the interest of, again, getting towards truth, the scientific process, explore the implications of those different assumptions. And if we do that, we're going to end up closer to the truth. And again, now all of a sudden there's not such a difference between this Bayesian approach and the frequentist approach. Right. If we're doing this all very transparently, we're going to end up probably in a very similar world.
D
As you were saying that, I remember Matt Gershoff saying that at some point. I want to say Chelsea Parlett-Pelleriti said it as well. The people who are genuinely knowledgeable, if you have enough data and you put a similar amount of rigor on it, you're going to wind up coming up with the same answer. You don't hear people saying, well, the Bayesian says that the truth is X and the rigorous frequentist says that the truth is Y. One, neither one of them is declaring a truth, and the other is they should probably be getting at results that, even if what they're saying is a little bit different, neither one would contradict the other if they're both done rigorously and well. And maybe I'll add on to that. Frequentists have P values and confidence intervals, but Bayesians have credible intervals.
A
And.
D
Does that sort of tie in that if you've got a credible interval, you've done this Bayesian approach, you've got a confidence interval, you've done this other approach. They're not the same thing, but they're both kind of pointing to the level of uncertainty. And if you're doing a rigorous job, you're going to be indicating an appropriate breadth of uncertainty in whatever conclusion you've arrived at.
A
Yeah, so I think that that's fair.
D
That just.
A
It's mostly fair. The things that I would add onto it is that almost always you can get to the same frequentist results with a Bayesian approach. There's a few exceptions, but almost always you can back in. And if you give me some frequentist analysis that you've done, I can write those same assumptions down using priors and model structures and get effectively the same results. The wrinkle is that you can't always go the other way. There are questions that you can answer with a Bayesian approach that are just really not possible with a frequentist approach because of the limitations of the specific assumptions that are being made and the specific mechanics there. Again, because the simulation based approach is effectively infinitely flexible. If you can simulate it, you can generate a statistical analysis with it. And that means that there are more questions that we can answer with a Bayesian approach that can't be answered with the Frequentist approach.
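One simple case where the two approaches line up can be sketched as follows. This is a minimal illustration with made-up data (30 successes in 200 trials), assuming a flat Beta(1, 1) prior: the mode of the posterior then equals the frequentist maximum-likelihood estimate, and the simulated 95% credible interval lands close to the usual 95% Wald confidence interval, even though the two intervals mean different things.

```python
import math
import random

random.seed(1)

successes, trials = 30, 200  # made-up data for illustration

# Frequentist: maximum-likelihood estimate and a 95% Wald confidence interval.
mle = successes / trials
se = math.sqrt(mle * (1 - mle) / trials)
wald = (mle - 1.96 * se, mle + 1.96 * se)

# Bayesian: flat Beta(1, 1) prior gives a Beta(1 + successes, 1 + failures) posterior.
a, b = 1 + successes, 1 + trials - successes
posterior_mode = (a - 1) / (a + b - 2)

# Simulate from the posterior and read off the central 95% credible interval.
draws = sorted(random.betavariate(a, b) for _ in range(20000))
credible = (draws[499], draws[19499])

print("MLE:", mle, "posterior mode:", posterior_mode)
print("Wald interval:", wald)
print("Credible interval:", credible)
```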
C
I really want to be a jerk and ask for an example.
A
Great question. So, as an example of a type of analysis that I think is very difficult to do in a frequentist analysis, this comes from Richard McElreath. He is an amazing statistics educator. A lot of my knowledge, even some of the explanations that I'm sharing on this podcast, come from him. Everybody should check out his book, Statistical Rethinking, and then his online YouTube course. So he gave this example, which I think is really interesting. It's technical and sort of in the weeds. So apologies for that. But it's sort of the best example that I have coming to my mind. And this is an example of trying to analyze multilevel models. And so we could imagine, you see this a lot in education econometrics, where you have a model where you have the school level effects and then you have classroom level effects, and then you have individual student level effects and you're trying to estimate, okay, if we teach phonics versus if we don't teach phonics, what happens? Often we want to estimate all of those different effects, student level, classroom level, school level. And in frequentist approaches, it can often be very difficult to estimate all of those simultaneously. And so frequentists invented this sort of machinery to estimate these multilevel models called Mundlak machines. And they're sort of weird. And again, it requires a bunch of assumptions under the hood and it gives okay results. It sort of works. But with the Bayesian approach, we just simulate the whole thing, right? We simulate the students, we simulate the classrooms, we simulate the schools, and we can estimate all of those parameters jointly, no problem, zero issue at all. This is in chapter 12 of Statistical Rethinking on multilevel models. Richard McElreath goes into the details of exactly how this works. And we can just show there's no real straightforward way to estimate all of these different parameters simultaneously with a frequentist approach.
We simulate the whole thing, use a lot more computing power to get there. It takes 30 minutes to run, as opposed to 30 seconds, but we get the answer that we want on the other side. That's one example. And there are a bunch of others of just questions that aren't even really well formulated in a frequentist analysis, where a frequentist would be like, that question doesn't make sense to me. But in a Bayesian approach, we can just write it down. We can say, okay, look, we're just going to have this model and we're going to simulate and we're going to get a result on the other side. Another good example that Mo flagged while we were on break is very small data. Another thing that Professor Richard McElreath likes to say is that the minimum sample size of a Bayesian analysis is zero samples. You can do a totally reasonable Bayesian analysis with zero samples, with no data at all. We're just going to simulate and we're going to see what happens. If I have these beliefs, this is what the implications of those beliefs are. A totally reasonable Bayesian analysis is no data at all. We're just going to simulate from our beliefs, from the papers that we've read. If this is true, if this is true, if this is true, if this is the structural model of how the world works, this is what the implications are. That's a very reasonable thing to do as a Bayesian. If you have one data point, you can see how well that aligns with that. If you have four data points, you can see how well those four data points align with that analysis and you can move from there. But the Bayesian approach allows you to, or actually encourages you to, which is the thing that I really like, think from the physical system or the biological system, or what a Bayesian would call the data generating process. How does the world work such that it would generate these data, and then work from there?
As opposed to starting from a data set and being like, how can I run a regression that will get a statistically significant P value? Which I think is the wrong way to do science. You want to start from the science. What do we know about the world? Compare the theory to the data. As opposed to being like, oh, I found this data set. How can I get a statistically significant P value out of it?
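The "zero samples" idea, simulating from beliefs alone to see what they imply, can be sketched in Python like this. Everything here is an assumption for illustration: the belief that a conversion rate sits somewhere around 5% is encoded as a Beta(2, 38) prior, and the data generating process is simply 200 hypothetical visitors who each convert or not.

```python
import random

random.seed(0)

N_VISITORS = 200       # hypothetical traffic; no real data is used anywhere
N_SIMULATIONS = 2000


def simulate_conversions():
    """One draw from the data generating process implied by our beliefs."""
    rate = random.betavariate(2, 38)  # assumed prior belief: rate around 5%
    return sum(random.random() < rate for _ in range(N_VISITORS))


simulated = sorted(simulate_conversions() for _ in range(N_SIMULATIONS))
mean_conversions = sum(simulated) / N_SIMULATIONS
low = simulated[N_SIMULATIONS * 5 // 100]        # roughly the 5th percentile
high = simulated[N_SIMULATIONS * 95 // 100 - 1]  # roughly the 95th percentile

print("Mean simulated conversions:", mean_conversions)
print("Middle 90% of simulated outcomes:", low, "to", high)
```

If the simulated range looks implausible to someone who knows the business, that is a sign the beliefs need revisiting before any data arrives. Once real observations exist, they can be compared against this simulated distribution.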
D
But that does seem like a heavy lift when working with the business counterparts. I mean, it's the battle I feel like we've been fighting for 20 years, in that we have taught the world of business that the data is this objective, quantitative thing and you just have to pick the right model, as opposed to saying, no, no, no. You should be thinking about your beliefs, your expectations, your knowledge, what you've seen, what you've theorized. You should do all of that first and then see how well the data matches up with that and then adjust it. Right. I mean, Mo, it kind of gets back, I think, to your question earlier when you were bringing up working with a business user. All of a sudden you're going to them saying, I need to tap into your knowledge, and they are saying, why? Like, my knowledge, I'm just a marketer. Why can't you just have the data give me the answer? Which I think, Michael, is where you were just landing. Because that's absolute lunacy if you actually step back and think about it, because you're leaving information on the cutting room floor.
A
I mean, I hear both sides, right? Like, I talk to a lot of business people who complain about their data science team and they're like, why does the data science team never ask me about this section of the business that I know everything about? They come in with some presentation that has a bunch of results that clearly don't make any sense. They should have come and talked to me. So I think you get both sides. I do think that there have been some executives that have been trained into the cult of statistical significance, unfortunately, and they sort of want to turn off their brain and just allow the data to speak to them. But I actually think that if you look at especially really successful executives in practice, they don't do that. They actually are really carefully weighing the different pieces of evidence that they have. And if they see a fourth year analyst come in with some presentation and some analysis and they're like, that doesn't match my expectations at all, they're going to poke a bunch of holes in it very quickly. And that again is the sort of Bayesian process at work here. Being like, this analysis doesn't match all of my beliefs, like what could be going wrong with it? And sometimes analysts get frustrated by that and they're like, oh, this executive doesn't listen to the data. But actually, my experience is that nine times out of 10, the executives are right and it's the analysts that didn't understand something or did the analysis wrong or missed some crucial assumptions. And I think that that's actually a reasonable process.
C
Can I walk through a real situation that is bubbling in my mind? Okay, so you have a product team, they want to build a new feature. We have all been, like, auto trained.
D
Feel free to use specific names if you want. We're not recording this, Mo. Go ahead.
C
The best way to do that is always through an A/B test because, you know, that's the gold standard and we want to have as much confidence as possible in the new feature, yada, yada, yada. What I'm picking up from this conversation is that you could potentially, even if you haven't built or yet rolled out the feature, use a Bayesian approach to have a better understanding of how your users might respond to it. That might help you then decide whether you should do feature A or feature B or feature C. But I feel like, working with product folk, they still come back to the, like, we need an experiment. It has to be done through an experiment. And the thing that I'm kind of reconciling or rolling around in my head right now is you could get to a decision a lot faster with a different approach. That might be good enough. But then let's say we don't have any data because the feature hasn't been built. Your expectations about that feature are going to be based on previous features and how they performed. Is it perfect? Are you not just going to potentially then end up with a very average assumption of how it might perform, which is potentially not reflective of a new feature?
A
Okay, so I think this is a really interesting topic and there's a bunch of different directions that we can go. And I would say that everything you're describing sounds very reasonable. And what I would say is that you should design your decision making approach based on what you're trying to achieve as a business. And this again is another place where I think Bayesians do really well. Bayesians care a lot about decision making. How do we make good decisions and how do we weigh the trade offs of different decisions? And this again is one, they're obsessed with uncertainty. And this is one of the reasons is because you can take the full distribution of your estimated parameters and then plug that through some decision making process and then evaluate on the other side, like how much would we regret making a wrong decision? How much profit is this going to generate on average? And the framework allows you to do that in a really nice and convenient way. And so that's one nice thing, if I was giving advice to you or to this product team, is I would say set up your decision making process to optimize for whatever it is, the thing that you care about. You could imagine a product team saying, look, all that we care about is not making a huge mistake. And so we're going to design tests that are only going to rule out huge mistakes. We're probably going to release a bunch of stuff that isn't actually better than the old stuff. That's fine, we're fine with that. We're going to design our decision making procedure and again, we can set up our Bayesian framework to do this, such that we run a lot of very small tests and we're going to power these tests just to be able to rule out the very bad errors and everything else is just going to go through. You could have a different decision making procedure that goes the opposite direction. It's like we only care about winners. Everything that isn't a huge winner we're going to throw out. 
And you can flexibly design your statistical analysis of these different experiments that you're going to run to optimize for whatever it is the thing that you care about. And that's all again, I think, totally reasonable. We don't have to have the perfect A/B test, 50/50 split, statistically significant. We can do a bunch of other stuff that isn't that and still have a very reasonable business outcome. But we need to use our business judgment to decide what that is and then we can make the statistics do whatever we want. Another good example of where Bayesian statistics can be a nice add-on is we know in A/B tests there's this problem of the winner's curse, which is to say that businesses that run lots of A/B tests will get a bunch of winners that actually aren't very good. And there's a bunch of reasons why this happens. But it has to do with, eventually, if you run enough tests and you only select the statistically significant ones of those, a bunch of those aren't going to be actual true winners, again because of false positive rates. This should be fairly intuitive. You can apply a Bayesian prior to your test results to account for that and say, look, we're going to actually make sure that tests go even further, that there's even more signal than what would be implied by a normal traditional one-off A/B test, to try to account for the winner's curse. That's again a very reasonable thing to do. You can make that as strong or as weak as you want, but it allows you to take your information from previous tests, from your own business judgment, from the cost of releasing something bad, and then pull that together into one single decision. The Bayesian approach gives you a nice framework for being able to do that.
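Applying a prior to a test result to guard against the winner's curse can be sketched with the standard normal-normal update. This is a minimal illustration under stated assumptions, not a recommended production setup: the prior says true lifts are centered on zero with a standard deviation of 1 percentage point (a made-up number standing in for experience from previous tests), and the test reports an observed lift of 4 points with a standard error of 2.

```python
import math


def shrink(observed_lift, standard_error, prior_sd):
    """Posterior for the true lift under a zero-centered normal prior."""
    # Weight on the data: close to 1 for precise tests, close to 0 for noisy ones.
    weight = prior_sd**2 / (prior_sd**2 + standard_error**2)
    posterior_mean = weight * observed_lift
    posterior_sd = math.sqrt(weight) * standard_error
    return posterior_mean, posterior_sd


def prob_positive(mean, sd):
    """Probability that a normal(mean, sd) quantity is above zero."""
    return 0.5 * (1 + math.erf(mean / (sd * math.sqrt(2))))


# Made-up test result, in percentage points of lift.
post_mean, post_sd = shrink(observed_lift=4.0, standard_error=2.0, prior_sd=1.0)
p_win = prob_positive(post_mean, post_sd)

print("Shrunken lift estimate:", post_mean)
print("Posterior sd:", post_sd)
print("Probability the true lift is positive:", p_win)
```

Making `prior_sd` smaller makes the prior stronger, so a test needs more signal before it is believed. The resulting posterior can then be fed into whatever decision rule the business cares about, such as the cost of shipping something bad.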
B
All right, we do have to start to wrap up. Who could have predicted that that would happen? This is good, really good. My mind is going a lot faster than it was when we started and so I'm thinking about a lot of different applications. So this is awesome. Thank you, Michael. And now it's time for a quick break with our friend Michael Kaminsky. Think of that. From Recast, they are the media mix modeling and geolift platform, helping teams forecast accurately and make better decisions. I can't wait to see what you have to share with us this time, which is a marketing science bite-sized lesson to help us all measure smarter. Take it away, Michael. And this is pre-recorded.
A
Michael.
A
Causality is a complex topic with deep philosophical roots. Determining or proving causality is notoriously difficult. And in the context of analytics, there's lots of confusion about what we mean by causality at all and how to determine if we found it. Because of the correlation versus causation problem, simply being able to predict what will happen in some system is not enough to demonstrate causality. That's because if the system is stable, you can use correlations to make very good predictions. So predictions alone are not good indications of a good causal model. To demonstrate that our model is causal, we have to be able to forecast accurately when there's an intervention. This is the core of the philosophy of causality in the sciences. In the chemistry lab, if we can consistently predict what will happen when mixing different compounds together, we likely have a good understanding of the underlying causal mechanics. And in business, we should think the same way. We can demonstrate that we understand causality when we can intervene on some system and still predict what will happen. By combining research and analysis with taking action, we can validate our understanding of causality and actually make our business better at the same time. Remember, you can't demonstrate causality from the armchair. You have to make a falsifiable prediction and then actually go out in the world and take action.
B
All right, thanks Michael. And for those who haven't heard, our friends at Recast just launched their new incrementality testing platform, Geolift by Recast. It's a simple, powerful way for marketing and data teams to measure the true impact of their advertising spend. And even better, you can use it completely free for six months. Just visit getrecast.com/geolift, that's getrecast.com/geolift, to start your trial today. All right, well, one of the things we love to do on the show is do a last call, something that we share that might be of interest to our audience. Michael, you're our guest. Do you have a last call you'd like to share?
A
Sure. I am reading this book currently called The MANIAC by Benjamín Labatut. It's very bizarre but maybe of interest to people who listen to this podcast. It's sort of a biography, a fictionalized biography, of Johnny von Neumann, who was one of the original inventors of the computer. He spent a lot of time not necessarily working on cracking the Enigma code, but working with the scientists that were doing that, and developed a lot of amazing science and mathematics. And then it's also sort of a meditation on AI and sort of the trajectory of the world as computing power explodes. So a very sort of bizarre literary fiction biography thing. But if that sounds good to you, I highly recommend checking it out.
B
Very cool. All right, Mo, what about you? What's your last call?
C
I have two, but they are both very show related so that feels appropriate. The first one is Cassie Kozyrkov has an article, Statistics for People in a Hurry. It leans very heavily on the frequentist side, but the favorite thing in my life at the moment is that she also does an audio version where she reads out the article she's written, which I adore because she is a very witty human and so getting to hear her read it, you can definitely appreciate the wit a bit more. But also just on the Enigma code, if you want to read something for fun, it is not a science book. It is, I think they call it, historical fiction. So it's based largely on a lot of historical events, but it's totally fiction with maybe some characters inspired by people in history. One that I really enjoyed was called The Rose Code, which was about the Enigma and the women who worked at Bletchley Park. The author is Kate Quinn and she does these amazing historical fiction books that are just actually enjoyable to read. But you get to know, I don't know, a little bit about a topic or a time in history that you wouldn't have known about otherwise. So those are my two for today. Over to you, Tim.
D
Well, I'm going to go with two as well. These are both posts from people who have been on this show in the past. Now I realize that both are a little bit cranky, both kind of spitting into the wind or screaming into the void. Yeah, which resonated with me. So one, Juliana Jackson, on her Substack, she had a piece called How Silicon Valley's Hype Machine Continuously Creates Solutions Nobody Is Asking For, and it has a lot of kind of good history of past cases and I think just makes some really spot on points. And Juliana, I would say, kind of raging against the machine is kind of on brand. The other is a Medium post by Eric Sandosham, who I think of as being a little bit calmer and less likely to scream into the void. But he wrote a piece called Another Bullshit Job, the AI Transformation Officer, exclamation point, which is also a pretty nice little takedown on that front. So a couple of good reads.
B
I would go as far as to say, if you have transformation in your title, I'm nervous. I'm nervous for you.
D
Well, Michael is the chief podcast Transformation officer. What's your last call?
B
Thanks so much for asking, Tim. Wait, should I be nervous? Actually, it's a good question. Should I be nervous? A lot of people are nervous about AI and whether it's going to take our jobs or whatever. To that end, Cedric Chin over at Commoncog wrote something which I really like, and I'm passing it out to everybody I run into who expresses those concerns. It's called A Letter to a Young Person Worrying about AI, and it's an excellent read as well as being Bayesian in its approach, I realize now after the fact, which is, you know, a cool tie-in to the show. Anyways, I highly recommend it because I think we're all having that conversation about where's AI going? What's it going to be like in five years? Are we all going to, you know, work on robot repair and stuff like that in three, four months from now? You know, whatever the case may be. So that's a really good one. All right. As you've been listening, you've probably been thinking, oh, wow, this is a topic that I had never thought about before or I want to learn more about. We'd love to hear from you. And the best way to do that is to reach out to us. And you can do that on the Measure Slack chat group or on LinkedIn or by email at contact@analyticshour.io. And obviously we'd love it if you left us a review or rating on your favorite platform where you listen to podcasts. That's something we always appreciate hearing from you as well. And if you're a huge fan of the show and you want to show it to the rest of the world, we've got stickers, and so you can request that on analyticshour.io. Just go to our website and we will send you some in the mail. And we ship internationally for no apparent reason. But we do. Tim will get on a plane and he'll bring them to you through rain, snow, sleet and hail. No, but anyways. And yeah, so feel free to reach out and ask for stickers. Okay. Michael Kaminsky, thank you so much. This has been outstanding. Appreciate it.
A
Thanks for having me. This is a blast.
B
Oh, it's been such a pleasure. So thanks again for coming back on the show. We really appreciate it. Keep up the great work. We read all the stuff you put out and then we steal it and we bring it into the show as last calls and things like that. So big help to us as well out there. And I think I speak for both of my co-hosts, because I have a prior about that, Moe and Tim, when I say, no matter what your process, frequentist or Bayesian, keep analyzing.
A
Thanks for listening. Let's keep the conversation going with your comments, suggestions and questions on Twitter @analyticshour, on the web at analyticshour.io, our LinkedIn group and the Measure chat Slack group. Music for the podcast by Josh Crowhurst. Those smart guys wanted to fit in, so they made up a term called analytics.
D
Analytics don't work.
A
Do the analytics say go for it, no matter who's going for it? So if you and I were on the field, the analytics say go for it. It's the stupidest, laziest, lamest thing I've ever heard for reasoning in competition.
C
I walk to work. I walk the kids. I mean, I can't listen when I have kids, but. Or I'm like, prepping dinner.
A
Interesting.
C
Like, you have a.
D
You have a Bluetooth speaker in the shower here or you're not.
C
I have one in the bathroom.
D
Okay.
C
I was listening. I was listening to like, ABC News radio in the morning, and I'm like, I should use this time more productively. Like, I get that much value.
B
Yeah.
A
That is incredibly optimized.
B
Yeah. That's why she's a co-host on the Analytics Power Hour.
A
That's right. Yeah, I see how you.
B
That's right.
A
Why you recruited her.
C
I mean, there are many ways I.
A
Let everybody down constantly.
B
So, you know, meanwhile I'm in the kitchen being. Do you believe in life after love?
C
Well, last night I did make dinner.
D
To the dulcet tones of a song.
C
Song called Chicken Banana. I highly encourage you to go listen because it is possibly.
D
I don't think I'm gonna have time for that. Could you maybe just do it quick? I mean, go ahead. Knock it out. Go ahead.
B
You can make it your last call if you want.
C
I mean, I should get the video of Harry, but it basically is like dance music. I don't know why the kids like it, but it goes chicken banana.
A
Chicken banana.
C
To, like, high tech music.
B
That's why.
A
Right there.
B
Yes.
C
It's a thing. Yeah.
B
Something about the word chicken delights children.
A
Really? Wait, so actually, are we doing. Is this video. Do we do video or is this audio?
B
Yeah, so we do.
D
You're getting ahead of yourself.
B
Yeah. Oh, he. You know, we might have changed the checklist.
C
Witty.
A
But we won't.
D
Just.
B
What I confirmed. We won't release the entire episode.
C
Massive lag.
B
But we typically will do shorts, so there'll be snippets that we'll take out and those will have video. But the whole show, we do not do video.
A
All right, I'm going to. I'm going to keep the shirt on for now, but I might. I might beg off.
B
Yeah, that's fine. It's fine.
A
Whatever you have to do. And we won't.
B
We won't.
A
I won't be fully sure. I haven't.
B
Undershirt.
A
Try to use the collared shirt for public.
C
That is not where I thought this was going.
D
Rock flag and Bayesian enigmas.
Title: Our Prior Is That Many Analysts Are Confounded by Bayesian Statistics
Date: November 25, 2025
Hosts: Michael Helbling, Moe Kiss, Tim Wilson
Guest: Michael Kaminsky (Co-CEO of Recast)
In this lively episode, the team dives deep into Bayesian statistics, exploring its growing influence in analytics and confronting the challenge many analysts face when approaching Bayesian methods versus classical (frequentist) frameworks. Special guest Michael Kaminsky returns to demystify Bayesian thinking, debunk common misconceptions, and illuminate how Bayesian approaches can lead to more nuanced, practical business decisions. Expect baseball metaphors, Enigma machine stories, and honest exploration of real-business analytics.
| Time | Segment/Insight |
|-------|----------------------------------------------------------|
| 02:19 | Intro to Bayesian statistics and its history |
| 05:42 | “Best guess” vs. rigorous evidence combination |
| 10:01 | Where frequentist methods are strong/weak |
| 13:54 | Issues with sampling in business contexts |
| 16:19 | Baseball Bayesian reasoning example |
| 18:01 | Applying prior knowledge to observed data |
| 20:35 | "Updating your beliefs" as Bayesian grounds |
| 21:59 | Involving stakeholders/domain experts in analysis |
| 24:38 | Poker and domain expertise as priors |
| 26:26 | Simulation as modern Bayesian power |
| 31:44 | Enigma machine/WWII decoding as Bayesian process |
| 34:15 | Where bias enters analytics and how transparency helps |
| 40:36 | Convergence between Bayesian and frequentist methods |
| 43:00 | Multilevel models uniquely suited for Bayesian analysis |
| 46:50 | Business practice: bringing domain knowledge into play |
| 51:12 | Redefining A/B testing and business decision making |
This episode demystifies Bayesian statistics—cutting through its intimidating reputation and showing with clear examples, humor, and honest discussion, how Bayesian thinking is both natural and increasingly essential in business analytics. The key takeaway: The real power of the Bayesian approach lies in its flexibility, transparency, and ability to incorporate both data and expert knowledge into coherent, actionable insights, especially in complex, real-life contexts where classical approaches often stumble.