
A
Today we're diving into a really fascinating intersection of two worlds that don't often get talked about together: fine wine and data science. Welcome to the Harvard Data Science Review. I'm Liberty Vittert, the feature editor of the Harvard Data Science Review, and I'm joined by my co-host and editor-in-chief, Xiao-Li Meng. If you've ever searched for a wine online, checked ratings, logged a bottle into your cellar, or looked for tasting notes from real people instead of critics, chances are that you've brushed up against CellarTracker. It's a platform with over a million users and billions of data points, not just about wines, but about human taste, behavior and culture, really. And at the center of it all is one person: Eric LeVine, CellarTracker's founder. What really started as a personal tool for managing his own wine collection has grown into one of the most influential wine communities in the world. In this episode we'll talk about how data can tell us stories about how we drink and why, how machine learning might help us discover our next favorite bottle, and how a passion project truly became a global phenomenon. So pour yourself a glass of wine and let's get into it.
B
Well, Eric, so good to see you again, and thank you again for joining Vine to Mind back in, was it June now, right? And thanks for that fascinating closing keynote, where you talked quite a bit about the history of how you started CellarTracker. So my first question, because I want our listeners to benefit from your rich experience: can you share with us a little bit about how you started this thing, particularly at a time when words like big data and community-driven platforms were not buzzwords? So you were really a pioneer. So tell our audience, how did you start it? How did you even have the idea?
C
So CellarTracker was an accident that started in 1999, and two things happened that year. One, I was working at Microsoft from '92 until 2005, and there was a point in time when we started doing error reporting. We basically created a way for people, if the software crashed, to basically vote on their crash. And so if you ever saw when it would say "send error report, don't send," that was me and three guys in the Office team who decided crashes suck, and we're going to let the world tell us about them, and we're going to use that to systematically change how we build and service our software at Microsoft. So that was thing one. Thing two was my wife and I went on a bicycling trip in Tuscany. And you know, at that point I enjoyed wine. I didn't know much about wine. And it was sort of one of those things on my bucket list, like, someday I want to learn about wine, but it's a very intimidating topic. And so on the second night of that trip, after a long, tough bike ride, we were up in Castellina in Chianti. They had a merchant come, and he poured us four wines: a Chianti Classico, a Chianti Classico Riserva, a Vino Nobile di Montepulciano, and a Brunello di Montalcino, which to me sounded like complete gobbledygook at the time. But he explained it's four wines all made from basically the same grape or clones of it, something called Sangiovese. And where we were in Castellina in Chianti, you could actually see parts of Montepulciano and Montalcino in the background. And we spent the next four days, we biked to both of those. We biked in the valley. And he basically explained: four wines made from the same grape, one hill, the other hill, down in the valley, in the heart of the valley. And I would naturally assume they would all taste identical. And of course, they were completely distinct. And my mind was blown. And in tech speak, I kind of had my bit-flip moment. And then you spend a week biking around Tuscany, and it's magic.
And so anyway, I get home, and then over the course of the next couple of years, two bottles in the basement became 20, became 200, became 700. My day job is dealing with mining and basically crowdsourced big data for software reliability. And I'm keeping this growing cellar in a spreadsheet. And all respect to the Excel guys. I worked on Office, I worked with them. It's a great tool. It's a crappy database. And so I had a picture in my head of a proper, normalized database that would let me keep track of my collection and also wines that I had tasted. And so CellarTracker was born when I was on sabbatical in 2003, literally as a tool for myself. I happened to show it to two friends who immediately said, I want to use that. Which I was like, oh, I hadn't thought of that. I got them on there, and then I realized overnight, three people could be 300, 3,000 or whatever. And now, 22 and a half years later, I'm still struggling to catch up.
A
I sort of see this as you're sitting on one of the richest, I guess, very personal data sets on wine drinkers anywhere. And so, you know, when you look at the trends in CellarTracker over time, can you just give us sort of a general gist? What would we be surprised about, what are you surprised about, in how people drink, buy, and, you know, in this very interesting way that you also have it, think about wine?
C
Yeah. So first I'm going to step back and say one thing. The core of CellarTracker is a tool that collectors use to catalog and manage their cellar. So at this point there's more than a million people registered, there's hundreds of thousands actively tracking more than 200 million bottles. And then the byproduct of that is they generate a lot of really interesting data about when to drink, 13 million-ish reviews, and what they have paid, and all sorts of interesting things that, especially now in a time of AI, are really fascinating in terms of extracting really useful insights for those collectors themselves. But I'd say the biggest misconception or sort of unknown fact about CellarTracker is actually the way that most people use it, about 8 million a year, is not to catalog their cellar, but to go and search and research all the reviews. So it's important to sort of frame it in that there's a relatively small, albeit very authoritative and engaged and passionate audience that's generating a lot of data. And part of our goal is to figure out, okay, how do we take the wisdom of a few and make that really useful for a much larger audience? In terms of the trends, things that surprise me, I mean, first and foremost, if I look at, again, just a few hundred thousand but very engaged people, and I just look at the sheer value of what they buy and are cataloging in their virtual cellars each year, it is every year a billion to a billion and a half dollars of wine. We're not a commerce tool or a subscription tool. We don't really participate in any of that stuff. We don't try to sell wine to people, but they are there and they are buying a lot of wine. So it's 25-ish billion in total over time. So it's just like, wow. I would say the next thing that's shocking me is just that there is such a diversity of opinions. Actually, it shouldn't be surprising. I would say the most interesting thing about wine is that it is entirely subjective.
There are so many incredible wines from so many incredible parts of the world. And I'd say early on we were much more US-centric. We have not yet created localized versions. So there's obviously some selection bias in terms of who signs up and uses this. It's a lot of Americans, although half our traffic is outside of the US, but probably two thirds of the collectors generating the data are in the US. But people just have incredibly broad and diverse tastes. I hired my first data scientist a few years ago, so I would love for us to ultimately figure out how to do more research to understand what is the evolution of a drinker. I know mine personally, where I started very heavily in Washington and California. I'm based out in Seattle, and Washington state's the number two wine-producing region in the country. And now I'm very heavily biased towards French wines, Italian wines, German wines. But everyone's journey is different, and that's part of what makes wine so exciting. I guess I would say the last thing is I am stunned at the sheer amount of time that people spend on the site. There are users who literally, again, amongst those people collecting their wines, are spending more than a thousand hours a year aggregated on the site. I mean, that's a very few, but there are some doing that. And we're not super sophisticated about looking at the metrics, but mostly we're trying to look at, okay, who's engaged, who's coming back on a daily, weekly, monthly basis. But I would argue there's probably a thousand more interesting discoveries to be found in all the data, if we had the right way to do that.
B
Well, yes, when I listen to you, two things. One is I'm one of your customers. Actually, you mentioned something I think is incredibly important. I think it's important these days for particular reasons. I would love for listeners to actually pay attention to this. You're saying wine is completely subjective.
C
Right.
B
But that actually creates an enormous opportunity for us to test all these AI tools, right? Because how well do they do? Have you tried to use whatever, all the machine learning, deep learning tools out there, on your data to see if anything is predictable at a personal level? For example, did it work for yourself? Or, like, if I want to use my data, I have some data on CellarTracker too, can you predict my own wine-drinking behavior with any confidence, based on what you currently know?
C
So we're trying to do, I would broadly say, recommendations in a number of different ways. We don't want that to be obtrusive for people. We're not trying to be a commerce tool, but we have tried to use machine learning to sort of find, okay, who's your digital twin? Right? Like, I think in the ideal world, I would like to know who are the people in the community who are passionate about some of the same regions and varieties that I am. My personal thesis on wine sort of engagement in collecting is, okay, you find a region, you find a variety, you dive deep. And then mostly I tend to focus on producers, less specific wines, less very specific vintages, because there's so much variance. But I generally found if there's a producer that I appreciate, there's a very good chance if I like one of their wines, I'm going to like various parts of the lineup. And so we've played with recommendations around trying to see, okay, who's your digital twin, who has a cellar or is buying wines that are similar to the things you're buying? And then again, it should be pretty straightforward and basic stuff. Are there things that they're experiencing that just don't show up in your data set? And therefore, maybe those are interesting for new discoveries. I think the other area we're really trying to dig deep into is just really answering really, really basic questions. There's 5 million wines in our database. You know, any given user's probably only got data on hundreds to maybe a few thousand wines. So when you land on the page for any wine, our first problem we're trying to answer with AI is, will I like this wine? Literally just based on the things I've tasted recently, based on both my public and private notes, based on what I've purchased recently, can we augment the standard model with this extra information and then metadata about the wine, and then see if it will come back and basically make a value judgment on, am I likely to like the wine?
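[Editor's sketch] The "digital twin" idea described above can be illustrated with a very small example. This is not CellarTracker's method, which is not described in the conversation; it is a minimal sketch that assumes each collector is represented by the set of producers in their cellar, uses Jaccard similarity as a stand-in for whatever model is actually used, and invents the names `jaccard`, `find_twin`, `discoveries` and the toy data purely for illustration.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of producers two collectors have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_twin(user: str, cellars: Dict[str, Set[str]]) -> Tuple[str, float]:
    """Return the other collector most similar to `user`, with the score.

    Assumes `cellars` holds at least one collector besides `user`.
    """
    mine = cellars[user]
    others = [name for name in cellars if name != user]
    twin = max(others, key=lambda name: jaccard(mine, cellars[name]))
    return twin, jaccard(mine, cellars[twin])


def discoveries(user: str, cellars: Dict[str, Set[str]]) -> List[str]:
    """Producers the twin holds that never show up in the user's own data."""
    twin, _ = find_twin(user, cellars)
    return sorted(cellars[twin] - cellars[user])


# Hypothetical toy data: collector -> set of producers in their cellar.
CELLARS = {
    "alice": {"Producer A", "Producer B", "Producer C"},
    "bob": {"Producer A", "Producer B", "Producer D"},
    "carol": {"Producer E"},
}

if __name__ == "__main__":
    print(find_twin("alice", CELLARS))
    print(discoveries("alice", CELLARS))
```

Comparing at the producer level rather than the individual wine or vintage follows the thesis stated above; a real system would presumably also weight regions, varieties, purchases and tasting notes.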
And by the way, one of the things we found is it is really difficult to get these models to not say something nice. They always want to say, well, you know, there's some redeeming quality here. So we have to go out of our way. And by the way, the first wine, all respect to everyone and the wines you love, you know, I'm more Old World. When we started rolling out this tool in beta, like, 90% of the people who were trying to use it, here's a surprise, or maybe it's not a surprise: see if you can guess what wine they used first to try to test it.
B
California wine.
C
Yep. Caymus. Everyone put in Caymus because it's such...
B
Really. Caymus is good.
C
It's big, it's powerful. There's a little residual sugar. People love it. People hate it. And so I drink mostly Bordeaux, and the model at first seemed to really be convinced I would like Caymus. And I was like, guys, all respect. Like, you know, I mean, it's not what I seek out. So they figured out how to get it to at least differentiate that. But by the way, what I found more useful is we have it sort of make the value judgment of, you know, what's the likelihood you'll like this? But then we have it give, like, a two-sentence summary to explain its logic for why it thinks I would or wouldn't like it. And to me, those two sentences, even if it kind of gets it wrong, almost always make it clear. It's like if I read a Robert Parker review and he's talking about thick, fat tears flowing down the side of the glass, and he's talking about a 15 or 16% California Cabernet without any hint of olive or leaf or stem or any vegetable flavor. All respect. Like, I like Bordeaux because it's earthy, it tastes like tobacco, and it's Cabernet, but it's a little leaner. I don't care about the actual alcohol percentage, but I want a different flavor profile. So I can, in the two sentences, often apply my own bias. So anyway, so that's one. But actually the thing, and this is new, we have a "summarize this wine," and it does almost like a WSET-style description of the wine.
B
Oh, that's nice.
C
Around the producer, the vintage context, it looks at the notes from the community, both in the current vintage and nearby vintages. And for someone who wants like a one-pager, which, I'm a textual learner, it's perfect for me. Like, I read that and I'm like, oh, this is awesome. I understand exactly if I'm curious and want to dive into this or not. I would say the other thing is, from a user experience perspective, in our app we're only exposing a lot of these AI tools to paying customers right now. Number one. So CellarTracker you can use for free. And if you pay, you get a bunch of extra data and functionality, including the AI tools. And it's in what I call AI jail. You know, in every app or website you see like a little button floating in the lower corner, and you click it, a little window comes up. Floating action button, we call it. And you know, that's where you can get to these prompts. And in the next generation of this, we're going to start to make it more native. When you go to a page for a wine, frankly, a bunch of the content above the fold, that's top and center, will be like a short summary of the wine that is AI-generated from our data and the broader set of data, and shown to everybody, not just paying users. The idea being we just want to help you, sort of in a nutshell, figure out, do you want to keep scrolling and reading?
A
Maybe this is my own bias coming out just because of how much I love wine. But you know, recommendation engines can be so powerful. But I always see wine as, just as you said, subjective. It's so personal, it's cultural, it's emotional. How do you keep that human element? I was on an unnamed website and, like, I had put in a couple of bottles that I'd recently liked. Like, it was, you know, an Opus or a Sea Smoke or I think it was maybe Cirque or something. And it just automatically popped up that I should buy Caymus. And I'm on the side of I don't like Caymus. So it just sort of was, like, looking at a price point and, you know, a name brand that people know, and it just spit out that that's what I should drink. So how do you keep that sort of human element and not just have, here's a price point, here's a name of a wine that a lot of people know, and that's what you should drink if you're one of those drinkers?
C
Yeah, I guess. I mean, part of it also is you're talking about another company, another product, another tool. We have the advantage that our business model is focused not around commerce at all. You know, in 22 years, I've seen a lot of wine technology things come and go, and I've made very conscious choices about the areas I was focused on and the areas I wasn't focused on. And I was a software nerd. I worked on building Microsoft Office for 12 or 13 years at Microsoft. So I started around a productivity tool that has turned into at least a data set, and hopefully now we're trying to evolve the tool to be around research, discovery, exploration, and just trying to bring people in with no agenda. So if you're trying to sell wine and one of the most commonly sold wines is Caymus, it's going to be hard to not have a tool recommend that. I guess the other challenge of wine is it's a super long-tail problem, which is really hard from a commerce perspective. It's super exciting from a discovery, exploration, you-ought-to-try perspective, at least getting people to read and think about other regions, other producers. So there's a lot of Caymus on CellarTracker. There's an amazing amount of Veuve Clicquot on CellarTracker. I mean, again, for people who love those, they're great wines. I think Veuve is remarkable given the amount they make. What a wonderful quality product it is. But there are so many wonderful unknown sort of grower Champagnes and things out there with different styles and so much diversity. So, you know, part of what we have to try to do is expose people to that. So it really depends. You know, if the model is biased towards what's popular, or that's the question the user wants to answer, then you're going to get a certain sameness. If people are looking for, hey, show me something new, or help me find that hidden gem, then that's the trick.
I think the challenge is having enough data to do that in a way that's actually real and statistically significant, et cetera. And again, I'm not a data scientist, I'm not even a computer scientist. I studied computer science at Harvard, by the way, but my degree is in history, so I've stumbled my way through software for the last, whatever it is, 35 years.
B
Oh, that's even better. So you can study the history of wine, which is very, very long, and that gives you a lot more data. Speaking of data, let's talk a little bit more about the data quality issue. Because no matter how fancy your AI tools or deep learning tools are, they just really can't go beyond what the data can tell them. So particularly as a CellarTracker user, I think I mentioned this to you before, I identified a problem with myself in terms of keeping, you know, high-quality data. When I buy some wines, I will record them, as many collectors will. I'm kind of a small collector myself. But here's the problem, though: when I consume bottles of wine, or friends come over and we take out bottles and we drink, then I just forget to remove them. People are much more diligent about recording than about taking out. And sometimes your family members take out the bottles. So there's all these things.
A
Yeah, they're sneaky family members.
B
Right. You should lock your wine cellar always. But anyway, so there are other problems. People misspell names, all kinds of other stuff. What do you do there? Do you do anything to try to make the data a little bit more reliable, which is obviously good for everyone, or do you just let the behavior be whatever it is?
C
Well, okay, so when I think about data quality, I'm going to mention three different issues. One would be, you know, what you led with, which was just sort of you cataloging your collection, and is that up to date or not? Because again, at the core of CellarTracker, it's a productivity tool. So anyway, we're trying to figure out how do we make the tool, you know, even just with the data you have, even if it's a little out of date, more useful for you, and make you feel like, you know what, maybe I get a snapshot of that in there, and then still the kinds of recommendations and insights it can yield for me are still super duper useful, even if everything isn't up to date. Then there's just a sheer software productivity perspective. How do we lower the friction? You can take a picture of the label, you can take a picture of the barcode, but what can we do to just make it so much easier to just get the stuff out of your cellar? I will say we have played with AI together with computer vision. If I were building a tool like Vivino or something else today, for example, would I be using traditional computer vision, or would I be using an AI model to go and look at the label and figure out what it thinks the wine is relative to a data set of 5 million wines? And so we've actually put something like that together as sort of a fail-safe. We use Vivino via their API; we partner with them. They don't get it right all the time, or sometimes they just don't return a result. So we take that same image and we turn around and we give it to an AI model, and we actually find it does a really, really good job. It's a little slower. So, you know, it could be that that's the future of label image recognition. But, you know, imagine you just take your empties and you stack them up. You know, if you could just wave the camera by it and it's like, okay, I got them. You know, the universe of your cellar is dramatically smaller than the universe of 5 million wines, right?
And so it should have a pretty good idea, from all the images we have on all of those wines, which they probably are. And it can just basically give you the list of, hey, we think these are the ones you just drank, so just check them off and remove them. Bam. Right? So that's the productivity tool. Quality number two: just the wine database itself. It's 5 million wines. We grow. We actually get a thousand per day added. It's been growing slowly and methodically for 22 years. I've got a team now of five who help users create new wines. A lot of users create them themselves, but we also have experts who will do that and/or who are curating. There's so many people who are so diligent. And so if they see duplicates, we try to make it easy for them to report those to us. And so we are constantly merging, coalescing, cleaning. There's a core search problem also, in that even if the database is a little messy, there's a lot of signals in there around just the sheer amount of user data. So if someone searches, we're trying to bias to hand back the wines that are more populated, versus someone created an erroneous wine definition and didn't attach any bottles or any reviews to it. Probably you don't really want to return that unless there's just no other match. Third thing would just be more malicious. So think like Amazon reviews, because Amazon has a commerce motive. When you go and read those reviews, there are very good reviews, and there are whole scam models built around getting fake reviews up onto Amazon, et cetera. And we've had teeny problems with maybe, say, an occasional importer or a winery going and posting fake reviews on CellarTracker, rarely maliciously. Mostly it's just more people being confused. We ask wineries not to review their own wines on the platform, just for integrity. So the more successful we get, if we ever are a significant driver for commerce, you have to assume that's going to happen.
And so you need to threat-model all that and think about, okay, how do we filter that out? How do we rely more on the reputation of people posting reviews? And generally, in the rare cases when people have posted fake reviews, either very positive or very negative, it sort of sticks out like a sore thumb. And our users report it to us. It happens so infrequently, it's stunning, but it will happen if we're ever more successful. So anyway, those are a few different dimensions that I think about it on.
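[Editor's sketch] The search bias mentioned above, handing back well-populated wine records ahead of empty or erroneous ones, can be sketched in a few lines. This is an illustration only, not CellarTracker's ranking: the `WineRecord` fields, the token-overlap `text_score`, the logarithmic activity boost and the sample records are all assumptions made up for the example.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class WineRecord:
    name: str
    bottles: int   # bottles attached to this record by users
    reviews: int   # community notes attached to this record


def text_score(query: str, name: str) -> float:
    """Fraction of query words that appear as whole words in the name."""
    q_tokens = query.lower().split()
    name_tokens = set(name.lower().split())
    if not q_tokens:
        return 0.0
    return sum(t in name_tokens for t in q_tokens) / len(q_tokens)


def search(query: str, records: List[WineRecord]) -> List[str]:
    """Rank text matches, boosting records with more attached user data.

    A record with no bottles or reviews keeps its plain text score, so it
    is still returned, but only below any populated match.
    """
    scored = [
        (text_score(query, r.name) * (1.0 + math.log1p(r.bottles + r.reviews)), r)
        for r in records
    ]
    hits = [(s, r) for s, r in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [r.name for _, r in hits]


# Hypothetical records, including a misspelled duplicate with no user data.
RECORDS = [
    WineRecord("Example Estate Chianti Clasico", bottles=0, reviews=0),
    WineRecord("Example Estate Chianti Classico", bottles=1200, reviews=340),
    WineRecord("Other Winery Brunello", bottles=50, reviews=10),
]

if __name__ == "__main__":
    print(search("example chianti", RECORDS))
```

The logarithm keeps a hugely popular wine from drowning out every other match, while the empty duplicate still surfaces when nothing better matches the query.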
A
You know, thinking about the dimensions, or that word "dimensions," got me thinking just about the dimensions of how we sort of describe wine. I mean, you want to talk about, like, colorful language: wet slate, bruised pear, forest floor. You know, I mean, all these crazy words, right, that we don't really use to describe anything else. So have you all sort of explored using natural language processing or AI to really understand and sort of categorize this really, really very unique language of wine?
C
So to me, wine's greatest strength, its incredible diversity, the fact that it is entirely subjective and everyone can choose their own journey, also creates such a barrier to entry. It just, it scares people. They don't know where to begin. Until you've really dove in and tasted some wines or had sort of a seminal experience, it's just a scary topic for people. And so people aren't really sure, do I like this? Am I supposed to like this? How do I talk about this? And our aspiration is, how do we just bring more people into wine and help them experience the incredible joy and endless diversity, to the extent that they're curious and want to? Thing two, I would say, is for people who are very into wine, people describe wine very differently. And I'll say, very broadly, there's two schools. There's what I would call descriptivists, like the people who are popping out a million different adjectives. This tastes like this and that and that and that, like Miles from Sideways. Right. And then people who are more structuralists, where they are commenting more on the general approachability and textures and broader descriptors, and less specifics. And I would say the descriptivists are scary for people. Like, oh my God, how do you taste all those things in there, right? And it adds to the stage fright of, should I say my opinion out loud? Should I write a note publicly? Will people mock me? Will they disagree with me? Will they think it's too simplistic? I think everyone, I just want them to have the confidence to answer the most basic question. Did I enjoy this? Was it great? Was it meh? Or, I actually really didn't enjoy it. And then the more they can understand why, or we can help them understand why, then you're on your way, you're on the journey.
B
You talk about whether people like their comments to be seen by others. And everything's very personal. And I think the whole thing is we want to be as personalized as possible, but at the same time, there's always a data privacy issue that needs to be balanced. So just broadly speaking, for CellarTracker, what are your policies, what have you done in terms of protecting people's data privacy? Because there could be information leaked there, even if not intentionally. So I just want to see what general methods or principles are in place.
C
Yeah. So I would say when it comes to privacy, specifically on CellarTracker, first and foremost, we try very hard to make sure people understand. Look, if you put a collection on CellarTracker, number one, your username can be completely anonymous. No one has to know the user's name or email, et cetera. That doesn't have to be public. Number two, they can make the collection entirely private. Because I think that's important, right? People don't necessarily want to go and put their entire collection out on the Internet. Then when it comes to people expressing their opinions in notes, I will say CellarTracker's key differentiation as a platform has been around community. I really sort of saw the potential value of a community going back to the earliest moments in 2003. So I did make it so that if you wanted to post a note, there's private note fields. But the most attractive place to post a note, where you can have a date and a score, et cetera, is public. So I intentionally was like, look, if you want to use this, it's going to be public. And I didn't want a simple on/off switch. And the fields where you can store private stuff are a little less attractive. Right? You get one private note per wine. So, you know, I think that was a good bet. You know, we've created the platform that has gotten this flywheel of a lot of great user-generated content. Beyond that, it just comes down to just physical security, trying to do best practices around password security. And we try to be very transparent about how people can control their data. Obviously, if someone wants to be removed from the platform, we delete everything, even though there's a lot of stuff in our license terms that would allow us to keep people's reviews. And we just are like, of course you want to be removed, boom, you're gone. And we want people to control their data.
And then the last thing I would say is we try to use the data in the most above-board, transparent ways possible. All of our focus on these models is around how do we take the insights from our community and make them useful for the users themselves who generate the data, first and foremost, and then hopefully useful for the broader community. The phrase I use to the team is "no creepy," and then the last word's a curse, right? Like, we just try to be different from a lot of other platforms. Like, don't do anything creepy with users' data. Let's be just above board. And you know, we're trying to be very consumer-centric.
A
Beyond privacy, you know, there's so much, I think, sort of pressure with wine. I mean, you go to a restaurant, there's always that classic movie scene, you know, where the guy takes a girl out on a date and he doesn't know what bottle of wine to order, so he doesn't order the cheapest one. But he goes like, you know what? I can't remember what it is.
C
People get the second one from the bottom.
A
Yeah, the second cheapest or whatever. There's so much pressure around it, and I just think with user experience, sort of part of what is so cool with the idea of CellarTracker is you can start to sort of really build. Is there ever sort of a future for CellarTracker where it would be integrated into wine-buying websites, where, you know, you'd integrate that all together?
C
I would say, you know, around research and discovery, like, you know, the pivot we're partly trying to make is to have CellarTracker be more useful for people who don't necessarily just want to catalog wines in the house, but for any out-of-home experience, whether I'm at a retailer or at a restaurant. How do we help people, again, answer really simple questions? Is this a wine I'm going to like? Is this a fair price to pay? Is it going to match well with whatever food I'm about to order, if I'm at a restaurant, or what I'm intending to make for dinner, et cetera. So internally we're working on a feature called Restaurant Hero. And I don't want to overpromise, because it's a particularly hard problem, but we're hoping by the end of this year to have the beginnings of a feature that lets you walk into a restaurant and will have it suggesting three or four wines from that list, maybe even crawling the restaurant wine lists in advance and knowing, when you've walked in the door, if CellarTracker's running: oh my God, you're here. Here's the list, here's the wines that are a good deal that you're probably going to like. So we can make this so much easier. And part of the other thing is, either there's the wine intimidation factor, or even if I know a lot about wine, like, I would love nothing more than to geek out on the wine list and talk to the somm. My family or my friends are like, you know, enough, right? Like, order the bottle of wine. I'm thirsty.
A
Yeah, no, I was even just thinking, you know, Xiao-Li and I actually always talk about Last Bottle or Last Bubbles. I actually think, Xiao-Li, I think I'm gonna take credit for showing you the Last Bottle app. But, you know, I've always wondered how cool would it be if you could get an alert when there's a sale on Last Bottle that matches your taste profile, based upon what's in your own cellar. It just sort of seems like the world is endless in terms of matching CellarTracker with so many different tools.
C
I don't like to throw anyone under the bus, but I would say, look, commerce has come a long way in a lot of categories over the last 25 years and wine still feels like it's circa 1999. And I will just say, I have a lot of wine, and I only buy from a few retailers and a few wineries. When I look in my deleted items folder at the end of each week, there's a hundred messages in there. And generally the way the industry works is just hammering people via email again and again and again. And it's either like, here's this one truly special wine, oh my God, you have to buy it, or it's here's the thousand things we want to sell. And I'm like, I don't even know what to do with it. Right. And so I would love a world where we are ingesting all of that and running it with the user's control on our platform, where we care deeply about privacy, not exposing any of that to any retailers through these models. And coming back with, here's the hot list of what's being offered right now that's actually relevant to you. And so I think, I don't want my business to be based on commerce, but my goodness, I would love to disrupt wine commerce in a way that just makes it easier for people and helps connect people with the great wines they want to drink.
B
Well, that would be really fantastic, not only for the wine lovers or for the public, but really for AI and data science itself. Right? Because if we can really personalize in the wine space, we probably can do well in some other spaces as well. Right? Because as you know, wine has so many variations, so many factors. So if we fast forward 10 years, maybe even 5 years, how do you see AI, and analytics in particular, changing how we collect, discover and experience wine? Any predictions, any thoughts you have, visions?
C
I know AI is going to help us do a much better job, I think in a consumer-centric way, to make wine research, exploration and discovery more approachable, make commerce easier and better, bring more people into the category. And I think it's hopefully going to help the people on the commerce side of the business, and there's many elements to that, do hopefully a better, more consumer-centric job. And then finally the wineries and vineyard owners, who have an incredibly hard job in a time of potentially decreasing demand and really difficult agricultural conditions. I was just down in Santa Barbara last week and, you know, was visiting a vineyard that had been very heavily impacted by smoke. And when they have to throw away, you know, an entire chunk of a vintage, like, it's just devastating. So, you know, if nothing else, I hope these tools are going to help them as well. And then you get down to technology around clonal selections of wine and yeast and how you make adjustments so you can still be making products that are more palatable even when the environmental conditions are changing. So I think AI is going to change the world in this space for the good, period, and in a lot of dimensions.
A
We always wrap up with a magic wand question, and we actually had so many that we each have our own. We have two magic wand questions. So I will ask mine and then I will let Xiao-Li ask his. So mine is, if you could get a case of any wine in the world that you do not already have, what would it be?
C
Any wine in the world that I do not have, one case. You know, it's a wine I've only had a couple of times. I would say Petrus, just a truly remarkable Merlot-based wine. Or actually, no, no, no, back up. Le Pin. L-E, space, P-I-N. I've tasted it exactly once, in the vineyard, actually, with the vineyard owner, after a conference in 2010 held in Bordeaux, the Master of Wine symposium where I was a speaker. And yeah, I can actually still taste and smell it now. And that was 2010, so 15 years ago. So sorry. All respect to the Petrus guys, it's Le Pin.
B
Well, Eric, you remind me that we probably should invite them to join Vine to Mind next year.
C
Indeed.
B
But my question is not really a magic wand question. This question has come up recently among my wine friends and others. It's a way to characterize the wine you truly love or you truly want. The way it's framed is: what is the last wine you want to drink?
C
The last wine I want to drink. I mean, here's the challenge. I'm hoping I don't leave the world for a while. And I suspect between now and the time I leave this world, my tastes will shift. But I've been in, not even a rut, but I would say the area that I've really appreciated the most over the last probably 10 years is the southern Rhône Valley. In particular, Châteauneuf-du-Pape.
B
So.
C
And I've got a very clear favorite producer, and that has continued to be the case, actually, going all the way back to 2007. And that is Domaine du Pégau. P-E-G-A-U. And so it would be a bottle of Domaine du Pégau. Any vintage is good. I love a Muran. And you're out for a $50 bottle of wine. To me, that's always my happy spot. It literally, like, I know a wine is good if I pull the cork out, pour a glass, and on that first sip, I involuntarily go, "Yeah!" Then you know it's good. And it happens most of the time. Right. It's just something about it. There's an energy.
A
For our listeners, there was a fist pump in the air there.
B
So that's a fantastic endorsement.
C
Yeah. So it's Châteauneuf-du-Pape, and there's many that I love, but that's...
B
I love the guy. I was asked that question several times. I was saying, yeah, what's yours? Honestly, I don't know, because things are evolving, as you said, and I have too many things I like. But as Liberty said, we certainly have a lot more questions, but we probably should save these questions for over some bottles next year, when you come to Vine to Mind. I hope you will join us again. And for our listeners, if you are interested in joining the Vine to Mind symposium, next year it will be at UC Davis, at the Mondavi Institute, from May 18th to 21st. And for those who are wine lovers, you probably know the significance of that week. We're going to celebrate the 50th anniversary of the Judgment of Paris, which happened on May 24th, 1976. But with that, I want to thank Eric again, both for your wonderful presentation at Vine to Mind and for sharing with our listeners today your experience, your journey, your vision. And in the end, I think we all understand that wine becomes so enjoyable, so powerful, because it's all about sharing.
C
Exactly.
B
And I think that sharing experience is what brings all of us together. But I do want to remind all the listeners, you know, to always drink responsibly. The whole idea of wine is to bring happiness to your life, not to bring, you know, misery. So with that, again, I want to thank Eric and I hope to see you soon.
A
Thank you so much, Eric.
C
Of course.
A
I'm Liberty Vittert Capito, and on behalf of Xiao-Li Meng and our guest, thank you for joining us. And a special thanks to our producers, Rebecca McLeod and Tina Tobey Mack, and assistant producers Arianwyn Frank, Gavin Yang and Belle Riley. This was the Harvard Data Science Review. Everything data science, and data science for everyone.
Episode: Tracking the Most Intoxicating Data: A Conversation With Eric LeVine
Date: November 20, 2025
Guest: Eric LeVine, Founder of CellarTracker
Hosts: Liberty Capito (A) & Xiao-Li Meng (B)
This episode delves into the unique intersection of fine wine and data science through the lens of CellarTracker—the world’s largest wine tracking and discovery platform. Eric LeVine, CellarTracker’s founder, shares how his personal passion evolved into a community-driven, data-rich platform, and discusses how user-generated data, AI, and machine learning are transforming both wine appreciation and the wider field of recommendation systems. The conversation covers the challenges and opportunities of translating subjective human preferences into actionable recommendations, as well as the importance of privacy, user experience, and data quality.
On technological serendipity:
“My mind was blown. And in tech speak, I kind of had my bit flip moment.” – Eric LeVine [02:07]
On the scale of wine data:
“Every year, a billion to a billion and a half dollars of wine ... 25-ish billion in total over time. So it's just like, wow.” – Eric LeVine [05:19]
On AI’s limitations in taste prediction:
“It is really difficult to get these models to not say something nice.” – Eric LeVine [11:53]
“When we started rolling out this tool in beta, like, 90% of the people who were trying to use it... Everyone put in Caymus because it's such...” [11:54]
On user engagement:
“There are users ... spending more than a thousand hours a year aggregated on the site.” – Eric LeVine [05:19]
On wine’s intimidation factor:
“I think everyone, I just want them to have the confidence to answer the most basic question: did I enjoy this?” – Eric LeVine [23:25]
On ‘no creepy’ data:
“Let’s be just above board ... don’t do anything creepy with users' data.” – Eric LeVine [25:39]
On his personal wine ‘magic wand’:
“Le Pin ... I can actually still taste and smell it now. And that was 2010, so 15 years ago. So sorry. All respect to the Petrus guys, it's Le Pin.” – Eric LeVine [33:56]
On his ‘last wine’:
“...the area that I’ve really appreciated the most over the last probably 10 years is the southern Rhône Valley. In particular, Châteauneuf-du-Pape ... Domaine du Pégau.” – Eric LeVine [35:23]
(with a joyful fist pump)
The conversation is characterized by Eric’s humility, passion, and commitment to user experience and community. He emphasizes the joy of discovery and inclusivity in wine appreciation, while candidly sharing the technical, cultural, and business complexities involved. The episode invites both wine novices and data enthusiasts to appreciate how personal taste is becoming part of the new frontier of personalized, responsible, and user-empowering technology.