Jacob Mattson
Analytics topics covered conversationally and sometimes with explicit language.
Michael
Hi everybody. Welcome to the Analytics Power Hour. This is episode 300. This is Analytics Power Hour! Okay, sorry, that's just a dumb joke on the movie.
Michael Helvel
Okay.
Michael
Every time you turn around, I think you're hearing about the semantic layer. We even did a show on the topic recently. It's what AI needs to be successful. That's at least what we keep hearing from the vendors. And, well, honestly, that was right around the time we ran across an article that grabbed our attention: what if we didn't need semantic layers? Then we read an article by OpenAI. They published about how they're using AI to analyze data. And we took a closer look at a couple of vendor websites. We started seeing that context for AI isn't only a semantic layer thing and, well, we wanted to talk about that. So let me introduce my co-hosts. Mo Kiss of Canva, how you going?
Mo Kiss
I'm going great, thanks for asking.
Michael
Michael, I see you're down a remote on the wall there, so.
Jacob Mattson
Oh jeez.
Mo Kiss
I love when we give a visual in joke that no one else can follow.
Michael
Audio podcast. It's okay. Some we'll make a clip out of it anyways. Julie Hoyer of Further.
Mo Kiss
Welcome.
Michael
Glad to see you.
Julie Hoyer
Hello. Hello. Glad to be here.
Michael
Awesome. And go Browns. And I'm Michael Helvel. So naturally we reached out to the author of one of those articles, and I'm excited that he is our guest. Jacob Mattson is a developer advocate at MotherDuck, the cloud data warehouse built for answers. He has also held senior data and accounting roles at firms like Symetrix, Funko, and Vera Matrix. And today he is our guest. Welcome to the show, Jacob.
Jacob Mattson
Hey, Michael and Julie and Mo, it's great to be here. I'm super pumped.
Michael
Awesome. Well, we're excited to have you. So I think Jacob, maybe to kick off the conversation, I think it would be great for us to understand a little bit more both about your background and exposure to this and sort of what started formulating for you that led to you kind of digging in and doing research in this area around sort of semantic layer or not or other alternatives to semantic layers.
Jacob Mattson
Yeah, that's such a good question. I guess I'll start with a little bit of biography that got us here, not to be too self-indulgent. So when I graduated from college, I worked in public accounting. I sat for the exams, did all that stuff, and worked in public accounting and then at a company called Vera Matrix. I was out there doing all of the normal accounting things, climbing the ladder in a very specific kind of governed way, working for people with titles like controller or CFO and eventually taking some of those on myself. I think one of the things that we always talked about, especially on the financial side, was: how do we know what numbers are right for this definition of this thing? Right. And at the time, the tools we had were much, much worse than we have now. I remember it being a big deal when we got the Excel that could handle 1 million rows and not just 65,000 rows. That's, what, Excel 2003 maybe? Which of course, actually, all it did was train everyone to just have a horrible experience in Excel all the time and just be totally fine with it. It's like, it's just too much information, it can't handle it. And we all dealt with saving issues and, you know, crashing and all these things all the time.
Michael
Yeah, don't calculate your metrics until you're really ready.
Jacob Mattson
Yeah, exactly. Turn that automatic calculation off.
Michael
That's right.
Jacob Mattson
Yeah. Step one. Yeah. Step two, save it as a binary format Excel file. So I did all those things, you know, and had the pleasure of working on things like MDX and DAX along the way, which are both kind of the modeling languages that are built into the Microsoft stack. And along the way through that journey, I really kind of found my way towards using SQL for a lot of the work I was doing. And that just kind of naturally came out of the fact that the data I had was too big for Excel and it was too complicated. I was just driven 100% by the business need for solving these problems. And I needed to get more robust tooling. And we had SQL Server, and it was running on a server that had more compute than my laptop and all these things. And so it was very natural to kind of progress up there. And so I've done lots of fun things kind of in that space. I kind of like to joke that I always worked in data from the beginning of my career. It's just my pipelines ran once a month. It was a month-end close process, for those of you at home. Anyways, that's kind of how I got there. And so you do a lot of things along the way and you see lots of errors along the way too. Some material, some not, right? And I worked on the IT side at a public company, and you see lots of interesting things produced internally that never make their way into the filing documents, for example, sent to the SEC. So I think for me, some of the genesis that led to this notion of "do we need semantic layers anymore?" was two things. A, working in accounting for a long time and understanding the quality that goes into those numbers, which is very high, but also not as high as you'd like, is what I would say.
And then the second part of that is that we often, at least what I would see on the accounting side, would get too precious about the exact accuracy and precision of a number instead of actually moving the business forward. Right. And so what happened in my career, at least, and I wasn't that close to it early in my career, but what it felt like is that CFOs in particular completely abdicated the realm of analytics to some other domain. Like, you know what, we can't get accurate enough, it's not good enough for whatever reporting we're building. We're just going to let some other part of the org handle that. In fact, I remember even seeing job descriptions that were like "Director of Analytics, non-finance" type of roles. I experienced that when I was at a company growing really fast and was trying to build all these things. I built our first data warehouse very much from accounting principles first. And eventually it just got to the point where we had to break that apart, because we didn't have the right primitives to answer the questions in a way that we were comfortable with. We tried all the things and it was just really hard to manage and to update. So a little bit of this idea is me manifesting: what if I just took away all the pain that I experienced when I was building these analytics cubes back in the day? What if we could just ask AI those questions and it would reformulate those on the fly? And so that was kind of what led me to exploring the idea and then beginning to do research around it.
Mo Kiss
Can you tell me a little bit about the tension that you kind of touched on but didn't go deep on? So I mean, we talked about this with Cindy a little while back about semantic layers, and it's being sold at the moment as the holy grail, as every new, not actually new, idea in analytics is, that will solve all of our problems. But that core tension that you touched on, on how hard it is to maintain, the inflexibility perhaps that then makes you not able to answer your business questions, like, what were some of your lived experiences?
Jacob Mattson
Yeah, this is such a. Such a good question. I mean, I think, like, the first one was, I think we. So I was working on an ERP system and we were like, hey, we want to implement, like, better reporting and analytics on it. We're going to buy this software package that will just like, automatically build out all the OLAP cubes for us, and then we can, like, tie them into our BI tool. I think this was even like, pre Tableau maybe. And, well, tableau existed, but it didn't exist for the company I worked at. I think I would say it was certainly not something I was tracking at the time. You know, we bought the software. And then I was like, okay, now let's like, implement it. And it was like it had to fit in a very narrow box for us to actually take advantage of it. And that was a pattern I saw repeated a lot kind of in the ERP space too, which was like, hey, just like, you know, make your business fit into this template of how we run our systems. And then you get all of these awesome synergies or whatever, right? Like, now you don't need people to do your buying. You just, like, run this report and it tells you what to buy. And so what I kind of came to believe, I think I definitely wanted to say, all right, let's just apply this system. Let's apply these SAP primitives that are very old and well tested. Let's just apply this blindly to our business. Why are we overcomplicating it? Our business is not this hard. But then what I kind of discovered is that the interesting parts of your business are really hard to define in someone else's model. They end up being. Unless you're a pure commodities trader, the magic happens kind of in the margins. And so defining those systematically is super hard. I really struggled with that. And so when I kind of realized that that was the mental model that I was bringing to these problems, I started being like, hang on, how do I design this ERP system that we're working on? That was my accountability. 
How do I make it so that we can do the thing that we need to do to make the system work, but also allow space in the way that we interact with this so that the magic of the company, what differentiates the org, can still happen too? And so once I started thinking about it that way, that really kind of unlocked for me a way for us to move forward. And it was much less difficult to get people on board because it wasn't like, hey, we're going to totally change your job and make it so that it just fits into this box. It was more like, okay, how do we meet in the middle? And so I think there's a little bit of a paradox in that, right? The paradox is that you need some level of conformity across the organization for everyone to be able to communicate well. But also, if you have too much conformity, you have a commodity, and you need space to have some sort of differentiation. And so I think that's kind of the perspective I brought there. Figuring out which constraints, I kind of think about it sometimes like this game, Jenga. I don't know if you all played that, but you have a stack. Yeah, okay, perfect. So you have this stack of blocks, right? Some of them, you touch them and you're like, okay, I can't pull that one out. That one has to stay there. It's load bearing. But you only figure that out kind of experientially. So for me, I spent a lot of time just trying stuff and like, okay, you know what, that didn't work. The CFO just got really mad at me. Won't do that, but let's try this other path. And then at the end you have this beautiful tower, right? Hopefully you don't knock it over, but you have this beautiful tower. And that tower has a unique shape that hopefully actually represents what the business is. And so that's kind of how I think about it.
Michael
Tim, have you ever opened GTM preview mode and immediately thought, well, there goes my afternoon?
Michael Helvel
Absolutely. Nothing says fun like hunting through a giant pile of tags trying to figure out which one broke?
Michael
Yeah, that's why Stape built the Stape GTM Helper, a free Chrome extension for debugging Google Tag Manager.
Michael Helvel
And free means actually free. No signup, no subscription, just install it from the Chrome web Store and start debugging.
Michael
And it works with both web and server side gtm. It helps you focus on what matters by filtering down to the specific tags that you're testing.
Michael Helvel
Your tags from Google Meta and Microsoft are color coded so they are easy to spot. And it makes JSON payloads readable instead of, well, whatever they normally are.
Mo Kiss
Yeah.
Michael
Plus, for server-side GTM, it gives you better visibility into consent status, and it can help with Shopify checkout debugging too.
Michael Helvel
There's even a website tracking checker that gives you a web and server side tracking report with actionable fixes.
Michael
The Stape GTM Helper is a must have for anyone deploying or managing tags in gtm. Search for Stape GTM Helper in the Chrome web Store or use the link in the show Notes. It's free, it installs fast, it might just save your afternoon.
Michael Helvel
Michael, where does your best AI analysis live right now?
Michael
Oh, I've got this Claude conversation called GA4Help for this meeting I've got coming up. It's buried between a lunch recommendation and me asking it to explain regex to me like I'm a fifth grader.
Michael Helvel
Exactly. That's the problem. Your AI work gets trapped in one chat with one person in one thread.
Michael
Ah, yes, the modern knowledge base. I swear Claude told me this somewhere.
Michael Helvel
And that's why Ask why built Prism with memory and shared context across users.
Michael
So the useful stuff doesn't vanish into my private little AI cave. Exactly.
Michael Helvel
It's out of the cave, into the sun. Prism keeps the context, your metric definitions, source of truth tables, business rules, prior analyses, and makes it usable across the entire team.
Michael
I like this. So if I teach you that active user means three sessions in 30 days, Julie doesn't have to teach it again tomorrow.
Michael Helvel
Exactly. And if val runs a GA4 cohort analysis, that knowledge can live in Prism, organized and traceable, not locked inside her chat history like a tiny little analytics hostage.
Michael
I am starting to like this. Team memory, not "ask Michael because he remembers the cursed dashboard lore."
Michael Helvel
Plus, with Claude, Cowork and Prism, your analysis becomes shareable, auditable and ready to build on.
Michael
I like this. So the AI becomes company knowledge, not just some vibes I had with the chatbot at 11:42 pm.
Michael Helvel
That's right, because that's way after my bedtime. So go to Ask Y AI and join the waitlist.
Michael
And you can use the code APH and I'll take you to the top of the list. That's Ask, yes, the letter Y, .AI, code APH. Because your team's brain should not be trapped in one person's chat tab. Exactly this.
Mo Kiss
Everyone can see my face. Obviously not our lovely listeners, but everyone on the podcast. I love an analogy. My company loves an analogy. And I feel like this Jenga one I'm going to take way too far. Because at some point you do knock it down. Like, that's the reality of when we build data architecture and systems. At some point you do end up rebuilding. But I think the exact tension that I feel right now, and I keep banging on about, is that for 80 to 90%, we do need standardization. We do need some conformity, because otherwise, if one person over here calculates it this way and one person, like, we can't ever have a mature conversation. But I think the really challenging part is how we get that 10 to 15 or 20% that should be bespoke or is actually a unique situation. And I'm thinking those are the Jenga blocks that you push through and you can move. But you just have such a wealth of experience here. It sounds like for you, a bit of that was trial and error. If I want to learn from all your trialing and erroring, how do you figure out what the standardization bit is and where the flexibility needs to be?
Jacob Mattson
Oh, that's such a good question. You know, I think some of it is being able to zoom out and kind of understand what the engine is of the company. Like, how does it actually function? What's differentiating from your competitors? But then also, how does that build the feedback loop that ultimately increases the cash on the balance sheet, hopefully. Right? So, you know, I always felt like that was an advantage for me as someone coming from an accounting background, where it's very easy for me to visualize, okay, if this business is great at X, they will increase their cash flow. Right. And so I think a little bit is developing good intuition around that. And I'm very thankful to have grown up working for CFOs who were very excellent mentors as it related to that. But I think the second part of that is, your model for reality is imperfect. Right. And so you need to be able to test that model in a way that is sort of safe. And what I mean by that is, you don't get fired if you're wrong. You might get reprimanded. That's okay.
Michael
That's the threshold.
Jacob Mattson
Yeah, exactly. Like, you can take some risk, right? But you.
Michael Helvel
You.
Jacob Mattson
You want it to be the right risk. And so I think a lot of the trial and error part was just, you know, how do we take some calculated risk here that is not too drastic, but is opinionated in a way that means, if we're correct, we win more.
Julie Hoyer
Do you have an example of what that risk is? I don't know why I'm having a hard time conceptualizing the risky metric. Like, Mo, in the previous episode where we talked about semantic layers, I think you had thrown out the example of monthly active users. Is that a risky metric? Is that a metric that moves the business forward? Can we talk through a metric like that for a business like Canva? Like, I'm just, I think, I think
Mo Kiss
monthly active users is one that I wouldn't take a risk on. And that's because it's one of our company foundational goals. So, like, we have long historical reporting, but actually, like, where it does get complicated, right? And I'm going to give you a specific example is like, we often will look at monthly active users by different products and sometimes different people have different interpretations of, like, what that product, monthly active user is. And that's why there's, like, so much devil in the detail, right? Like, it's such a gray zone because someone might be like, oh, anyone that, like, used any kind of, like, video or social media, and some people might be like, well, it's only video if you did X, Y and Z. Like, you were in our video editor and you used advanced video features. And. And that's why it's like, like, it's just messy. Our jobs are messy.
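To make the "devil in the detail" concrete, here is a minimal Python sketch of two readings of a product-level monthly active user. The event names, users, and dates are made up for illustration and are not Canva's actual definitions or data; the point is only that two defensible definitions give different counts from the same events.

```python
from datetime import date

# Hypothetical toy event log: (user_id, event_name, event_date).
EVENTS = [
    ("u1", "video_editor_opened", date(2024, 5, 2)),
    ("u1", "advanced_video_feature_used", date(2024, 5, 2)),
    ("u2", "video_added_to_design", date(2024, 5, 10)),
    ("u3", "video_editor_opened", date(2024, 5, 20)),
    ("u4", "social_post_created", date(2024, 5, 21)),
    ("u2", "video_added_to_design", date(2024, 4, 28)),
]


def users_with(events, year, month, predicate):
    """Distinct users with at least one event in the month matching predicate."""
    return {
        user
        for user, name, day in events
        if (day.year, day.month) == (year, month) and predicate(name)
    }


# Broad reading: anyone who touched anything video-related in May 2024.
broad_video_mau = users_with(EVENTS, 2024, 5, lambda name: "video" in name)

# Narrow reading: opened the video editor AND used an advanced feature in May 2024.
narrow_video_mau = (
    users_with(EVENTS, 2024, 5, lambda name: name == "video_editor_opened")
    & users_with(EVENTS, 2024, 5, lambda name: name == "advanced_video_feature_used")
)

print(len(broad_video_mau), len(narrow_video_mau))
```

Neither reading is wrong on its face, which is why someone eventually has to pick one and write it down.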
Jacob Mattson
Totally agree. When I think about risk, I guess I would put it in a slightly different way. I would not necessarily frame it as analytics first, but I would just say that I left a job. And then like a month later, someone on the team who was still there sent me a message and was like, man, it has been rough since you've been gone. And I was like, why? And I'm like, everyone knows all the things. There was nothing interesting happening. He's like, well, no one's making any decisions. And I was just like, oh, yeah, okay, I could see that. And so when I talk about risk, I just honestly am like, make a decision, right? Be opinionated on what it means to have a monthly active user by product X, Y, and Z. Not "zed." Sorry. You know, there's lots of interesting things that happen when you start breaking those things apart, right? And one thing that I think I was well trained on because I was in accounting is you get really good at delivering bad news. One of the first objectives in accounting is, you want this to be true. You want the numbers you're showing to be a reflection of reality as you understand it, in a way that is defensible. And sometimes when you're dealing with metrics, especially with product teams, they want to show that their thing is working, right? It's a tension between the domain team and a central team, which is, okay, what is truth to the company and what moves the business forward. And I would almost always say that we want to be measuring things in a way that when they're tested against reality, they're proven to be right. And so when people are bringing agendas into things, like, hey, I want to define something in a way that says I get more monthly active users.
Well, if that's not moving the company forward, that metric, when it's tested, is going to fail. And so how do we do that? How do we test them more closely against reality? That's a really interesting question. And the way we do that is by potentially taking bets and making decisions on them.
Michael Helvel
Right.
Mo Kiss
Can I, Can I. I'm taking us completely down, off topic, as per usual. I just want to push on this a little bit because I do see this happen. And I'm curious if your experience in accounting has perhaps given you confidence or like, you've built the confidence to sometimes have an opinionated decision. Whereas I feel like often in data land, sometimes there is this, like, desire to debate every which way something can be cut and like, maybe, like, make a proposal, but, like, not often enough be like, you know what, I'm gonna have an opinion here of, like, we're gonna calculate it this way. Let's do this, let's move the business forward. Like, do you think sometimes like that, is that your accounting background? Do you think that helps you have that perspective because you're more willing to have a position knowing it's the best, the best of where you can get to? Or do you think that's like, you personally? Like, what do you think's driven that willingness to take a gamble and have an opinion?
Jacob Mattson
I mean, I think it's a little bit of self-selection. Part of why I liked accounting was because it let me do things like that, or gave me a framework to reason about those things. Right. I think I'm probably a little contrarian by nature. And so when I see people getting too precious about metrics, I'm definitely just like, let's make a decision. Let's figure it out and we will test it, and if it's wrong, we can fix it, right? One of the things that's great about analytics in general compared to, I don't know, financial numbers that you're publishing to your board or whatever, is that you have a lot more degrees of freedom in terms of what it looks like to go back and make something better. One of the biggest challenges in analytics is, hey, that number got put into our regulatory filings. So that's how we do that now. You cannot change that anymore. So obviously you don't want to take, I don't know if this happened to me specifically, but I'm sure. Well, actually, yes, it has. Where we made up some way to bin some set of data and then suddenly it was in, you know, annual reports, and now it's like, okay, now we're always presenting that. And if I had known, I think from experience I would have been like, hey, let's be a little more precious about this. And I think it's definitely a tough balance, but we don't need to be perfect, right? As long as we kind of know, you know, and it's justifiable and we can defend it, I think we can go pretty far. But, you know, there's risk, right? There's risk that you could be wrong.
Julie Hoyer
Do you feel like of all the context in a business, what percentage of it that people use day to day when creating metrics, defining metrics, doing their analysis, making decisions, what percentage of it is actually captured in a formal static semantic layer for broad knowledge compared to they're just doing it, adding in their own context and their own SQL queries and the way they're pulling the data.
Jacob Mattson
I mean, now that I work in marketing, it's a lot less than it was when I worked in finance. So it's contextual, I think. Yeah.
Michael
There are no generally accepted analytics principles, if you will.
Jacob Mattson
Yeah, yeah, sure.
Julie Hoyer
Yeah. I feel like it's a small percentage smaller than maybe people want it to be. And do you feel like people are always fighting to like make it as close to 100 as possible or.
Jacob Mattson
I think the tension is that everyone wants the risk to be low. Like, hey, if I'm going to use data to make this decision, well, then it better be right. The data better be right. And therefore I'm not going to use the data, because I don't want to take accountability for something produced by someone else. I want to take my own accountability. You know, I think that's definitely a core challenge. Do I see it moving towards 100%? I mean, I think if you're moving toward 100%, you're just automating the entire function, right? I guess that's like Google AdWords bidding, right? Okay, the whole thing's automated. The price is the price. In some ways that's the fully actualized form of analytics, like auction pricing. Do I think that's the right way to do it? I think most jobs are not that straightforward. I think that it's probably a smaller percentage than most analytics people would want it to be, and probably roughly around the right number for where things are today.
Michael
All right, I want to start to pivot into what we're actually here to talk about, which was, sure, let's say you've been struggling with the semantic layer and you're running into all the problems that semantic layers kind of introduce. You know, they're inflexible, not easy to pull together, don't work well across different departments and teams. There's lots of reasons why a semantic layer is a challenge, and lots of people spend a lot of time on it. But what are the alternatives? What are people doing to lower their dependency on the so-called semantic layer that is sort of the favorite of the AI world right now in data?
Jacob Mattson
I mean, I think ultimately what we're seeing a lot of right now is that everything is kind of going into the notion of skills, right? Which is just markdown on your laptop or in a GitHub repo or somewhere. And I think people are capturing a lot of context that way that they are ultimately using either personally or sharing inside their company. I think there's a lot of new surface area here for products. I know, for example, inside of Claude, they have this notion called projects. And projects let you kind of put markdown in there and then link it to other things. And then whenever you ask a question, you can select a project and then you bring context along with it. So I think we're seeing that. I think we're definitely seeing vendors coming along in the space. We're also seeing the perspective from the labs, right? The labs, who have unlimited tokens, are just like, oh yeah, we just use AI to do everything. We're just maximizing our token spend to solve these problems. I think the one that I saw that was killing me was, OpenAI has an analytics agent and they basically say, all right, well, we're just going to ask every question twice. So basically the user will ask, and then we're going to reformulate it and have our background agent just see if it gets the same answer. And if they're too far apart, we're going to escalate it. And I think certainly that's one approach. I can't imagine it's cost effective for anyone at reasonable scale who's not a lab at the moment. We're seeing lots of ways that people do it. I mean, from what we have been working on at MotherDuck, what we've started to do is just put context in a database, because of course we're a database vendor, so put it in a database. AI is really good at writing SQL. It's really good at retrieval, retrieving the right thing.
And so when you do that, then you can start treating it in a more structured way, whether that's more of like a graph that has nodes and edges that you can use to navigate across, or if it's just straight up comments on columns or whatever, which is an old part of the SQL spec.
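A minimal sketch of the "put context in a database" idea, using Python's built-in sqlite3 as a stand-in for a real warehouse. The table names (metric_context, context_edges), the metric definitions, and the helper functions are all hypothetical; this is not MotherDuck's implementation, and a plain keyword match stands in for whatever retrieval an agent would actually do. In a database that supports column comments, the definitions could live there instead of in a separate table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metric_context (
    metric       TEXT PRIMARY KEY,
    definition   TEXT NOT NULL,
    source_table TEXT NOT NULL
);
-- Edges let an agent walk from one piece of context to related ones.
CREATE TABLE context_edges (
    from_metric TEXT NOT NULL,
    to_metric   TEXT NOT NULL,
    relation    TEXT NOT NULL
);
""")

# Hypothetical definitions, for illustration only.
conn.executemany(
    "INSERT INTO metric_context VALUES (?, ?, ?)",
    [
        ("monthly_active_users",
         "Distinct users with at least one session in the calendar month.",
         "events.sessions"),
        ("video_monthly_active_users",
         "Monthly active users who used an advanced video feature.",
         "events.video"),
        ("net_revenue", "Gross revenue minus refunds.", "finance.invoices"),
    ],
)
conn.execute(
    "INSERT INTO context_edges VALUES (?, ?, ?)",
    ("video_monthly_active_users", "monthly_active_users", "subset_of"),
)


def find_context(conn, term):
    """Return (metric, definition, source_table) rows mentioning the term."""
    pattern = f"%{term}%"
    return conn.execute(
        "SELECT metric, definition, source_table FROM metric_context "
        "WHERE metric LIKE ? OR definition LIKE ? ORDER BY metric",
        (pattern, pattern),
    ).fetchall()


def related(conn, metric):
    """Follow outgoing edges from one metric to its related metrics."""
    return conn.execute(
        "SELECT to_metric, relation FROM context_edges "
        "WHERE from_metric = ? ORDER BY to_metric",
        (metric,),
    ).fetchall()


for row in find_context(conn, "active"):
    print(row)
print(related(conn, "video_monthly_active_users"))
```

Because the context is just rows, it can be queried, versioned, and updated with the same SQL the model is already good at writing.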
Mo Kiss
I am very much feeling the semantic world bubbling at the moment, and the pressure of it, and the perception that it's going to solve a lot of our problems, or the perception that it's required to do AI and data well. I think one of the points that you made that really resonated with me in the article was about semantic layers being static and how challenging that is. But I think the real thing that's keeping me up at night is, if we don't go down that semantic layer path, it's the validation, which you just touched on. Right. So the bit that's challenging that I see pop up constantly is, if we don't have somewhere for people to self-validate, I find that's a hard thing that I want to solve for, because I don't want my job to be to QA other people's shitty outputs and AI hallucinations with data, which it feels like I spend some time doing now. I guess what I'm trying to say is, at the moment I feel like it's binary: you either have some type of semantic layer thing where people can validate, or you go down the evaluation framework. Am I totally off here? Is it one of those binary options, or is there just a range of options and I haven't thought deeply enough about it yet?
Jacob Mattson
I mean, I think the first question is about the interface that you let users interact with, right? If they're interacting with a spreadsheet, there's a different set of constraints than if they're interacting with a BI tool, than if they're interfacing with a chat app, right? There's more engineering freedom, I think, on a chat application (which people have proven to love, just asking questions to a chatbot) than there is on a BI tool. I think that one of the biggest challenges in using even the best BI tool in the world at this moment is that it's very difficult to interface with something that someone else built. One of the things that I've really found when using AI generally is that the closer the framework you're using to answer questions is to a snap fit to your own brain, the easier it is to use it and use it well. When you're using someone else's model, even a really well defined model by a really good engineer or a really good BI analyst or whatever, it's like, that's their model, that's not your model. That's not necessarily how you're thinking about the problem space. And the biggest challenge that LLMs solve is they're really good at translating between languages. And that really means they're really good for me, as, let's say, someone working in marketplaces marketing, to ask a question and then have the LLM reframe it to be like, oh, I see, what this really means is this, and that translates to this language in your current model or whatever. And so I know I'm not really answering your question. I think there is a spectrum. I don't know if we know yet. I think the products outside of the semantic layer are not super mature, because the semantic layer buyer, I think, traditionally has been very risk averse. That's why they buy.
The reason you buy a semantic layer in the first place is because some number got somewhere and it was wrong and legal's pissed. That's how you buy a semantic layer. I'm being maybe a little too cynical, but only maybe. So again, I keep saying this word risk, but it all comes down to how much risk you accept, and that determines the spectrum of tools that you can implement. In marketing analytics, you can probably take on more risk than you can in financial analytics. That's just true, right? The cost of being wrong is way lower in marketing than it is in finance. Like, that's a physics problem. So I think you may not have the same solution across the entire company. It just depends, you know, horses for courses, as they say.
Mo Kiss
I suppose, just to be clear, I will keep asking Jacob 50,000 questions. I'm trying to make space for other people by not spe.
Michael
That's okay. And we edit out dead air, so don't worry about that.
Jacob Mattson
That's fine.
Julie Hoyer
I wanted to ask actually, Jacob, if you could talk about your proposed solution in your article that you talked about, like using AI, because everybody's obsessed with having a semantic layer to help with AI, but you kind of flipped it and said also AI could help with the semantic layer problem.
Jacob Mattson
So when I wrote the original article, what I was really thinking about at the time was the notion of skills and using skills kind of locally on your machine to kind of codify the way to go here. I think I've gotten a little more nuanced recently, but also I think that we've seen a lot of maturity and development, especially in the Anthropic ecosystem with projects inside of Claude, for example. I think a very natural way to think about it is: how do I make it easier to maintain, or actually create and maintain? And skills are incredibly, incredibly easy to create and maintain. Maybe too easy, right? They may not be the right abstraction, but certainly I think skills is the way that I thought about it at the time. And then I've been doing kind of evals on that front against benchmarking data sets. And we can debate the efficacy of benchmarks versus real life data and all these things. But I think what we do know is true is that if we can find the right context, we can very reliably return the right answer. And so where my thinking has gone recently is how do we make our data easy to search, right? Which is a different way than maybe we've designed it before as, like, a Kimball model, which is like, how do we make it easy to retrieve? You know, there's a whole bunch of really good research that those guys all wrote around how to make a data warehouse and how to make it work and easy to retrieve. And those were written in the constraints of the time, and I think the original Kimball book is probably going on 25 or 30 years now. We have better technology now and maybe we can revisit some of those assumptions. And so I think my current kind of position on this stuff is that we really have a search problem. And if we can find a way to make our metrics searchable, then the way that we define it may be less important. It might be a semantic layer that's in YAML.
It might be something that's more well defined than that. It might be something more programmatic than even YAML that is kind of like a SQL alternative. But what I have also found is that LLMs in particular are so good at writing SQL and understanding it that adding another language that is new and bespoke is really hard to get good results out of compared to just using the trusted good old thing that is very verbose and has weird syntax and has a whole bunch of downsides. But also there's 50 years of training data in the training set, right? So if we can figure out how to make it search to find the right thing and then write the right SQL, we can get really good results. And I think that's what I'm seeing is most promising at this moment.
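A minimal sketch of the "find the right context, then write SQL" idea Jacob describes. Everything here is an assumption made for illustration: the catalog entries, table names, and column names are invented, and simple word overlap stands in for whatever search a real system would use (most likely embedding or vector search).

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDoc:
    name: str
    description: str
    sql: str


# Hypothetical catalog; tables and columns are made up for this sketch.
CATALOG = [
    MetricDoc(
        name="monthly_active_users",
        description="count of distinct users with at least one event in a calendar month",
        sql="SELECT date_trunc('month', event_ts) AS month, "
            "count(DISTINCT user_id) AS mau FROM events GROUP BY 1",
    ),
    MetricDoc(
        name="net_revenue",
        description="invoiced revenue minus refunds, by product",
        sql="SELECT product, sum(amount) - sum(refund_amount) AS net_revenue "
            "FROM invoices GROUP BY 1",
    ),
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def search_metrics(question: str, catalog=CATALOG) -> list[tuple[int, MetricDoc]]:
    """Rank catalog entries by how many words they share with the question."""
    q = tokenize(question)
    scored = []
    for doc in catalog:
        overlap = len(q & tokenize(doc.name + " " + doc.description))
        if overlap:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


def build_context(question: str, limit: int = 1) -> str:
    """Format the best matches as SQL-commented context to hand to an LLM."""
    hits = search_metrics(question)[:limit]
    return "\n\n".join(
        f"-- {doc.name}: {doc.description}\n{doc.sql}" for _, doc in hits
    )
```

Calling `build_context("How many monthly active users did we have?")` would return the monthly active users definition as a commented SQL snippet, which is the piece an agent would then adapt rather than inventing a query from scratch.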
Mo Kiss
Can we separate search from the context? Because I feel like those are two very different things and we struggle with both. So search is more about like, how do I point it in the right place, how do I help it find the right table or whatever it is, right. Whereas the context is more, how do I understand this table? Okay, do you think that what you're proposing, the way you're thinking about this, solves for both equally? Or do you think perhaps, I feel like we probably need a different approach for each or different thinking.
Jacob Mattson
The answer is the devil's in the details, I think. I think where I'm struggling with this question is, if you assume that there is a natural language interface, right? And then maybe some sort of tooling like an MCP or something that has a search tool. Well, someone should solve the search tool problem, right? Which will say, hey, let me search and find this context and return it to you. And the next thing is writing the SQL based on that understanding. I think the second part is basically solved, which is if you give good context to an agent and say, describe a set of tables and then ask a question, you get really good results. And the challenge we have, of course, is that they don't have good memory. And so every day we have to remind them, so how do we make it so they have good memory? And there's a whole bunch of people I'm sure working on that problem. I've read quite a few papers in the space of how do we make it so when we ask a question, the next time we ask it, it's faster to get the answer, or we already have some kind of pathway built out. The metaphor I kind of think about is your data is like a jungle, right? Unless you really know where the things are, it's really hard to find stuff. We can build a semantic layer, but a semantic layer is like a highway, right? It's like we are just building the thing straight through the jungle. I would almost say we can build it a little more progressively, which is, hey, we need these little paths, right? Maybe these paths are just wide enough for a human to walk on. And maybe this one we can put gravel on and maybe this one we can pave, and then maybe this one we can build a highway. So I think this actually goes back to the spectrum question you asked earlier. There's probably a spectrum of solutions here where a certain set of questions need to be on the highway, right?
Like if one of your key metrics is monthly active users, that needs to be right all the time. And there's a highway for that, right? And so the search problem is like, can my agent find the highway? Okay, and if he can find the highway, then you get the answer. But the reality is, the interesting stuff is when you start slicing and dicing by all these arbitrary dimensions, and then there's no highway for that. I can't afford to build that even in the age of AI today. And so how do you make it easy to do a little bit of off-roading and then still get the right thing? And I think this is where I'm in particular probably well served by accounting principles as I'm building this stuff out, and I'm just like, well, what if we just made this a ledger and we can just walk forward and backwards through time or whatever. Now it becomes very easy to trace it back to the highway. There's a whole bunch of abstractions like that we could talk about, but I think the answer is probably "and," not "or." In the long term, I do think that there's probably some set of questions that are so important that you need to always get the right answer. And that might mean maybe a different interface, like, hey, for our financial reporting interface, we use X. For marketing, we use Y. I don't know.
Mo Kiss
Do you think it depends on stakeholders too? I'm just thinking. So, for example, if you're a data person, the context is something that exists within you. You have such good context, so you know how to prompt, you know how to give the right context. And I'm talking as you're, like, in the build stage and this is evolving, right? Because you're kind of suggesting, like, an evolving semantic layer or knowledge base, graph based. At the earlier stage, right, when that context isn't fully built, what do you think the risk is for the business user to be exposed who potentially doesn't have that same context? Do you know what I mean? You haven't built enough of the highways yet.
Michael Helvel
Yeah.
Jacob Mattson
Such a good question. So we launched our MCP server in December at MotherDuck and we had our own existing set of dashboards for the sales team. And immediately what started happening is the sales team started building their own kind of dashboards using our tool, which is really interesting to see. I mean, we're a small company where our risk threshold is obviously higher. But everyone was doing it because they already had really good context for the data, because, you know, they're the sales team. If the data is wrong, everyone's mad. It doesn't work. The sales data in particular is tested against reality all the time, right? Because you're on a call with a customer and you're like, hey, I saw you used, you know, 10 hours of compute last week. You know, we sell compute. And they're like, no, we didn't. You're going to be like, okay, oops, let me go fix my dashboard. So the stakes are pretty high for them, right? And for financial teams too, they're really high, right? They need to be right. They need to be speaking about the right numbers. And so teams that have developed an intuition for what the data should be are actually pretty good at using LLMs, regardless of language and understanding, like being a data person first, because they have a way to test it against reality really quickly, right? They can break down a set of numbers. Like if you're a CFO or something, you could probably break down revenue by product with an LLM and be like, that's right. I can tell that it's right because I've seen this product before, right? You already have, like, the fingertip feel for the data. But if you're someone else, especially, this is the hard part for a data analyst who's maybe disconnected from the business: they don't have the fingertip feel.
They just have, like, a question from someone that says, hey, build me this thing. In that case, you really have to rely on the context of others to help build the right thing. So, yes, I think the stakeholder does matter. I think that certain teams are more well suited to be data driven than others, just, like, out of the box. But we can definitely close the gap with good context. Maybe not for everyone, right? If someone asks a totally off the wall question using the wrong words, you know, we don't have a crystal ball. We have vectors that we're matching at the end of the day, right?
Julie Hoyer
With this like, approach that you're talking about, like using AI to kind of bring up the context for somebody who doesn't have it themselves. I have two questions. One, and I think Mo, you were kind of asking this. So it was like, how do you know when it finds context that it's choosing the right context? Because again, we've talked about, people are defining and talking about the same metric differently, especially if it's not one of the highway metrics. The other part is it feels like classically the legwork, right, was on the individual trying to ask the question, make the query. They had to go around and obviously ask and get all that context.
Mo Kiss
But with the new.
Julie Hoyer
If we put in the solution that you're talking about, where does the legwork now live in that process? Do you know what I'm saying? We've shifted the hard work of determining what is right for the question you're asking of where to grab it, how to think about it, how it's defined, and how to define the metric you're trying to query. So I'm trying to understand those two pieces.
Jacob Mattson
The first thing that I always think about is, well, how do we make it visual? If this was a video podcast, I could show you a really sweet demo, but maybe I'll just have to send it to you. I'll have to send you a video async that shows you kind of one way that it could look. So I think the first thing is, how do we make it so that interacting with the context is not just typing into a box and then, you know, stuff happens and you get an answer. You need to be able to understand what that looks like. And I think, because I spent so much time in the Excel salt mines, I think about it a lot like I think about Excel, which is you have some tab somewhere that says, here's this metric, right? Here's our profit for last quarter. And there's a little button in there that says trace precedents. So, okay, where does that take me? Okay, that takes me upstream and then I can keep navigating all the way back until I kind of understand where this thing comes from. And then there's another little button on there that says calculate formula.
Mo Kiss
Right?
Jacob Mattson
And it just shows me all the little parts and it shows how they're adding and subtracting and dividing and multiplying and how it ends up with the final number. I think we don't have those traceability pieces yet for agentic workflows or semantic layer stuff, really. I mean, we do, but only for the developer, not for the user. And so what I'm really thinking about is the way for me to manifest this notion of, hey, you don't need the semantic layer. It means you need to have something else. And part of it is, yes, you need the context, but you also need a way to visualize it in a way that people can reason about and understand and agree with how it was calculated.
Mo Kiss
Right, but you're saying visualize the lineage though, and the way it's calculated.
Jacob Mattson
I think all parts of that. Right, lineage, yes, but less about that. I would love to be like, okay, this number comes all the way from this field in your CRM. I would love to be able to do that. Right, okay. Or some sort of way to say, hey, talking about monthly active users earlier, here's all the detailed ways that that's calculated. I think that's probably actually too hard to reason about. It's too detailed. Right. We need to kind of be able to get the proper level of zoom out. Right. We don't want to be at the 10 foot level, we don't want to be at the 40,000 foot level. We want to be at, like, the 10,000 foot level. And I think it probably depends on persona too. But you need some sort of way to be able to reason about those numbers and how they were calculated to be confident that they're correct. And SQL is of course one way to reason about it. But SQL is verbose and kind of hard to reason about as a non-technical person. And even engineers hate using it. And there's a whole good set of reasons for that. I think the magic of Excel continues to be demystifying how complex some of this stuff is and doing that with really clever UI interactivity. And so what tools, what abstractions do we need to build to make that work? And I've been working on that thinking. The core thing that I've been starting with is just how do we take something like a set of tables and interweave context in there to show us how the tables relate to each other? In some ways you'd be like, hey, that looks a lot like an ERD, right? It's like, here's the primary keys, here's the foreign keys, whatever, right? But we can add a richer level of that that says, okay, we calculate these metrics this way, we use these joins, we use this formula. That stuff is also a level of the graph that's always been missing. When we get the ERD, right, we just have a thing that says, here's how the database defines the table.
What we don't have is knowledge of how the application actually uses it. And so that's, I think, the next part too. You know, we can mine, like, query history. In fact, I've done a bunch of work around that. Like, how do people actually query these? How do applications query these? How do those patterns differ when it's a human versus an application? So there's a whole bunch of stuff we can do there. I'm not trying to be too abstract or obtuse here, but I truly believe that part of it is just that we have all of the Lego blocks, but no one has assembled it into something that is cohesive in a way that's like, I totally get this now. It's very much like even if you're a high ranking executive and you get some number, you're trusting that your team built it right, and there's not a good way today to check. And lineage is part of that. But just being able to say, all right, how do all these components fit together? And fitting that in your brain. I don't know, maybe if you're the CMO, you shouldn't do that. But maybe you should do that sometimes. I don't know. Hopefully that's helpful. That's kind of just how I'm thinking about it at the moment, but make it visual is number one.
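The Excel-style "trace precedents" idea, applied to a metric graph that carries formulas as well as table relationships, can be sketched roughly like this. The graph contents, node names, and the `max_depth` zoom control are all hypothetical choices for illustration, not anything described as an existing tool, and the sketch assumes the graph has no cycles.

```python
# Hypothetical metric graph: each node records its formula and the inputs it
# is computed from. Leaf nodes (raw columns) have no formula and no inputs.
GRAPH = {
    "profit": {"formula": "revenue - cost", "inputs": ["revenue", "cost"]},
    "revenue": {"formula": "sum(invoices.amount)", "inputs": ["invoices.amount"]},
    "cost": {"formula": "sum(expenses.amount)", "inputs": ["expenses.amount"]},
    "invoices.amount": {"formula": None, "inputs": []},
    "expenses.amount": {"formula": None, "inputs": []},
}


def trace_precedents(node, graph=GRAPH, depth=0, max_depth=None):
    """Walk upstream from `node`, returning one indented line per step.

    `max_depth` limits how far upstream to go, which acts as a crude
    zoom level: None shows everything, 1 shows only direct inputs.
    """
    entry = graph[node]
    label = node if entry["formula"] is None else f"{node} = {entry['formula']}"
    lines = ["  " * depth + label]
    if max_depth is not None and depth >= max_depth:
        return lines
    for parent in entry["inputs"]:
        lines.extend(trace_precedents(parent, graph, depth + 1, max_depth))
    return lines
```

Printing `"\n".join(trace_precedents("profit"))` walks all the way back to the raw columns, while `trace_precedents("profit", max_depth=1)` stops at revenue and cost, closer to the "10,000 foot" view than the "10 foot" one.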
Julie Hoyer
I was going to ask how do you fight the echo chamber effect too, from some of this. Sorry. When you were talking about, like, using AI to search for queries, like if it's just always bringing back historical data and you were to move that into the future, how do you do that?
Jacob Mattson
Okay, so let me just make sure I reframe this question. So you're basically saying, like, how do you not overfit the history?
Julie Hoyer
Because they were figuring it out. Some of it's not right, some of it is. Or you know, it's just the bias of like what was right at that time. And you know, how do you bring these things in? You have people that aren't going for the context the old fashioned way. They're relying on the tool to give them the context. But it's old context, you know.
Jacob Mattson
So I think the first thing that I'm thinking about is, well, if you had a semantic layer, you only have the history. You don't have anything going forward because someone curated and built that for you. So now we get a new problem to solve, right? Now if we can just forget that the semantic layer exists, or maybe we have it and it's the highway, right? But we want to detect when there are changes, right? When there's drift in our metrics, right? How do we detect that? I think this is a really good question. I don't know if there's, like, good programmatic ways that I'm aware of off the top of my head at the moment, but certainly there's probably ways to do it programmatically. And I think also some of it is, well, if there's fewer humans doing some of the work that's really easy to automate with AI, well, now what do those humans do next? I think part of it is, yeah, maybe we need, like, a curator person who's handling this context potentially. Right? I kind of always jokingly talk about, hey, we need more librarians. We're generating all this context all the time and it's like, what do we do with it? It's like, I don't know, it just lives in Slack or Teams or our email, DMs on WhatsApp or whatever. And eventually it makes its way into the canon of the organization. Right. But that time can be really long. How do we make that tighter is a really interesting question. I do think the other side of that is, well, what about archival, right? What if a metric was right and is now wrong? How do you discard it and manage the lifecycle of it? So I think all of those pieces in my mind kind of fit together, which is lifecycle management of this.
And it's like we never even got there because we just get to semantic layers published and then it's such a big lift to even get there that it's just like, okay, I got promoted. Thank you. I'm gonna go to another company and I'll do this again. This is now your problem. Sorry. And again, I'm being sort of facetious, but these are hard. Everyone wants to build it, no one wants to maintain it. And I think how do we.
Julie Hoyer
We.
Jacob Mattson
I think there's probably space for a company or multiple companies in the space to build products that help us do this better. But I don't have anything off the top of my head that's like, well, here's how you catch drift and reimplement it. I wish I had the answer. It would be more compelling.
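Jacob is clear that he has no ready answer for catching drift, and this sketch does not attempt one. It only illustrates the narrower archival piece he raises, "what if a metric was right and is now wrong," by keeping dated versions of a definition instead of overwriting it. The metric name, SQL, and dates are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class MetricVersion:
    name: str
    sql: str
    valid_from: date
    valid_to: Optional[date] = None  # None means this version is still current


# Hypothetical history: the definition moved from logins to events mid-2025.
HISTORY = [
    MetricVersion(
        "active_users",
        "SELECT count(DISTINCT user_id) FROM logins",
        date(2024, 1, 1),
        date(2025, 6, 30),
    ),
    MetricVersion(
        "active_users",
        "SELECT count(DISTINCT user_id) FROM events",
        date(2025, 7, 1),
    ),
]


def definition_as_of(name, as_of, history=HISTORY):
    """Return the version of `name` that was in force on `as_of`, if any."""
    for version in history:
        started = version.valid_from <= as_of
        not_retired = version.valid_to is None or as_of <= version.valid_to
        if version.name == name and started and not_retired:
            return version
    return None
```

With a structure like this, a retired definition stays available for explaining old numbers, while a search tool can be told to surface only versions whose `valid_to` is empty.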
Michael
Maybe next time. Next time.
Julie Hoyer
For next time, right?
Jacob Mattson
Let me do more. That can be my next set of research. We'll see if we can do it.
Michael Helvel
You know, we are.
Jacob Mattson
We are fully migrating our internal stuff.
Michael
So if we wanted certainty, Jacob, we would have just interviewed Claude. Right. So this is great.
Jacob Mattson
Yeah, exactly.
Michael Helvel
Exactly.
Michael
This is awesome. Thank you so much. Really excellent conversation. One of the things we do at the end of every show is go around the horn and share a last call, something that might be of interest to our users. Jacob, you're our guest. Do you have a last call you'd like to share?
Jacob Mattson
I just read a really awesome paper that I will share a link to with you, Michael, called Skill Executive Strategy for Self Evolving Agent Skills. I saw it on Twitter this morning. It is really interesting. I'll share it and you can put it in the show notes. Definitely worth a read, just to understand what it looks like to actually apply some of this stuff, about exactly what we were talking about: how do we evolve these things as the systems change?
Michael
That's awesome. I love it. Awesome. Julie, what about you? What's your last call?
Julie Hoyer
Okay, my last call is a little off the wall, but it was just too timely. Okay. So Mo, this is a little bit a fun fact about me and a question to you, because it has to do with Canva. Okay, fun fact. I don't like squirrels. Like, I just don't like them. I don't think they're cute. They creep me out. Okay. Yeah, they kind of scare me. They kind of scare me because, quick backstory: my dad said, you know, to this little, little girl, if you see possums or raccoons in the daylight, they're rabid. I thought that meant squirrels. I went around for years thinking squirrels were rabid. You know how they pop around trees when you're riding your bike? They, like, hide on the other side. I always thought they were gonna pop out and get me. I don't like squirrels. Okay, fun fact that I don't share with lots of people. I get an email from Kids Canva that is squirrel themed. And the CTA at the bottom of the email said, if you would like to stop getting squirrel themed content, like, opt out here. And I was like, has AI personalized too far? Or, like, is this a phenomenon out in, like, the trendy world that I just have not been exposed to? Like, are squirrels a thing? Or do they know this about me? So it kind of freaked me out. And there's my fun fact also.
Mo Kiss
I'm like, is squirrel a cool, hip thing that I also don't know about? Because, I mean, that level of personalization.
Michael
Well, if the other cohort was getting emails about Moose, then there's something. Now you're getting somewhere. But that's probably a reference that I
Julie Hoyer
don't know if anybody knew.
Mo Kiss
Did anybody else get this follow up? And we can. I can share an update with you in our next last call. Really keep people hanging.
Michael
Well, there you go. That's a good last call. That just shows the range we can have here.
Jacob Mattson
Amazing.
Michael
All right, Mo, what about you? What's your last call?
Mo Kiss
Okay. I have been reading a book which Rachel Gearson recommended. It's called Wordslut: A Feminist Guide to Taking Back the English Language by Amanda Montell. And, you know, words are important. But when you go through this book and you hear about the evolution of certain words and how they refer to women, I don't want to say it's delightful. It's been surprising in a really great way. Like, I feel like it's kind of added to me wanting to be more thoughtful about the word choices that I use, which is entertaining because I don't give too much thought to what comes out of my mouth. So anyway, go check it out. And Michael, over to you.
Michael
Yeah, so mine is from back in May. James Hawkins, who's the CEO of PostHog, wrote an article about what he was thinking about as sort of their next chapter of vision of the future. Just so if listeners aren't aware, PostHog is sort of both an analytics tool as well as other tools for digital and website operations. Anyway, I don't know that I agree with everything he wrote in terms of, like, where products should go, but I thought it was very thought provoking and definitely worth some time, because all of us in analytics and measurement spaces, we're facing a lot of change and it's good to see, like, okay, here's a company who's got a lot of customers, who's doing a lot of work, especially with AI. Here's how they're looking at the future. So I think it's kind of a good read just to develop and gain perspective. So that's why I would recommend that. All right, once again, Jacob, thank you. Thank you so much for coming on the show. This has actually been a really cool conversation, and we stayed pretty high level, but I feel like there's also some really amazing kernels for people to pick up on and drill into their orgs with that I think will actually bear a lot of fruit. And also because it relates to songs, like, so now in my head I'm like, life is a highway.
Mo Kiss
Okay. Oh, wow.
Michael
Yeah, see, that's what happens to me when we talk about stuff. Anyways, so thank you again, and obviously to our listeners. Yeah, I'm sure you've got thoughts and questions and we'd love to hear from you. And there's a great way for you to do that. You can reach out to us on LinkedIn or on the Measure Slack chat group, or via email, contact@analyticshour.io. And wherever you listen, you can also leave ratings and reviews, and we read all of those as well. And Tim Wilson, who's not here today, would like you to know you can also request a sticker for your laptop. Just go to analyticshour.io and there's a form that you can fill out and we will mail it to you, even internationally. All right, this has been great. I think there's about three more of these that probably need to be done. AI is changing so fast. But Jacob, thank you once again. And I know I speak for both my co-hosts, Mo and Julie, when I say no matter where you are in the jungle, keep analyzing.
Jacob Mattson
Thanks for listening. Let's keep the conversation going with your comments, suggestions, and questions on Twitter at @analyticshour, on the web at analyticshour.io, our LinkedIn group, and the Measure Chat Slack group. Music for the podcast by Josh Crowhurst. Those smart guys wanted to fit in, so they made up a term called analytics. Analytics don't work. Do the analytics say, go for it, no matter who's going for it.
Michael
So if you and I were on
Jacob Mattson
the field, the analytics say, go for it. It's the stupidest, laziest, lamest thing I've ever heard. For reasoning in competition, I think actually, like, if you moved your cable around, I wonder if the cable was just, like, giving you some weird feedback or
Julie Hoyer
something, because my charging cord and my beer.
Mo Kiss
Touch it.
Michael
Now don't touch it. Hands up.
Julie Hoyer
Tim sent me a mic that he knew made a buzz. That was, like, an evil trick.
Michael
Sabotage.
Julie Hoyer
Yeah. My toddler now uses it as her microphone. Just, you know, unplug for fun. And I was like, sure, you can have this.
Michael
Does she do a podcast because you do one? Because that would be the most adorable thing in the entire world.
Julie Hoyer
No, but she does like to sing. Oh, she just breathes into the mic.
Michael
That's what I do.
Mo Kiss
I'm gonna be triggered a lot with every time the word, the S word is mentioned.
Michael
You know, usual, I do the exact same thing. Mo. I look over because a lot of times I leave crap on my chair back here. And so then I look, I'm like, oh, clear.
Mo Kiss
I don't. My husband gets dressed in here in the morning and decides to leave whatever. Whatever pants he didn't want to wear in the background.
Julie Hoyer
Trust me, you don't want to see what you can't see behind this cover.
Michael
All of the mess is my response.
Mo Kiss
I do get asked very frequently, though, why I have so many remotes on the wall behind me. And I'm like, it's a fair question. Well, there's another one that's missing, so it's probably for life.
Jacob Mattson
I don't know.
Michael
You're. You're just a very successful person, and if people can't deal with that, I'm just. I'm afraid to do it. I'd be afraid of that level of success is what my answer.
Julie Hoyer
Five remotes.
Michael
I know I don't have the lifestyle governance required. All right, let's get into it. So I'll give us a five count and. All right, let's stop moving our mic.
Mo Kiss
That's the problem with a YOLO one at, like, everybody.
Michael
That's right. That's right. We're very serious.
Julie Hoyer
Rock flag. And if you build it, who will maintain it?
Date: June 23, 2026
Hosts: Michael Helbling, Moe Kiss, Julie Hoyer
Guest: Jacob Mattson (Developer Advocate, MotherDuck)
This milestone 300th episode of The Analytics Power Hour tackles a hot topic in analytics: the “semantic layer.” The hosts and guest Jacob Mattson (MotherDuck) debate whether semantic layers, which promise to standardize definitions and act as the bedrock of AI-powered analytics, are truly necessary—or if more flexible, contextual, or AI-driven alternatives might better serve companies. Drawing on decades of data and accounting experience, the conversation dives into the trade-offs between standardization vs. business flexibility, the evolution of analytics tools, and how AI is reshaping the challenges semantic layers were designed to solve.
The conversation is lively, candid, and full of real-world experience—balancing tech skepticism with optimism. The hosts and guest speak with authority but also humor and humility, making it accessible for analytics professionals at various levels. Moe’s analogies and the group’s willingness to challenge each other fuel an open, probing dialogue.
This episode challenges the analytics orthodoxy that semantic layers are always essential, proposing that AI, better context management, and even a more flexible, risk-based approach might deliver more value—at least for many business problems. Listeners are encouraged to rethink static definitions, embrace experimentation, and stay mindful of the nuances between company-wide standards and the “bespoke” metrics that drive innovation.
Jacob Mattson’s insight-rich participation (and strong analogies) provide real-world grounding to a theoretical debate. The team closes with the reminder to stay curious as the analytics landscape and its tools rapidly evolve:
“No matter where you are in the jungle, keep analyzing!” —Michael Helbling (54:30)