A
Analytics topics covered conversationally and sometimes with explicit language.
B
Hi, everyone. Welcome. It's the Analytics Power Hour. This is episode 288. You know, we spent the last decade putting walls around our data, securing it, governing it, putting labels on it. And now the AI revolution walks up and it's like, hey, can I see all that? Today we're going to discuss Model Context Protocol, or MCP. I mean, it's an open standard that promises to stop all the copy-paste madness and let AI talk directly to your data systems. Is it the end of the data silo or just the beginning of a new governance headache? Well, we're going to try to establish an MVP for PMF of MCP, all in one hectic hour. All right, let me introduce my co-host, Val Kroll. How are you doing?
A
I'm good.
C
This is an interesting one.
B
Yeah, all my acronyms. Yeah, that was fun. All right. Tim Wilson, always a pleasure.
D
Likewise.
B
All right. I just use the acronyms to make myself sound smart, that's all. And I'm Michael Helbling. All right, well, we need a guest, someone to help us dive into this topic, and we've got a great one. Sam Redfern is a staff data scientist at Canva, currently working on search and recommendations there and previously marketing measurement. Prior to that, he has held data roles at both Meta and IAG, and today he is our guest. Welcome to the show, Sam.
A
Thank you very much, Michael, Tim and Val, really excited to be here. First time caller, long time listener.
B
That's awesome. Well, we'll ask questions and take our answers off the air. No, no, Sam, I'm excited to talk to you about this because obviously all things AI are very of the moment and everyone sees the term MCP, but I think if we just take a step back, maybe you could just fill us in on what exactly MCP is. Model Context Protocol, where did it come from? Give us some background on the whole concept to kind of establish the conversation today.
A
No worries. So super pumped to be talking about this. So let's take a step back and think about what this is solving in a sense. Right. We've had access to these large language model tools for a little while now. Right. And so in the early days of GPT-2 and GPT-3, before ChatGPT, these things were kind of like word calculators in a sense. I really like the analogy: it's like you put the numbers into your calculator and you get the equation out.
B
Right.
A
This is almost the same for words: early large language models acted like that. And the innovation in the OpenAI space was to basically feed the output back into the input and make this resumable kind of format. So let's step back from MCP and branding and letting technical teams come up with the term for things, because this is how we got NFTs as a term. The core problem to be solved is that we've got something that feels a little bit like a person, in the sense that you give it some words and it responds with some words back. Could you give that agent or that large language model the ability to do something other than just converse? The first application of tool use in large language models actually came from the open source LangChain team. For those who don't know what LangChain is, it's a framework for building agentic experiences. You can bring your Anthropic model or your OpenAI model or whatever you want, and it allows you to piece together bits of technology to add context into the large language model to try and get an output. In April 2023, a PR was submitted to the LangChain project that let the large language model take a browser URL, have the backend application request the contents of that HTML, and then bring it back into the context window of the agent itself. The core thing it's trying to do, and I kind of think about it as fingers in a sense, is give the large language model the ability to touch something, a bit of information, and bring it closer for it to understand. That's at its core what these MCP tools are. MCP is like a brand term through Anthropic, and to say it's a standard is to be very generous. 
But I'm really bullish about the concept of giving these large language models access to tools for them to be able to solve problems.
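A minimal sketch of the "fingers" idea described above: a backend function fetches a page on the model's behalf and appends the text to the conversation context. This is not the LangChain PR itself; `fetch_url`, `add_page_to_context`, the message shape, and the `max_chars` cutoff are all assumptions made for illustration.

```python
from typing import Callable
from urllib.request import urlopen


def fetch_url(url: str, timeout: float = 10.0) -> str:
    """Default fetcher (hypothetical helper): download the page body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def add_page_to_context(
    context: list[dict],
    url: str,
    fetcher: Callable[[str], str] = fetch_url,
    max_chars: int = 4000,
) -> list[dict]:
    """Fetch a page and return a new context with its truncated text appended."""
    body = fetcher(url)[:max_chars]  # truncate so one page cannot fill the window
    return context + [{"role": "tool", "name": "fetch_url", "content": body}]
```

The fetcher is passed in as a parameter so the same function can be exercised with a stub instead of a live network call.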
C
I have to admit that I really did think that MCP was just the API for LLMs. But the more I was looking into this, it was understanding that those fingers like you used in your analogy, is really giving it more access than just here's this endpoint and it's just a one time thing. Can you talk about some of the things that you give it access to with those fingers or to grab. To kind of give a little bit more color to what it actually does or what it. What it's capable of, if that makes sense.
A
Yeah, absolutely. So this is sort of in the weeds of how they work. I think the context to understand is, if you've ever done work inside of large enterprises and you've tried to create an application access username and password, it's a total pain in the butt to go get this thing. And then your infrastructure team is like, well, you've got to change the password every three weeks, and then you have to have a cryptographic token to do something. Actually, Anthropic had a previous attempt at tool use in October 2024, a month before MCP was announced, where they had a system that would take over your browser and move your cursor around. And the reason why MCPs are running on your local computer is that your user account has access to all these systems. That's why the early paradigms of these systems are as close as possible to the end user's system. So the analogy I give on the fingers is: inside these MCPs, the standard allows you to expose any number of tools, and you have bits of information about each one, right? When the agent starts, it's basically given a list of all the tools that the MCP server has available to the agent. And so it has the name of the tool. Let's just do a really simple example. In our agent environment, we have two tools, right? One of them is called a saw, and the other one's called a drill. Right? In our saw tool, we would describe the name as saw. We would say the description is it cuts wood in a single direction across a line, and then the inputs to that are the position of the wood and the depth, right? So that is a finger, for lack of a better term. That's our first tool, and our second tool might be drill, right? And that allows you to drill a hole through the piece of wood.
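The saw and drill example can be sketched as a small tool listing: each tool has a name, a description, and named inputs, and the agent is handed the whole list when it starts. The `Tool` class, `describe_tools`, and the drill's `position` input are assumptions for illustration; this is not the MCP wire format.

```python
from dataclasses import dataclass, field


@dataclass
class Tool:
    name: str
    description: str
    inputs: dict[str, str] = field(default_factory=dict)  # input name -> meaning


TOOLS = [
    Tool(
        "saw",
        "Cuts wood in a single direction across a line.",
        {"position": "where on the wood to cut", "depth": "how deep to cut"},
    ),
    Tool(
        "drill",
        "Drills a hole through the piece of wood.",
        {"position": "where on the wood to drill"},  # assumed input, not from the episode
    ),
]


def describe_tools(tools: list[Tool]) -> str:
    """Render the tool list as text the agent can be given at startup."""
    lines = []
    for tool in tools:
        params = ", ".join(f"{name} ({meaning})" for name, meaning in tool.inputs.items())
        lines.append(f"{tool.name}: {tool.description} Inputs: {params}")
    return "\n".join(lines)
```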
C
Okay. Okay. So I have to ask one more. Sorry, I'm hogging the air, but I guess the one other thing that I'm struggling to grasp a little bit is what was the need for standardization of this, the protocol? Can you talk about what that is solving? Because what you shared in that analogy was great.
A
Absolutely.
C
Going to use that. I'll give you credit every time. But why was there a need to standardize, outside of, you know, enterprises would feel more secure with that, or the governance would be easier? Is there any more to it than just that piece?
A
In the adoption curve, we are so far away from the governance piece on this stuff. There's a bunch of companies right now that are trying to put governance around these systems. And I'm sure at some point we'll talk about some of the potential downsides of this standard, if we want to call it that. But the reason why Anthropic went down this path is in the technical details around how the LLM is trained. They have been doing this work of training the large language model to use a special escape set of characters. So when the large language model is like, okay, I think what I need to do is use the saw tool, it emits this string of characters: saw tool, string of characters. And that indicates to the agent that's hosting the large language model, okay, I have to take the text below this and send it into the tool itself with the input parameters that it needs. And so Anthropic had this huge lead because they'd done the work of training their large language models for tool use and using their reinforcement techniques to basically say, this is what you have to do. This was a huge lead that Anthropic had for a couple of years, in a sense, right? Well, I mean, everything feels like it's a couple of years. It's really about eight months, right? Until other people started trying to solve this problem. OpenAI had their own similar kind of method. I think they were called functions. And it was a very similar kind of thing. Anyone could have come up with a standard. The core problem they were trying to solve is how do you give the large language model a hand for it to basically make decisions about what information is pulled towards it, or what actions it takes when it's pushing out.
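A rough sketch of the escape-sequence mechanism described above: the hosting agent scans the model's text for a marker, pulls out the tool name and parameters, and calls the tool. The `<tool_call>` marker, the JSON payload shape, and the `saw` function are hypothetical; real vendors each use their own trained format.

```python
import json
import re

# Hypothetical marker; stands in for whatever escape characters a model was trained on.
TOOL_CALL = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def saw(position: float, depth: float) -> str:
    """Stand-in tool: report the cut instead of making it."""
    return f"cut at {position} to depth {depth}"


REGISTRY = {"saw": saw}


def run_tool_calls(model_output: str) -> list[str]:
    """Find every marked tool call in the model's text and execute it."""
    results = []
    for match in TOOL_CALL.finditer(model_output):
        call = json.loads(match.group(1))  # {"name": ..., "arguments": {...}}
        tool = REGISTRY[call["name"]]
        results.append(tool(**call["arguments"]))
    return results
```

Text with no marker simply yields an empty list, which is how the agent knows the model only wanted to converse.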
D
This is slowly coming a little more into focus and still pretty damn fuzzy for me. So I know recently it seems like there's been a lot of chatter about Google Analytics having an MCP server. Is that the right terminology? And that is something that the Google team said. We are going to produce this to basically make the. This is a saw, this is what it does, these are the inputs. And it's just a much longer. I mean, your analogy was very, very simple. Is it as simple as that? Which has me going back to saying, well, when Val said it's like an API for LLMs, it sure sounds like an API for LLMs. And I'm missing where that analogy is breaking down.
A
I think you can use it. There's a lot of analogies of talking about these MCP tools as being the early days of APIs and stuff like that as well. I think there's an extra bit, the near-term direction of where these systems are moving, which is more interesting than the API part. But just to come back to that: APIs is a great way of talking about it. There's lots of people doing weird, fun things with these tools right now. I think some of us on the call are old enough to remember the early days of Web 2.0, when people were making APIs for the weather and it was open and everything was fine, and we were very far from the standardized way we think about this sort of stuff now, right? The way you design an API is very standardized now. I think the thing that's different is, one, we're dealing with this huge amount of non determinism, and we're dealing with all of these different terms and terminologies that exist. I think everyone on the podcast might have heard of the term agent, right? An agent is the idea where you have some text that is the system prompt, and then you have this resumable kind of conversation. There's another term that's being formed right now called a harness. A harness is the idea where you have an agent with a tool plugged into the side of it, and that has a domain of knowledge attached to it at the same time. So Cursor is an agent, Claude Desktop is an agent. Sorry, Cursor is a harness, right? It's got access to all these different tools. I actually think of MCP, and where it's at right now, as more akin to these digital document formats like XML, right? We started with XML, and the number of people who are writing XML these days is almost none. However, that standardized document format then brought us to JSON, and now it has fortunately brought us to YAML and Markdown. 
We are at the XML stage of this development: tool use, conceptually attaching tools to large language models through agents and harnesses, is going to stay for a long time. Whether it's the MCP standard or someone comes along with a better standard, we'll see how that goes.
B
You know how developers got the AI engineer role? It's time for the rest of us. I think we're witnessing the rise of the AI analyst.
D
Okay, does that just mean asking a chatbot to do math? Because I have Excel for that.
B
Michael. Well, no, Tim, I'm talking about Ask why? It's Full Stack Analytics. I ask a question in plain English and the product Prism orchestrates the whole thing. You can pull in data From Excel or BigQuery.
D
Hold on, you're sending BigQuery data to an LLM? Security is going to have a heart attack.
B
Well, that's the best part. Ask why doesn't upload your data?
D
Explain.
B
It creates a semantic layer. It sends the context to the LLM. The LLM writes the code and that code runs locally on your data. Your actual numbers never touch their servers. So it's totally traceable. Huh.
D
So I get the automation, but my data stays safe and secure.
B
Exactly. Plus it remembers context. So as you automate routine tasks, it stores those so you don't have to explain it all again the next time you do that same task.
D
Okay, I'm listening. Where do I get it?
B
Well, it's in beta right now and you can go to Ask Y AI. That's Ask Y AI. You can get ahead of the curve and join the ranks of the AI analysts.
D
And because we like you guys, use Code APH when you sign up. And our friends at Ask why will put you at the top of their wait list.
E
Yep.
B
Stop pasting data into black boxes.
D
Get Ask why. I've had the XML question as well, because I remember coming from an HTML world, and then XML came out and it's like, look, XML doesn't give a shit about what you're rendering in a browser, but it is this structured world. And then JSON I sort of understood because of the XML. And that makes sense, because there was talk of different applications saying, using this XML structure, let's define specifically how that is going to be used in the context of this financial services thing. So you're saying that is also a useful if imperfect analogy.
A
Yeah, look, history sort of rhymes more than copies in a sense. Right. It's the first time in software that we've had this amount of non determinism to deal with. Right. You think about what success has meant in software development or data before, and it's some human's ability to remember some random function as part of a library, write that code as fast as possible, and do it as close to perfect grammar as possible. The problem with this new world that we're going into is that the skill of dealing with non determinism does not closely overlap with that historical set of skills. And so it's going to feel different. People who talk about the vibes of a model, there's some truth in it, in a sense, right? Getting a feel. And it's true with tool design. So when you're building an MCP, we'll just go back to the MCP paradigm, you're thinking about the tools that you're building and the fingers that you're giving this agent and this harness access to. When you're starting out, you're doing kind of the early playful part of programming in a sense, right? Where you're just like, oh, does this connect to this? And when I run through it, what problems do I see with it? I do think that it feels like those early days of these standards, and people are playing around with them. And then when you want to get serious about the thing that you're building, you're looking through the agent logs, you're seeing what tools it's calling, you're seeing the parameters, you're seeing how many times it passes in the correct parameters and everything like that.
D
So let me hit the non determinism point, and maybe I'm going to go back to using the Google Analytics MCP as an example. The client, the agent that is hooking in and using the MCP, say it's an LLM at its core. So maybe I'm missing it, but I'm thinking of that as being a probabilistic thing, and it's hitting the MCP to get stuff back. Is the MCP necessarily also kind of non deterministic? Or is it that the inputs may be floating around a little bit? Or maybe it just depends on what it's an MCP for. In the Google Analytics example, I would say if the input is users in the last month, the MCP would say, well, as long as that... Or is it that the input is going to come in with a little squishiness in it, and it's up to the MCP to say, I've got to figure out what I should go and pull and return?
A
Yeah, and I think this is a good point to talk about an interesting dimension that's come up recently, right? So if I was on the Google Analytics team, which I'm definitely not, and I don't think I've used Google Analytics for over a decade and I missed many of the big transitions, so I'm probably not the best person to talk about GA. The Google Analytics MCP is going to have this problem where they have these dimensions and measures and breakdowns, and then they're obviously trying to do it on the cheap on the inside, so they're only storing some of the information, right? So they're going to have a basic report kind of tool. It's going to be called the report tool, right? And the name of it is going to be like traffic analysis, and it's going to be like, put the date range in, and then it's just going to return back a very minimized array of the.
D
Just to check with that report tool that would be like a saw, but there's going to be separately a drill. And okay, when you're saying a tool that is a tool that's available as part of the mcp.
A
Okay, yeah, that's right. Now, where this gets sort of interesting: Anthropic, I think a couple of days ago, announced this code execution with MCP, right, allowing you to actually write code. And this is the thing that I've found over the last couple of weeks, which is that it's way better getting the LLM to write code to interact with APIs or SQL or something along those lines than it is to give it access to all of these tools and all of these intermediary steps. You might have seen this from your own experience: you're probably spending a little bit less time hacking away at a piece of SQL, getting it to form exactly what you want. These days you're probably spending a bit more time on, here is all the table space that I'm working on, here is the thing that I want to query, here's what the outputs look like. And then you're having this sort of feedback loop where you're doing that work. My guess is that if you wanted to build a more sophisticated MCP, and if you were Google, you would actually lean into this concept where you would let the agent go build a little piece of Python code or JavaScript or something along those lines to query a bunch of known API endpoints and form the data back in the way that you want it. Snowflake has its MCP server, and Canva would make something called Data MCP, which takes all of this data information we have and allows the LLM access to understanding how to use it. You're really doing this piece of work around context engineering, and you're trying to think about what the LLM is going to put into this tool for it to then get this output out. And so Michael, just to answer your question here: the MCP tool itself is deterministic, right? It is an application in the traditional software sense. I think a lot of these MCP tools are sending data out to the Internet, right? 
They're connected to API or a SQL database or something along those lines. But you can have deterministic tools inside of the MCP server as well that's connected. So you could just have like a calculator tool and it just adds numbers together and then returns exactly the right number out. We've heard the joke of counting the number of Rs in Strawberry, right? So you could have a little Python function that just counts the number of Rs in strawberry and it's just like, all right, the R counter tool put the word Strawberry in and this is how many Rs you get back. And so this is the whole idea of. And this is, I understand I'm going to do shoutouts later on, but Z, which is an, has an agentic engineering series and what they're saying. And I just think that they have this really great framing of the problem that we're working on today, which is how do you take the advantages of non deterministic systems and couple it with the advantages of deterministic systems to get something more than the sum of its components.
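The deterministic tools mentioned above (a calculator and an "R counter") are ordinary functions that return the same answer every time, regardless of how the model phrased the request. A minimal sketch, with `count_letter` and `add_numbers` as illustrative names:

```python
def count_letter(word: str, letter: str) -> int:
    """Count occurrences of a letter in a word, ignoring case."""
    return word.lower().count(letter.lower())


def add_numbers(a: float, b: float) -> float:
    """Calculator tool: add two numbers exactly as plain code would."""
    return a + b


print(count_letter("Strawberry", "r"))  # 3
print(add_numbers(2, 3))  # 5
```

The non deterministic part is only the model deciding when to call these and with which inputs; the counting and the arithmetic are fixed.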
C
So I would love to take a little bit of a turn because you've teased it a little bit, but I'm so eager to hear some of your favorite use cases and examples, Sam, that, that you can talk about. And I know you mentioned before it's still early days, but if you could talk about some of the things that you felt like weren't possible for or too much effort for what it was worth in the past, but now it's unlocked this or solved this for you.
A
I'll talk a little bit about some of the stuff we've got working inside of Canva at the moment. And so we use a large database vendor which I will keep their name out of just so they don't get in trouble, they're going down this path of building out their own search of MCP tools and stuff like that. But what's interesting and what I think is a big opportunity for people in this space is that building these tools for your organization is actually the sort of critical skill that we're going to see in the coming period of time. Every organization or company is really different in a sense, right. And they, they made a database choice five years ago and they made a data transformation tool choice four years ago. And you have all the incremental knowledge and information that's built on top of that, which is going to kind of be unique to your organization in a sense. Right. And I think there's going to be vendors who are building tools in this space. But if you're sort of a mid sized company, I think something to really think about is building customized versions of these tools that actually work with the flows of your organization and really having teams that are building thinking about these agentic engineering practices on how to actually automate parts of the work that people don't want to do. So some of the use cases that I'm sort of playing around at the moment, just the last week I've been working on using the Altair Python visualization library and actually building a Python based sort of sandbox environment for it to run the Altair code. And so, so the way this works is you just put the SQL statement of the data that you want to pass into the Altair code and then you have a Python sandbox part of the field of the tool which just puts 300 lines of Python into it and it builds the visualizations out of it. Right? 
So using these really nice Python visualization libraries has always been a pain in the butt unless you tattoo the way every part of the application works on your arm so you can know exactly how it works. But again, we don't have to worry so much about grammar anymore, because if you feed these systems examples, they can come back and help you visualize this way faster. And I think that's where Z has this post where their concept is leverage, not magic. Right. What we're trying to do is take our staff members and make them move faster and explore more in a shorter period of time to get to a better end outcome. On other interesting, fun use cases: at home I've got my own little home lab kind of thing. Part of playing around with that is there's all these command switches and stuff like that. And so I built a custom MCP server for home that documents all the different applications that I use in the CLI, and it has all of the context of it. I basically put in a free text field of what I'm trying to do. It then uses a search engine to search over the data. It then takes that context, puts it into the LLM, and then the LLM goes and gives me all the command switches and stuff like that. Another one: I think the reason why Mo recommended me for this is I gave a presentation at MeasureCamp. At MeasureCamp you have to come up with a talk title, otherwise people don't show up. Right? So I said that MCP is the real apex predator for your job in 2020. And obviously, you know, I don't think that's true. Someone was like, you need to give a talk, Sam. I was like, fine. So I wrote that card at 11 o'clock. I then vibe coded up, remember the game Battleship? I made a version of Battleship that uses MCP tools as the fingers that the LLMs can use to play against each other. 
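The home-lab documentation tool described above follows a search-then-prompt pattern: free text goes in, a search finds matching command notes, and those notes become context for the LLM. The sketch below uses naive keyword overlap as a stand-in for the search engine; `DOCS`, `search_docs`, and `build_prompt` are hypothetical, and the real server's search is not described in the episode.

```python
# Tiny stand-in documentation store: command -> what it does.
DOCS = {
    "tar -xzf <file>": "extract a gzip compressed tar archive",
    "rsync -av <src> <dst>": "copy files and preserve permissions and timestamps",
    "df -h": "show free disk space in human readable units",
}


def search_docs(query: str, docs: dict[str, str], top_k: int = 2) -> list[tuple[str, str]]:
    """Rank commands by how many query words appear in their description."""
    query_words = set(query.lower().split())
    scored = []
    for command, description in docs.items():
        overlap = len(query_words & set(description.lower().split()))
        if overlap:
            scored.append((overlap, command, description))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(command, description) for _, command, description in scored[:top_k]]


def build_prompt(query: str, docs: dict[str, str]) -> str:
    """Assemble the text handed to the LLM: retrieved notes plus the question."""
    hits = search_docs(query, docs)
    context = "\n".join(f"{command}: {description}" for command, description in hits)
    return f"Context:\n{context}\n\nQuestion: {query}"
```

The search step is deterministic; only the final answer, written by the LLM from the prompt, is not.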
And then I built an agent harness where I could get different vendors' MCP tools to actually fight the game Battleship against each other. You could watch the turn by turn of them competing against each other, and they would have their thinking of how they thought about the strategy of the other player. And then you could change the prompt and be like, okay, you are going to be the random player and you're just going to do the most random things you possibly can. And the other one is the most strategic player: these are the common moves in Battleship, I'm going to do this, and stuff like that. It's a whole new paradigm of weird things to play around with. MCPs are just this layer for you to do the joining between deterministic and non deterministic systems. But once you start playing with these systems and finding different interesting things to do with them, it puts the fun back into the early stages of programming.
C
So you giving the context of deterministic and non deterministic, that's really helping it crystallize a little bit in my mind what some of this is. But I do want to go back to when you first started talking about some of these examples. You were talking about how every organization is different. Everyone's working with a different stack. Everything has to be in the context of what's going on inside your organization. Can I just go back and repeat one of my earlier questions: what is the value, then, of the standardization versus it being custom inside of your organization? Is it just about the ability to leverage those other tools that are using MCP, or is there internal benefit too? It's not clicking for me. I would love to just hear you share.
A
No, no, no. Okay. So MCP is the thing that allows you to do this at the moment, until someone comes up with a better standard. And we should probably talk about some of the downsides of MCP as well, the criticisms as they stand. But MCP at the moment allows you to bridge that gap between deterministic and non deterministic systems, right? And you've got the vendors, and the vendors are going out and building their own custom MCPs, right? Because what they're getting is pressure from leadership teams of like, okay, we need to get some AI in this organization right now, right? And I have Claude Desktop on my computer and I want to be able to query this information directly from Claude and have that come back to me, right? MCPs are a path you can go down there, but the problem is that you're wrestling with the non deterministic nature of these systems when you do that type of connection, right? This is where building up that practice inside of your organization is really important. Because when you get into the detail of how these systems work, you realise that giving senior leadership teams access to the raw thing that directly queries the database is problematic for lots of different reasons: you're not able to check it, you can't give it a system prompt, all that sort of stuff. So MCP, the standard, if we can call it that, is just a little connecting block in a sense. What I'm saying about non-standardization inside of organizations is that you can totally go use the vendor solutions, right? 
But it's always going to feel like it's through a fuzzy piece of glass because unless you're doing exactly what the vendor has, right, if you're, if you're 100% Google Shop and you've never used anything other than Google, then maybe the Google stack is going to be great because that's what they're going to be using internally and they're going to be copying from that of how they're building it. But I don't know about you, but most organizations I ever interact with is it's sort of like a collage of different solutions and the money is in getting them to connect together. Right?
B
Yeah. Templatized versus custom, kind of.
D
Take Altair, take Canva, take Snowflake, take Google Analytics, and MCP for those. Those are intentionally wildly different types of platforms. Is there the option that Canva or Altair or Snowflake or Google can create an MCP, one of these connectors, that is kind of generic, but there's also an option that I, as a user of Canva with access, could build my own MCP, my own connector?
A
Yeah, okay, so you're raising on a really good topic which is called context pollution. Right. So there was a criticism of GitHub's MCP server where it has like 24 or some number of tools. Right. And the majority of people use like three. Right.
D
Push.
A
Yes. And the thing is, if you get into the details of how MCPs work, they repaste the entire list of tools and the tool descriptions and all this sort of stuff with every single message that goes through. Right. Because they are trying to solve the context engineering problem of, the LLM needs to really know what all the tools are at every single turn. And you can go and add 70 tools to an agent harness, and you should do that to watch it not work, because it's very entertaining, and you can suddenly watch all of your context disappear and have all sorts of problems. Right.
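A back-of-the-envelope sketch of the context pollution described above: if every tool description is resent on every turn, the overhead grows with tools times turns. The 150 tokens per tool and 20 turns are made-up illustrative numbers, not measurements of any real server.

```python
def tool_overhead_tokens(num_tools: int, tokens_per_tool: int, turns: int) -> int:
    """Total tokens spent re-sending tool descriptions across a conversation."""
    return num_tools * tokens_per_tool * turns


# Assumed figures for illustration only.
TOKENS_PER_TOOL = 150
TURNS = 20

for num_tools in (3, 24, 70):
    total = tool_overhead_tokens(num_tools, TOKENS_PER_TOOL, TURNS)
    print(f"{num_tools} tools -> {total} tokens of tool descriptions")
```

Under these assumptions, trimming from 70 tools to 3 cuts the repeated overhead by the same ratio, which is the motivation for narrowing the tool list.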
B
You've got extra tokens to spare.
A
Go ahead. Yeah, somebody go win the token awards or whatever it is every day. I did the other day. Right. Some people need the token trophy. That's great. But your job as an engineer or a technology person inside of an organization is to do it in the most sensible, reliable way possible. You're trying to harness these non deterministic systems to get the best outcome. Part of the reason why you go and build that custom harness that is fit for purpose for your organization, with different flavors depending on exactly the task you're trying to get it to do, is, you know, it's in the name: you're trying to take that harness and constrain this non deterministic system to only work in this particular domain. I think a lot of people, when they talk about these MCPs, are talking about it from their experience of having Claude Desktop on their computer and connecting it up to Jira or some other thing like that. Right. And that's a totally valid use case. I don't have to open Jira anymore. I think the favorite part of my job now is when someone assigns me a ticket. It's the only thing I have connected to Claude Desktop on my computer, and I interact with all of my Jira tickets through Claude. I've solved the Atlassian interface problem by just never having to open it. Nice. That's for me as a human here. But when we're talking about making these systems do useful things in your organization, work that you can't convince engineers to pick up, or really boring work, or testing work or something like that, that's where these systems kind of shine. Right. We're not trying to put someone out of a job or anything like that. 
What we're trying to do is we're trying to get those tasks that are not particularly enjoyable, like documentation, testing, tracking, all this sort of stuff, like building custom harnesses around that to help engineers make the best possible decision when they're building something. That's the real advantage of these tools.
D
Yeah.
B
And I think, Tim, also the Google Analytics example is tricky because it's very limited and it basically is an API layer to Google Analytics. It's not really giving you a more MCP-ish type of interaction. So I think it adds to the confusion a little bit, because you can do all the same things with their API that you can do with the Google Analytics MCP server. It's just call this function, but instead of you writing the query to the API, the LLM does it for you. But it's not more stuff.
D
How does the GitHub MCP server? So I have two questions. One, it sounds like, for a lot of platforms, if there are 24 (and I'm assuming that's not exact; it's however many tools there are within the GitHub MCP server), a lot of them are just a layer to the API. And maybe there are some that aren't. But if you were saying, we do want to use a GitHub MCP server, but this has got too much, would it be like, I'm going to get their MCP server and I'm going to whittle it down and then probably check it back into GitHub, just to make things confusing? Like, do MCP servers get there? It could be the official, uber-generic one developed by the platform, and then somebody says, yeah, yeah, I need to make one that's just a much narrower scope and maybe add some flavor on it. Or does it not work that way?
A
Yeah, great question, Tim. And this is why talking about agentic engineering and harnesses is really important, because in a harness you say, these are the MCP servers, and they only have these tools, right? And so you can take the GitHub MCP server and you can say, here's your three tools, deal with it. That makes sense. Sorry, let's not get into the accuracy of AI overviews, but according to one, in June 2025 it apparently exposes 51 tools. Okay, but the thing is, you're in a world where you narrow that down to a very limited set of the tools, and you can see this in Claude Desktop: if you load an MCP server onto Claude Desktop, you have little switches where you can turn tools on and off, in a sense. Right. The intention of these systems is that you narrow the scope down to exactly the problem that you're working on and just that. But what's interesting, Tim, is why bother having the GitHub MCP server if you're doing the coding yourself and you're using it inside Cursor or something along those lines? It's like, why bother adding the GitHub MCP server at all? Why not just get the LLM to execute something in the command line with all the command switches of the GitHub CLI?
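The "here's your three tools, deal with it" idea can be sketched as a plain allow-list filter applied before the tool list ever reaches the model. This is a minimal illustration, not how any particular harness or the GitHub MCP server implements it; the `Tool` type, `restrict_tools`, and the tool names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """Hypothetical stand-in for a tool advertised by an MCP server."""
    name: str
    description: str


def restrict_tools(available: list[Tool], allowed: set[str]) -> list[Tool]:
    """Return only the allow-listed tools, failing loudly on unknown names."""
    unknown = allowed - {tool.name for tool in available}
    if unknown:
        raise ValueError(f"allow-list names tools the server does not expose: {sorted(unknown)}")
    return [tool for tool in available if tool.name in allowed]


if __name__ == "__main__":
    # Made-up tool names for illustration; not the real GitHub MCP server's list.
    server_tools = [
        Tool("list_issues", "List issues in a repository"),
        Tool("create_issue", "Open a new issue"),
        Tool("delete_branch", "Delete a branch"),
        Tool("merge_pull_request", "Merge a pull request"),
    ]
    harness_tools = restrict_tools(server_tools, {"list_issues", "create_issue"})
    print([tool.name for tool in harness_tools])
```

Raising on an unknown name is a design choice: a typo in the allow-list then fails at startup rather than silently handing the agent fewer tools than intended.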
D
So is this getting us to the downsides?
B
I was just going to say you've.
C
Been chomping at the bit, Sam. Let's hear it.
B
I mean there's part of it feels.
D
Like this is, it's new enough and Wild West enough that there could be a governance issue, that it would be very easy to embed MCPs into an organization and they're not well thought out, they're not well built. I mean, it seems like there's just a governance thing, like anything that gets rolled out, where one Sam creates something and all of a sudden the entire organization is dependent on it. And maybe Sam's not very good at it, you know, or just half-assed it on a weekend or something like that, I don't know. So I'm throwing that out, that it seems like there's an immense governance risk when you're able to do this stuff so quickly and roll it out. Is that one of the downsides?
A
I mean, building, and this is something Canva does really well, right? Building AI tools for people to use in the application is just really different to building stuff internally. We are still at the early days of this stuff, and Canva's Ecosystems team are building out really strong solutions in this space, whether Canva is an MCP server or a client or something along those lines. So there are the professional teams who are building this sort of stuff for external consumption. So in OpenAI you can use the LLM to interact with it; you can use GPT-5, or whatever they're calling it these days, to interact with Canva and modify your designs and stuff like that. That's all using this style of tool technology, in a sense, right. And there's a lot of governance that's been there, right? There's a lot of thinking about permissioning, thinking about what information we're giving to the LLM, what actions we're giving to it, what are the actual actions that change something. One of the downsides of giving the LLM access to your terminal command line is that it could delete all the files in the directory or something along those lines. I think my favorite one is where an engineer is trying to get the LLM to write the code so it passes all the tests, and so it solves the problem by deleting the tests, and it's just like, problem solved, I'm done. Technically correct, the best type of correct. But actually no, that's not what I wanted. And so this is why that agent harness framework is really useful, because that's where we're like, here is this domain of a problem, here is this very finite set of tools, here's exactly how I want you to work on this particular part of the problem. And I don't want you to have this long chain where you're sort of jumping between things. I just want to create the agent instance, have it solve one or two problems, and for the agent instance to end, so then we can move on to the next problem.
That is how we're trying to solve and harness this non-deterministic nature. Some of the criticisms of this MCP standard is like, one, it's not a standard, right? A standard, if we think about it as from the Internet Engineering Task Force or the W3C, is like a collection of for-profit companies coming together and sending some of their best engineers to basically have a very disgruntled call with a bunch of other engineers from other for-profit companies. Right. As far as I understand, it's kind of just Anthropic building this internally, publishing stuff on their blog. They picked up FastMCP, which is like an open source thing, and they just said, all right, this is the standard. We're going to use this as our standard library and sort of extend it out and stuff like that. I think, coming back to that governance point, this is not a structure that feels like it's ready for, or rather it's buyer beware on, the internal corporate governance stuff. And the way you design these systems is really important. And so some of the data tools I build internally have no ability to write information to the database. Right. Because we're just not ready for that kind of world. Right. And maybe in really select kinds of instances, where there's a really strong harness around it and you have like a checking endpoint and all sorts of other stuff like that, and that is where things like LangChain are really useful. But that's why I think most of the value of this space is still in the internal application use cases, in a sense. Right. That's where you can do more experimentation and worry less about the strangeness of the Internet and AI and all the problems attached to that.
But when you're developing these tools internally, you have a team of a handful of engineers and you can make their lives tangibly better, because they don't have to put the context of a particular part of the problem into their mind and they can just have that answer come back. And even if it's right 90% of the time, it's probably better than when you got your junior data scientists to do it in the first place anyway. So that's the challenge there. One of the other criticisms of Anthropic's MCP is that it doesn't have authentication baked in. Right. So it's counterintuitive having this term of MCP server and client, in a sense. Right. What happens is you're literally running a little application. You're like, Python, run this daemon or something along those lines. And then that is just talking to the, it's like running a local web server in a sense. Right. And that is the security model that has been solved for in the early days. And that's why Zed has their ACP agent. What is it? Zed ACP. I think in a couple of years we're going to be talking about the Agent Client Protocol as maybe a better way of building this sort of stuff. Everyone sort of agrees that the ACP protocol is probably a better representation of where we're going in this space. And it is an open standard. I don't think it's the W3C or the Internet Engineering Task Force style of standard. But back to that XML example, I don't know about you, I don't read a lot of XML these days. We're going to be moving to something else. But I'm very bullish on the concept of tool use in these applications and giving large language models these fingers to do things.
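One way to read the "no ability to write information to the database" design mentioned above is to enforce it at the connection level rather than by trusting the prompt. The sketch below is an assumption about how such a tool could be built, not a description of the internal tools discussed here. It uses SQLite's read-only URI mode from the Python standard library; `open_read_only`, `run_query_tool`, and the `metrics` table are hypothetical names.

```python
import os
import sqlite3
import tempfile


def open_read_only(path: str) -> sqlite3.Connection:
    """Open an SQLite database so that any write attempt fails in the driver."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def run_query_tool(conn: sqlite3.Connection, sql: str) -> list[tuple]:
    """Hypothetical tool body: run whatever SQL the model produced and return the rows."""
    return conn.execute(sql).fetchall()


def demo() -> tuple[list[tuple], bool]:
    """Build a throwaway database, then show reads succeed and writes are refused."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "metrics.db")
        setup = sqlite3.connect(path)
        setup.execute("CREATE TABLE metrics (name TEXT, value REAL)")
        setup.execute("INSERT INTO metrics VALUES ('sessions', 120.0)")
        setup.commit()
        setup.close()

        conn = open_read_only(path)
        rows = run_query_tool(conn, "SELECT name, value FROM metrics")
        try:
            run_query_tool(conn, "DELETE FROM metrics")
            write_blocked = False
        except sqlite3.OperationalError:
            # SQLite rejects the statement because the connection is read-only.
            write_blocked = True
        conn.close()
        return rows, write_blocked


if __name__ == "__main__":
    print(demo())
```

The point of the design is that the guard does not depend on the model behaving: even a generated `DELETE` cannot change anything, which also covers the "solve the problem by deleting the tests" style of surprise.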
D
Is there any movement, I mean, it sounds like Zed has stepped in and done a little bit of this. Is there any movement to say, you know, like the W3C had its different groups that came together, that we should get?
C
Well, didn't like Google and Microsoft and OpenAI, didn't they all adopt it? Like, am I totally misunderstanding?
D
But it's one thing to adopt it, it's another thing to say, here's our... we've got to solve authentication. We've got to have a recommendation and a standard for how authentication is going to be handled. It means they can't just say, like, it's not there. That's something new that needs to be incorporated in a way that they say, yeah, we think this is generally going to work for most. We can all work with this. Right? Because it's not static. I mean, I guess, let me ask that question. With MCP, how static is it? Like, when HTML came out, it wasn't like, okay, we're done. There was a bunch of other stuff that was needed, and browsers added functionality, and so it naturally had to evolve. Is MCP the same, in that it needs to?
A
Yeah, yeah. So there's a really interesting part of this, which is that I think there is a recommended output format for MCP servers as part of the standard. But I think what's interesting about this is the non-deterministic nature of these systems. I don't know if you've ever played around with this, but it's always good fun: you can have the start of your question in XML, then you can do the middle bit in YAML and then the end bit in JSON, and the large language model doesn't skip a beat, and it's just like, oh yeah, sure, I understand this. It doesn't matter as much, is kind of the context of this problem. Right. We required these standards in the early days of the Internet because they were purely deterministic systems with incredibly strong grammars. Right. And I just don't think it matters as much anymore. And that's why I don't think there's been the same pressure to standardise, because you don't need to standardise in the same way. The only thing that matters is, do you pass the tool call threshold in your large language model. And I think, rather than a very deliberate standard like TCP/IP or IPv6, it's going to be more along the lines of the QWERTY keyboard, which is like, we just kind of picked it because it was there first, not because it's better, and MCP will probably change to something else in the future. Right. But the primitives that make it interact with a large language model, I think, are now baked in enough that I would be surprised to see us move away from that. And all we're going to do is find new ways of taking those primitives and doing this code execution thing. So I gave my example of this charting sort of MCP extension that I built. All we're going to do is take the same primitives, but then do wildly different things with them that people didn't think were possible before.
D
And at some point that will have shifted to a point that it's got a new label and it's like, oh, remember, it was just MCPs. Now we have something else which is grounded in all that we learned from MCP.
A
Okay, yeah, now it's going to be ACP. It's going to be, who knows, whatever. I look forward to watching this name change over time. I'm sure there will be an XKCD comic at some point. I mean, there's the XKCD comic on standards. And we are not immune from that paradigm, which has been true in software long enough.
D
I brought up that particular strip, I think, on two of the last four episodes. One was on semantic layers and one was on.
B
Yeah, we'll check back in in six months, because certainly things will have shifted quite a bit. All right, we do have to start to wrap up, and as we do that, let me jump into a quick break with our friend Michael Kaminsky from Recast, the media mix modeling and GeoLift platform helping teams forecast accurately and make better decisions. Michael's been sharing bite-sized marketing lessons over the past few months to help you measure smarter. Over to you, Michael.
E
When we perform statistical analysis of data, what we really care about is that we are discovering actual truths about the world, not random artifacts of the particular data set we're looking at or the analytical methods we're choosing. We want generalizable analyses, the kind where independent researchers answering the same question would converge on similar results. This is all another way of talking about a hugely important idea in model building: statistical robustness. Without robustness, even a small tweak in assumptions or small changes in the data will spit out dramatically different results. Results that aren't showing true causation or reflecting reality, but just picking up random noise. So how do we put this into practice when doing statistical analyses? We can randomly resample from our data set or even randomly drop small amounts of data and see if the results are being driven by one particular outlier observation. Similarly, if we're running a regression analysis with control variables, we can check how sensitive the results are to different control combinations. If the findings change dramatically depending on which controls we include, we should be skeptical of the overall results. The more robust our results are as things change, the more confident we feel that other analysts or researchers will end up drawing the same conclusions, and the better chance we have of finding some underlying truth.
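The resampling and drop-one checks described above can be sketched with nothing but the standard library. This is a toy illustration on made-up data, with a hand-rolled least-squares slope; the function names are hypothetical, and the control-variable sensitivity check is not covered here.

```python
import random


def slope(xs: list[float], ys: list[float]) -> float:
    """Ordinary least-squares slope of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def leave_one_out_slopes(xs: list[float], ys: list[float]) -> list[float]:
    """Refit the slope once per observation, each time with that observation dropped."""
    return [slope(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]) for i in range(len(xs))]


def bootstrap_slopes(xs: list[float], ys: list[float], n_resamples: int = 500, seed: int = 0) -> list[float]:
    """Refit the slope on datasets resampled with replacement from the original."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        sample_x = [xs[i] for i in idx]
        sample_y = [ys[i] for i in idx]
        if len(set(sample_x)) < 2:
            continue  # slope is undefined when every resampled x is identical
        slopes.append(slope(sample_x, sample_y))
    return slopes


if __name__ == "__main__":
    # Made-up data: y = 2x exactly, except for one outlier in the last observation.
    xs = [float(x) for x in range(10)]
    ys = [2.0 * x for x in xs]
    ys[9] = 60.0
    loo = leave_one_out_slopes(xs, ys)
    boot = bootstrap_slopes(xs, ys)
    print("full-data slope:", slope(xs, ys))
    print("leave-one-out range:", min(loo), max(loo))
    print("bootstrap range:", min(boot), max(boot))
```

A wide leave-one-out or bootstrap range, as this contrived outlier produces, is the signal that the headline estimate is being driven by a single observation rather than by the bulk of the data.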
B
Thanks, Michael. And for those who haven't heard, our friends at Recast just launched their new incrementality testing platform, GeoLift by Recast. It's a simple, powerful way for marketing and data teams to measure the true impact of their advertising spend. And even better, you can use it completely free for 6 months. Just visit www.getrecast.com/geolift to start your trial today. All right, well, one of the things we'd like to do is go around the horn and share a last call, something that might be of interest to our users. Sam, you're our guest. Do you have a last call you'd like to share?
A
Well, I mean, obviously thinking about this agentic engineering thing, okay, so I'm going to get in trouble by doing two. One of them is, go to zed.dev and check out under resources. They have their agentic engineering series about the future of software development. I think it's a great grounding of where we're going in this industry, and I think they lay out a really great vision of what this could be. The last one is that I don't actually like Zed's agent. I think one of the most important things here is to go get your hands dirty with these systems. They are just so much fun. And if you're a bit of an old techie, it doesn't matter as much about the grammar anymore. And really you just spend some money on tokens and explore it. And in that vein, I actually think the best agent you can get for nothing is OpenCode. And so I think it's opencode.ai. Yep. OpenCode, I think, right now is one of the best agent harnesses that you can possibly go and build things with. They've got really interesting things like the ability to define sub-agents that you can give different prompts and context to. And so if you want a really great sort of base agent to play around with, to go and then build really interesting harnesses, I can't recommend OpenCode enough.
B
Nice. Thank you. All right, Val, what about you? What's your last call?
C
So mine's a total left turn, but I have just been really enjoying lately the Good Hang podcast with Amy Poehler. It's been around not quite a year yet, I don't think, but if you are just in need of a good laugh, I'm telling you, you will walk away from those with a stomach ache. During the Rachel Dratch episode I legit did a spit take. It is so funny. So anyways, she just has lots of different celebrities on to talk about all different topics, and it's quite enjoyable. So.
B
But does she have an MCP server?
C
No, this is, I'm saying, keeping it light. We're keeping it light.
B
Yeah. That's great.
A
You need that, you need that with.
C
The guest on her show.
B
All right, Tim, what about you? What's your last call?
D
So I'm going to do kind of a mix on the human side of things, just because we're starting off the year. So now hopefully people are looking forward to what human people they're going to go see in various places, like in-person conferences. So I will plug that I'm getting to return to Superweek this year, which I have missed for the last couple of years, and that's missed in person and in spirit. So, superweek.hu. It's February 2nd through 6th in Budapest. And then I'm going to double that up with just a couple of good follows. I feel like we need more humor on LinkedIn, and there are two guys who are both very reliably putting in just short, random funny things and also some good content. So I'm going to plug Bhav Patel, Bhavik Patel, and Manas Datta, D-A-T-T-A. He does all sorts of, like, something like finger guns, but every time he does something he has a different sort of something guns at the end of it. So they're just good follows. To put a little less bloviating in your LinkedIn feed would be those two guys.
B
And what did they teach you about B2B sales there, Tim?
D
And they definitely make cracks about that along the way. What about you, Michael? What's your last call?
B
Well, I'll be curious, Tim, to hear whether or not MCP servers come up at Superweek, which I'm sure they will. My last call is AI related. I just was hanging out with my good friend Christopher Barry a week or so ago, and he turned me on to a paper that some folks wrote about how to jailbreak large language models. Because sometimes you just need it to give you the recipe for gunpowder or something. And apparently a really great way to do that is to just talk to it in poetry. So if you add a poem, it will just tell it back to you as a poem and give you the information you want, no questions asked. So not saying you should do that, but that's something you should be aware of. We'll link the paper in the show notes.
A
Can I sneak in with one last.
B
Yeah, of course.
A
It's related to jailbreaking large language models. So there's a community inside of Canva, a channel where people sort of share their tips and tricks for interacting with these large language models. And there was a thread on how you get better outputs. And so yelling at language models surprisingly works. Bribing large language models surprisingly works. My favorite one is to tell the agent that a much smarter and more sophisticated agent is about to come and check its work, and it should hurry up and make sure that there are no mistakes before it gets checked.
D
I so wanted to know there was a Moe-related thread in there, that you'd be like, look, if you get this wrong, Moe Kiss is going to be disappointed. And they're like, oh my God, that is the ultimate hack.
B
Add some weights to that name inside of all the large language models inside of canva. That's probably a good idea. All right, Sam, this has been outstanding. Thank you so much for taking the time to come on the show, talk about this topic. This has been great.
A
Thank you very much for having me. It's been great fun.
B
Yeah. No, and I'm sure our listeners, of which there are many, will have a lot of questions or things like that. We'd love to hear from you. You can reach out to us on our LinkedIn page or through the Measure Slack chat group, or via email at contact@analyticshour.io. And as you're listening to this episode, also leave reviews and ratings. We like to get those as well, on whatever platform you listen on. And if you want, we also still have some stickers, and Tim will send them to you if you request a sticker over on analyticshour.io, so reach out to us that way. Awesome. Really great. I think this is sort of a more technical topic, but I think it's still very relevant to everybody in the data space because of the intersection of AI and data. It's sort of a thing we're all talking about. So, Sam, thank you for helping demythologize some of this, if you will, and bring some practical knowledge. I think it's a huge service, and like you said, it's changing every day. So, you know, apologies in advance for how outdated this podcast will be in about three weeks. But that's just the way it works. You got to get started somewhere.
A
The AI nodes take Christmas off.
B
That's right. Yeah, that's right. Stop updating your LLMs, for crying out loud. Yeah, the big news today was that Sam Altman issued a code red for OpenAI because Gemini is doing so well and they've got to get back to getting hard work done. So.
D
Okay, stop this 996 nonsense.
B
Right? Yeah, right. We need some work life balance Break before the AIs take our jobs. We need some work life balance. No, I'm just kidding.
A
All right?
B
I know that as you go out there and you're working with data and you're trying to use AI, it's always complex and challenging and you're learning a lot. It feels like the early days of analytics all over again. But I know I speak for both of my co-hosts, Tim and Val, when I say, no matter which MCP you're using, don't forget to keep analyzing.
A
Thanks for listening. Let's keep the conversation going with your comments, suggestions and questions on Twitter @analyticshour, on the web at analyticshour.io, our LinkedIn group and the Measure Chat Slack group. Music for the podcast by Josh Crowhurst. Those smart guys wanted to fit in, so they made up a term called analytics. Analytics don't work. Do the analytics say go for it no matter who's going for it.
B
So if you and I were on the field.
A
The analytics say go for it.
B
It's the stupidest, laziest, lamest thing I've ever heard. For reasoning in competition.
D
Rock, flag and non determinism.
Our LLM Suggested We Chat about MCP. Kinda' Meta, No?
Date: January 6, 2026
Hosts: Michael Helbling, Val Kroll, Tim Wilson
Special Guest: Sam Redfern (Staff Data Scientist, Canva)
This episode dives deep into the emerging world of AI tool integration with a focus on Model Context Protocol (MCP), an open (sort of) standard that lets language models (LLMs) interface directly with organizational data and tools. The hosts, with guest Sam Redfern, explore what MCP is, where it came from, whether it’s a real "standard," and what possibilities and complications it brings for the future of analytics, AI, and organizational governance.
| Segment                                 | Timestamp   |
|-----------------------------------------|-------------|
| Introduction & setup                    | 00:14–01:36 |
| MCP explained (history & purpose)       | 02:17–05:22 |
| MCP’s tool “fingers” metaphor           | 05:54–08:03 |
| Standardization needs & LLM tool use    | 08:23–11:35 |
| APIs, XML, and LLMs: analogy discussion | 11:35–14:12 |
| Security, local servers, and examples   | 15:08–20:51 |
| Use cases: Canva, home lab, Battleship  | 24:26–29:39 |
| Standardization vs Org Customization    | 30:27–32:34 |
| Context pollution and agent harnesses   | 33:18–36:30 |
| Downside: governance & security risks   | 39:18–45:54 |
| The future & analogies: XML, QWERTY     | 46:08–49:06 |
| Host wrap-up & last calls               | 51:31–56:33 |
| Outtakes, humor, and closing remarks    | 56:33–End   |
Memorable closing advice from Sam Redfern:
“One of the most important things here is to go get your hands dirty with these systems. They are just so much fun…and really you just spend some money on tokens and explore it.” [52:20]
Don’t forget: No matter which MCP you’re using—keep analyzing.