
Why do cities struggle to adopt AI at scale despite exponential improvements in the technology? Host Stephen Goldsmith speaks with Boston CIO Santi Garces and Harvard Business School Professor Mitch Weiss to explore the "growing gap" between AI capability and organizational adoption. Hear how the city of Boston improved user satisfaction 3x with an AI-powered web search, why MCP servers are powerful and transparent tools for government, and how to move from pilot to production.
A
From DataSmart City Solutions at the Bloomberg Center for Cities, this is the DataSmart CityPod.
C
This is Steve Goldsmith, professor of Urban Policy at the Bloomberg Center for Cities at Harvard University, with another episode of our DataSmart CityPod. Today we have two returning guests. Santi Garces is CIO of Boston. Mitch Weiss is a professor at Harvard Business School and former chief of staff in the city of Boston. Two good friends. Welcome, both of you.
A
Nice to see you, Steve. And you, Santi, as well.
B
Yeah. Great to be here with both of you.
C
All right, so this is a really exciting podcast. I know that probably hundreds of thousands of people have tuned in because you are two of the world's best experts on the application of AI to cities. And to have both of you at the same time, this will be a recipe for success on the part of cities. So be sure to be brilliant in the next half hour, please.
A
If you wanted brilliant, Steve, I should have sent my agent instead of me, but here we are nonetheless.
C
Well, speaking of your agent, let's start with this. You came from old-school City Hall there, in what eventually went by the name New Urban Mechanics, and you wrote a book on the future, right? How do you think about the jagged edge of producing change? So just evangelize for a minute. What are the two or three principles in your book that, if applied, would unlock AI, generative AI, and agentic AI in cities for better services?
A
Well, I think the first one would have to be experimentation. I mean, those of us who have been writing about and studying and practicing innovation in government for the last 20 years have been drumming an experimentation beat. And now that we're in the world of AI, which is built on the world of machine learning, experimentation is absolutely key. You know, building models, then testing those models, and then putting them into the real world and seeing whether or not humans and AI can work together in any productive way. Those are things that are completely ripe for experimentation. So the principle of build, test, learn, which we absorbed over the last 20 years, is completely applicable in the era of AI. People should be just totally upskilling and up-appetiting on experimentation, I would say. I suppose the other thing is I write a lot about this idea I picked up from Lakhani on innovation at scale and population scale, like doing things that help everybody. And I think the AI moment in cities is one where we don't go, how do we help 10 more people get their permits done a little bit faster? But we go, how do we help everybody get their permits done instantly? So I would say experimentation and scale.
C
So I want to come back to, you know, you spoke to our chiefs of staff group recently. But before I do that, Santi, I don't think anybody is as active in the world as you are in experimenting with agentic or other AI tools. So give us an example of your most successful experiment in the city of Boston and your least successful experiment.
B
To kind of touch on something that Mitch just said: this week, we went live with AI search as the search tool on Boston.gov. Historically, we've used a content management system to power Boston.gov, the best government website in the known universe. But we collect feedback from people who use the search functionality, and we were measuring that only 1 in 10 people have a positive experience with search on Boston.gov. So that made us believe in AI. We're students of Mitch's work, and we put it to practice. We were like, we clearly have to be better than 1 in 10. So we started to think: one of the things that's interesting about AI is the possibility of semantic search. We write a lot of information, but with semantic search, if I type in the word "cat," I'm able to find information about pets, because those are words that are related to each other. I would find information that is relevant to me but that would not come up in a traditional search. So we started first with AI search on one page of the website. Then we went to the top 50 pages. But I think that one big obstacle that I see when it comes to adoption of AI, to Mitch's point, is that people still think about pilots. One of the things that you need to be thinking about in a pilot is how this would scale, because if it's successful, you want it to be the norm. So as we started to scale, we started to think, in the pilot stage, we need to collect more data about how this is performing so that we get to the point of deciding whether we are going to replace what we're doing with something new. So we collected tens of thousands of data points of people giving us feedback, and we ended up finding out that we had positive satisfaction 34% of the time, which is still not great, but it's much better than the 10% satisfaction that we had with the traditional search.
And we realized that it cost us about the same amount of money to have both. So we said, there's not a lot of things that make government work 3.4 times better. Let's just switch it. I'll tell you, there have been some unexpected lessons as we switched, because what we realized is we've made a search experience that is better for constituents. But after we went live, we started to hear a lot from our colleagues in City Hall. They're like, hey, I use this all the time, and the search terms that I was used to searching are not working. We're working to make these things better, but we had an asset designed for public consumption that was suboptimal for the public but really optimal if you knew what you were looking for: it was really easy for you to find that thing. So now we're at this point in time where we're actually starting to talk about how we give access to our employees. We're addressing some of the issues, but should we have a separate product and a separate search experience for our employees to be able to find that information? Because we keep finding out that government a lot of times optimizes our own experience at the expense of the experience of the public. And obviously that's not good. And there are a number of different examples where I think the most successful AI projects are the ones where we're able to translate and bring an experience that focuses on how the public experiences government and not what we want to do.
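A minimal sketch of the "cat" versus "pets" point, for readers who want to see why semantic search can find a page that keyword search misses. Everything here is an assumption for illustration: the three-number vectors are made up by hand (a real system would get them from an embedding model), and the page titles are hypothetical, not Boston.gov's actual content or implementation.

```python
import math

# Hypothetical hand-made "embeddings" (axes: animals, streets, permits).
# A real system would obtain these vectors from an embedding model.
EMBEDDINGS = {
    "cat": (0.9, 0.1, 0.0),
    "pets": (0.8, 0.0, 0.2),
    "pothole": (0.0, 0.9, 0.1),
    "permit": (0.1, 0.2, 0.9),
}

# Hypothetical page titles, each tagged with the concept it is about.
PAGES = {
    "Licensing your pets": "pets",
    "Report a pothole": "pothole",
    "Apply for a building permit": "permit",
}


def cosine(a, b):
    """Cosine similarity: close to 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def keyword_search(query):
    """Traditional search: the query text must literally appear in the title."""
    return [title for title in PAGES if query.lower() in title.lower()]


def semantic_search(query):
    """Semantic search: return the page whose concept vector is closest."""
    query_vec = EMBEDDINGS[query]
    return max(PAGES, key=lambda title: cosine(query_vec, EMBEDDINGS[PAGES[title]]))


print(keyword_search("cat"))   # no title contains the literal text "cat"
print(semantic_search("cat"))  # the pets page is the nearest in meaning
```

The design choice worth noticing is that the semantic version never compares the words themselves, only how close their vectors are, which is why related terms can match.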
C
One could think about incremental improvements in service from a bot that answers a single question more efficiently. And if I worked in an agency in Boston City Hall, I might be thinking, as a dedicated public servant, how can I make what I do better? But the truly disruptive opportunity is the kind Mitch mentioned at the beginning, which is we're not just trying to make it easier for one person to get a permit. We're trying to transform the way people get permits generally. That's cross-agency thinking. So if you want to cause adoption in City Hall in the most disruptively positive way, how would you do that?
A
Well, I would start by not using the word disruptive, probably. It's a good question. And we see struggles with AI adoption in all sorts of institutions, not just public sector ones. There's this giant riddle. My colleagues call it the growing gap, which is, if you look over the last four or five years, there's been this exponential increase in the capabilities of AI. You could measure it by parameter size of the model. You could just look at task length. I mean, what you can ask an AI to do and just do on its own has grown in leaps and bounds. And yet organizationally, I don't think most of us in most organizations, unless we're in some AI-native startup or a handful of other organizations that have really transformed, go, oh, this has fundamentally lifted things up. It's been a more linear curve. And so why we have this technology capacity but only this organizational absorption is the riddle of our era. I think there are a couple of reasons for that in government. One of them is people's real ethical and safety concerns. They're not sure which side of this they want to be on and they're not sure where this is all headed. And so I think the way to deal with that concern, Steve, is to be straightforward with people and not pretend like the thing has zero risk whatsoever, or walk in and say this is the best thing ever, don't worry about it. I think it's much more honest and much more helpful to explain and understand the opportunities, but absorb and acknowledge people's worries and risks. So I think that's the first thing I would do: just be honest and open with people, and have enough room for all the various views.
The second thing I would say, and Santi, I don't want to give too much away about the work that we've been doing on the work they did in Boston, is that it turns out if you put these tools in people's hands, a lot of their doubts maybe diminish. And so actually getting the tools in people's hands, I think, is a super-duper important thing. That'd be the core layer: to be intellectually open and honest, but practically say, hey, why don't you actually use this stuff? And if you look at the number of people still throughout government who don't really have safe ways of using these tools or a directed way of using these tools, it's quite substantial. My guess would be that we've over-indexed on training, like we can't give it to people until we bring them through some really steep training, and we've under-indexed on putting the tools in their hands to invite them to good use.
C
When you spoke about AI or agentic to our 40 largest cities convening, how did you tantalize them in a way that caused them to want to increase adoption? How can they accelerate the utilization of tools which will improve services?
A
One thing is, it goes back to your first thing, which is you tell them you could become 10 times more productive or 100 times more productive. There's a real, huge opportunity here with agentic architecture. There is really a chance that we can do things we've been trying to do for years, like make government work better. I think the other thing was maybe some demystification. I belong to the Andrew Ng school of thought, which is that it's not helpful to say agent or not agent, and it's much more helpful to say more agentic. And so what I did with that group, Steve, when you were there, was to say, let's not get hung up on whether the thing I'm building or doing and thinking is an agent. But let's actually take the AI stuff we've been doing and say, can we make it more agentic? Like on some dimension, could we give it more execution capacity? Could we give it more learning capacity? Could we give it more memory? Could we give it more collaboration capacity? Could we give it more sensing capacity? Let's not imagine we have to create some archetypal agent, but just take what we're doing and make it more agentic. I think that invited people to feel like they could really engage on that front.
C
Santi, I haven't had a conversation with you in a month where MCP wasn't in each sentence. So how do MCP servers generally, and specifically in this last Mitch answer, how is that going to accelerate utilization and adoption?
B
So I think that maybe I would start by explaining to people what an MCP server is and why we are excited about using them. MCP stands for Model Context Protocol, and it's a recent innovation. Basically it's a system that acts as an intermediary between an AI system and a resource. It could be an API-exposed system like the open data portal, or a permitting system, or the system to request recreation assets, like to do reservations. It's particularly useful because one of the challenges with using AI tools is that they can be unreliable. We've seen some data and evidence that depending on how people prompt, they get different responses. Once it's triggered, like in an action, an MCP server allows the behavior of the tool to be more reliable and to follow a particular pattern. In our case, what we're interested in is that it could be the kind of solution that enables people to interact with the city using these tools, but in a way that makes us feel like they're going to get more consistency, that it is going to be more secure, and then we would feel more comfortable about the possibility of limiting some of these tools that might be a little bit less secure from accessing our resources. We've been providing access to AI tools in Boston for over two years. And what we see is that there's this, like, geometric distribution. Some people use it all the time, other people use it every once in a while, and the majority of people that have access to it don't really use it very frequently. And that matches a lot of data from other sectors as well, besides the public sector. And to Mitch's point, we generally have thought, well, we just need to train people on how to use these skills.
But if you flip it, we can start asking ourselves, what can we do that would make these tools more useful, faster and more reliably, where you have to be less reliant on making sure that people are following the right instructions, and the tools just behave automatically because they were engineered to work better? So one of the things that we've been doing with you, and thanks to the collaboration of Beth Noveck at the Burnes Center at Northeastern, was building the first, to our knowledge, MCP server. And we started with open data, to be able to answer questions that would lead to innovative approaches to improve the performance of government services. But what does that mean? If I wanted to see what open data sets might help me answer a question right now, I would have to go to the open data portal and search. And again, our data sets are usually coded based on what we think makes sense from an administrative standpoint. So you'll probably find a lot of data sets about PWD, which stands for the Public Works Department, but you'd need to know that. The nice thing about the MCP server is that it allows us to use the API of the open data portal to start searching for data sets that might be relevant to that question. The second thing that you might be able to do is give it the skills to use the same API, well, a slightly different API, to gather information about each data set. What are the fields? What does each field mean? That might help you get more insight about what would be a good data set to help you answer a question. Then the third thing it might do is actually query data from the open data portal to answer these questions. So we've started to use these tools to do things that are complex, to join information from multiple data sets to answer bigger questions.
I'll give you this spoiler of what I was just able to do when we start layering multiple MCP servers. You can add other pieces of infrastructure that can get you to more agentic behaviors. In my case of agentic behavior, I'm not yet quite at the point, Mitch, of having this thing that fully does something for you, but I would like something that helps augment my skill and reduce the amount of work that I have to do for it to behave in a way that makes sense. I started to build, using Claude skills, this agentic toolkit for how I could get my MCP server to give me better quality answers to things that I'm interested in. I was able to get it to use frameworks from the Poverty Action Lab at MIT, from the Bloomberg Center, and from GovLab to answer questions using the best-practice frameworks for public innovation, using my MCP server to gather data and then to present results, like suggestions on how to improve performance on city services.
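The three capabilities Santi describes (search for data sets, describe a data set's fields, query its rows) can be sketched as three small tools behind one dispatcher. This is a hedged illustration only: it is not Boston's server and does not use a real MCP library. The catalog, data set names, rows, and the `call_tool` helper are all hypothetical stand-ins, with a plain dictionary playing the role that the protocol layer and the open data portal's API would play in practice.

```python
# Hypothetical in-memory stand-in for an open data portal's catalog.
CATALOG = {
    "pwd-snow-requests": {
        "title": "PWD snow removal requests",
        "description": "Public Works Department requests about snow and ice",
        "fields": {
            "neighborhood": "Neighborhood where the request was filed",
            "hours_to_close": "Hours until the request was closed",
        },
        "rows": [
            {"neighborhood": "Roxbury", "hours_to_close": 30},
            {"neighborhood": "Back Bay", "hours_to_close": 12},
        ],
    },
    "park-reservations": {
        "title": "Park reservations",
        "description": "Reservations for recreation assets such as courts",
        "fields": {"asset": "Name of the reserved asset"},
        "rows": [{"asset": "Tennis court 3"}],
    },
}


def search_datasets(keyword):
    """Tool 1: find data sets whose title or description mentions the keyword."""
    needle = keyword.lower()
    return [
        dataset_id
        for dataset_id, meta in CATALOG.items()
        if needle in (meta["title"] + " " + meta["description"]).lower()
    ]


def describe_dataset(dataset_id):
    """Tool 2: return each field name with its plain-language meaning."""
    return CATALOG[dataset_id]["fields"]


def query_dataset(dataset_id, field, equals):
    """Tool 3: return the rows where `field` equals the given value."""
    return [row for row in CATALOG[dataset_id]["rows"] if row.get(field) == equals]


# The fixed tool registry is what makes behavior predictable: the AI system
# can only trigger these named actions, each with a known shape.
TOOLS = {
    "search_datasets": search_datasets,
    "describe_dataset": describe_dataset,
    "query_dataset": query_dataset,
}


def call_tool(name, **arguments):
    """Dispatch a tool call by name, rejecting anything not in the registry."""
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**arguments)


print(call_tool("search_datasets", keyword="snow"))
print(call_tool("describe_dataset", dataset_id="pwd-snow-requests"))
print(call_tool("query_dataset", dataset_id="pwd-snow-requests",
                field="neighborhood", equals="Roxbury"))
```

Note that a resident searching for "snow" finds the data set without needing to know the administrative acronym PWD, which is the discovery problem described above.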
A
Can I add one thing to that, Steve? Because something Santi said about skills is super important, and super important for the public sector and for people like us who've been wishing and wanting good people to share their stuff across other organizations for the last several decades. What Santi can do now is essentially make his skills as, you know, my favorite CIO in the country, available so that other people and other people's agents can also use those skills. So we've always been looking for some way to scale what's going on in government, and take great people and scale them, and great ideas and scale them. With these, literally, they're called skills, we actually have these ways now of making Santi, or Santi's accumulated knowledge and wherewithal, available to all the other CIOs in the world right now, which is sort of astounding if you really think about it.
C
Yeah.
B
And to that, just to show what this means: to Mitch's point, there's a lot of work that usually happens to be able to answer a complex question with data and to innovate. I'm able to do the kind of thing that I would do by just saying, hey, how could we improve the performance of the city in addressing snow? And then it will automatically go do all of the kinds of things that I do: temporal analysis, differences by neighborhood, response time across different years. Anyway, so that's the part that is really cool. Anyone can ask a basic question, but doing all of the things that are hard about the analysis, that's the stuff that you could automate.
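As a rough sketch of the routine analysis Santi says could be automated (response time by year and by neighborhood), the grouping step might look like the following. The rows are synthetic and invented purely for illustration; they are not Boston data, and `mean_hours_by` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# Synthetic snow-request rows, invented for illustration only.
REQUESTS = [
    {"year": 2023, "neighborhood": "Dorchester", "hours_to_close": 40},
    {"year": 2023, "neighborhood": "Dorchester", "hours_to_close": 20},
    {"year": 2023, "neighborhood": "Back Bay", "hours_to_close": 12},
    {"year": 2024, "neighborhood": "Dorchester", "hours_to_close": 24},
    {"year": 2024, "neighborhood": "Back Bay", "hours_to_close": 12},
]


def mean_hours_by(rows, *keys):
    """Average hours_to_close for each combination of the given keys."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[key] for key in keys)].append(row["hours_to_close"])
    return {group: mean(values) for group, values in groups.items()}


# Temporal view: how does the citywide average change across years?
print(mean_hours_by(REQUESTS, "year"))

# Neighborhood view: which areas wait longer, and is the gap closing?
print(mean_hours_by(REQUESTS, "year", "neighborhood"))
```

The same helper covers both cuts because the grouping keys are passed in, which is the kind of repeatable step an agent could run without the analyst rebuilding it each time.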
C
Well, so many opportunities. Mitch, let me come back for a second. About, I don't know, maybe two years ago, you did a session with chief performance officers and chief administrative officers of eight of the 10 largest cities in the U.S., and you had a slide that had personas on it, which is actually how I've been thinking about this, because part of the challenge is finding the answer, but a bigger part of the problem is finding the question. So how could we use the personas on your slide, like I'm the transportation director, I'm the sanitation director, I'm the deputy sanitation director, to improve the quality of the inquiry in a way that would lead to more preemptive solutions, better causation, more responsiveness?
A
Yeah, it's an interesting question. So when you say personas, I had sort of anthropomorphized a bunch of AI agents. If I had an AI agent that was the snowplow scheduler, we called it the Snowplow Scheduler, and so we gave it that personality. And I still think it's an open question, to be honest, whether that's the right metaphor. In some ways, it's useful to us as people because we relate to other workers as people. And in fact, some of the AI companies are trying to get us to do this. OpenAI is now having us call their agents at work coworkers. Their new platform for managing agents calls them AI coworkers. Not even agents; they're called coworkers. And so it may be that the personalization and almost anthropomorphization of this is helpful to us, but I'm not actually sure that metaphor survives much longer, to be honest. I mean, if you look at some of the things that have been making waves the last couple of weeks, these agents on social media talking to other agents, it's not clear that we should think about these things like other individuals. They're algorithms or models. So anyway, if we thought of them not as people but as coworkers, I suppose, Steve, to answer your question, you could absolutely construct good question-asking coworkers. I mean, you could construct AI agents that were in charge of trying to help you think about the nature of the problem. And in fact, there are some papers that show that you and your AI could be quite good together at the problem definition stage or the ideation stage. They suggest that where you and your AI might fail together is actually when it gets to execution; the collaboration between you and your AI in execution is tougher. So I suppose we could instantiate them as question askers, you know, problem raisers, and think of them that way.
If we didn't think of them as coworkers, but thought of them as models or algorithms or agents, I think I'd have to think about it in a more statistical way. But I still think they could surface, basically, anomalies and stuff like that.
C
But if they were truly agentic, couldn't they help me explore possibilities, contingencies, hypotheses much more broadly than I could do on my own?
A
Oh yeah. I mean, you could also loop it almost infinitely. You could say, hey, be my unintended consequences agent. Government would always come up with new policies. And then, you know, six months later we go, it's an unintended consequence. It's like, couldn't we see that happening? You could just have your unintended consequences agent, like a thing on your shoulder. And it could just constantly, at every meeting, go, have you thought about the unintended consequences? Here are the system effects. It could be constantly there and you could keep going back and forth with it. So absolutely, you could create vehicles like that. You could have them sit in on meetings.
C
I mean, for sure, I love the answer. Back in the old days, in the last century when I was mayor of Indianapolis, I used to ask somebody to sit in on the meetings and represent all the people I forgot to invite to the meeting.
B
Right.
C
All the people who had a stake who weren't at the meeting. So you could have this little agent say, you represent the following six constituencies that are underrepresented at the meeting. Tell me their views at the end of the meeting at least.
A
Yeah, I think one architectural question, not to get too technical about it, but one thing we want to study is whether, if you decide to do that, it's in fact better to have one agent who's kind of the missing stakeholder agent, or whether you actually want to create six agents, each in a narrow sense, that represent those stakeholders and have them be there. That's an empirical question.
B
For what it's worth, my sense is narrower agents are easier to evaluate. To some extent, I think about agents as mathematical functions: given a set of inputs, what are the outputs that it produces, and what are the ranges of those? So it is easier to chain smaller things that you know are reliable and then figure out how to make it work together. It's kind of like a principle of systems. So that's my take on the answer.
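One way to picture the "agents as functions" view is a set of narrow stakeholder agents, each with a single input and a checkable output, run side by side. This is a sketch under stated assumptions: the agents here are stubs that format a question rather than calling a model, and `make_stakeholder_agent`, `check_output`, and `run_panel` are hypothetical names chosen for illustration.

```python
from typing import Callable, List

# An agent is treated as a plain function: proposal text in, response text out.
Agent = Callable[[str], str]


def make_stakeholder_agent(constituency: str) -> Agent:
    """Build a narrow agent that speaks for exactly one constituency."""
    def agent(proposal: str) -> str:
        # A real agent would call a model here; this stub keeps the
        # output predictable so the evaluation below is easy to follow.
        return f"[{constituency}] How does '{proposal}' affect {constituency}?"
    return agent


def check_output(agent: Agent, constituency: str, proposal: str) -> bool:
    """Evaluate one narrow agent: its output must stay within a known shape."""
    output = agent(proposal)
    return output.startswith(f"[{constituency}]") and proposal in output


def run_panel(agents: List[Agent], proposal: str) -> List[str]:
    """Run the individually checked agents side by side on the same proposal."""
    return [agent(proposal) for agent in agents]


constituencies = ["renters", "small businesses", "seniors"]
panel = [make_stakeholder_agent(name) for name in constituencies]

# Each narrow agent can be tested on its own before being combined.
for name, agent in zip(constituencies, panel):
    assert check_output(agent, name, "new parking rules")

for line in run_panel(panel, "new parking rules"):
    print(line)
```

Whether six narrow agents actually outperform one broad "missing stakeholder" agent remains the empirical question raised above; the sketch only shows why the narrow version is simpler to test.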
C
It's a good conversation. Let me ask just a couple more questions. Santi, you're outsourcing your coding pretty quickly. How much of it will go out? What's the effect on SaaS purchasing of software? Are you going to be the country's leading developer of everything? How do you look at the future here?
B
I think that this is a really exciting time because, generally speaking, there's a lot of work in the government technology space, and things have changed substantially over the past 15 years that I've been a CIO. But I think that we're at the dawn of a really exciting time. Generally speaking, we have been deprived. There's not a lot of technology because the market is limited. There's a limited number of people that go into building technology for governments. But there are things that are transformative. One of our developers, and it seems like we don't have too many developers in City Hall, told me, hey, I think that some of these tools are making me twice as productive as I was before. So coding and some data analysis are areas where we know that AI does produce really significant gains. And the question is, what are we able to do that we haven't been able to do in a really long time, or ever? Because we just have such a large demand of things that we're responsible for. I'll tell you what we're doing about it. One, we're excited, and we continue to work with our teams to figure out what the right set of tools is. Some of our teams are starting to look at tools like Replit or Bolt to build higher-fidelity prototypes, to build things that look like the thing that needs to be built, even if it's not the thing that's going to go into production. It helps our engineers, the people that are actually going to build it, know what it should look like and how it should behave. There are people that are using tools for development, and that again has incredible potential. One thing that we find is needed to make this successful, and that we're working really hard to invest in, is good design systems, where these AI tools are bringing components that can be reused over and over.
Because if you're building more, you also want to make sure that you're maintaining security and you're making it easier for accessibility and all sorts of stuff. The state of New York has an MCP server so that their developers can access their web design system, which is this set of components that they have. And I think that that's just really brilliant and the way to go. And then the last piece that I think is really an enabler is good architectural practice. You want to be idiosyncratic about how these systems need to be configured. Because part of what is exciting and terrifying is, what happens if everybody in City Hall could become a coder? What if anyone could build their own web application? For me as CIO, I think, well, then our role becomes, how do we enable people to deploy these things? But we would only want them to build them in certain ways that we know are secure, reliable, scalable, and that we can maintain, so we don't have a sprawl of all sorts of little applications that then are harder to manage.
A
Just one thing to add here, which, Santi, you have to close your ears for, maybe. People are focused a lot on what it possibly means for efficiency, in terms of Santi's developers can produce a lot more code. Or what you said was, hey, maybe you don't need to buy that SaaS product. You can just make it yourself. But don't lose sight also of the potential benefit here for resiliency. You might have a tool go down and you can make your own. As Santi said, you may not want, at first pass, thousands of tools. You're not sure of the cyber protection and all the rest. But cities have focused for years and years and years on resiliency. Think about what it means now that any city, or anybody in any city, if they need a piece of technology in some situation where maybe they had it and are now deprived of it, or maybe they never had it, can spin it up quickly for resiliency. This could be quite powerful.
C
Let me just push on this for a second, Mitch. So the three of us have probably been exposed to more young software entrepreneurs than most other folks, right? Because they find their way to us and they're in the...
A
I would say not compared to like venture capitalists in Silicon Valley.
C
No, no, no, not the people with the money. In government, you know, and in academia. But there's a little bit of anxiety in procurement about buying cool stuff from cool people without having any idea whether that'll survive six months or eight months or whatever. Should your last answer affect a city's appetite for risk in the acquisition of AI-powered software?
A
Not that much. I'm all for startup entrepreneurs helping us in this space, so I don't want to be discouraging of that at all. But I think the biggest risk if you're going with some new vendor you don't know enough about is what's happening with your data. And I don't think my last answer implicates that that much. If I was exploring some new startup selling some new AI thing, I would want to be super clear about what's happening with the data, not only because I want to know it's not leaking, but also because, if it's going to be a source of our advantage, I don't really want to give it away to some company either. So I think the resiliency point still stands, but I don't think it really suggests, oh, then it's fine to go willy-nilly with whatever startup because you can always make your own thing if theirs doesn't work. I don't know what Santi thinks about that.
B
Yeah, I think again we benefit when there's more competition and options for solving problems. I think that what is exciting about this time is the barrier to entry is lower. So we hope that it means that we're buying better quality, lower cost, higher performing systems, to some extent also because we can go and build our own alternatives. But then we get the choice of deciding, you know, if we've built a procurement tool, we can deploy it with our own tools or we could go and refactor it. But I think that is where you just have to be mindful about vendor lock-in: are you building things in a way in which you can go and switch and change your answer later? And I think that is a thing to keep in mind around how you're choosing to architect and how you're choosing to deploy. And again, for me the piece that is interesting is continuing to be grounded in what the human problems are that we're trying to solve with the technology. It doesn't matter if it's AI or not. No one buys a tool because it is built in .NET or JavaScript or C. That doesn't matter. Does it work? Is it easy to use? Is it reliable? Is it secure? Those are the kinds of things that matter. I think that the same thing applies with AI.
C
All of us who are aware of your reputation know that you accomplish a lot on a day-to-day basis, right? Really practical experimentation, to start with Mitch's point. So if you're talking to the other hundred cities who may listen to this podcast and who want to be like you, what one piece of advice do you have?
B
I want to be like the other people. I keep learning from them. But it's just trying to find that balance of trying things. I think that following Mitch's framework of embracing the possibility, right, there's a type 2 error in government. There are things that we should be doing that we're too afraid to try. We should be trying to do things. To try, you don't need to know how it's going to work, or if it works yet. You should start thinking about it. If it looks like it's starting to work, for sure. And then on the flip side, I think that's the other problem: we should be moving beyond pilots. If we think that something works, we should be figuring out how we put this in production. I think that the recent conversation that we had with this group of chief data officers and chief performance officers this past week with you, Steve, makes me think, like, sometimes we've built a tool that enables more people to access our open data in ways that are meaningful. We spend hundreds of thousands of dollars, if not millions of dollars, with all of the infrastructure and investments to be able to collect the data and make it public. And if it cost us $20,000 or $50,000 to make it more accessible and useful, sometimes I think we get stuck there and then we don't make that investment. That's crazy to me. Public things are there to be used, and if we find ways of making them useful in ways that are reliable and secure, we should. So I think that we get a little bit sidetracked and are then not willing to commit the resources to solve the actual problem at scale.
C
Great answer. And then, Mitch, you're not in the Kennedy School of Government, you're in the business school. I mean, you have a government background, but you're in the business school, which I think is a terrifically helpful perspective. So if you looked at large bureaucratic private organizations that have some of the same problems in terms of customer service, do you have any lessons? Anybody you'd want to call out? Boston's got a really talented mayor and she's set a culture that allows people like Santi to do stuff. But if you look to the private sector, is there an example we should be paying attention to?
A
I appreciate the sentiment of your question, but I never like conceding that the public sector should learn from the private sector. It can go both ways, and I don't like putting one over the other. But if it's just about, look, there's lots of big organizations out there struggling with the same riddle, like, how do we actually absorb the technology productively and transform ourselves? I would say that some of the lessons are around leadership. Like, you do actually need people from the top who beat the drum that, like, we have got to change. The world's changing around us, and we have to change so that we're ready for our residents and all the rest. If you look around, you'll see leadership matters. I think you'll see that a focus on measurement actually matters. Evaluation, like return, matters. And I think the best organizations are monitoring that and trying to make sure that they're getting the most of this because it builds a steamroll. I think that what you'd find is, in some cases, those that were data ready in the first place, which is very apropos for this podcast, have been able to take advantage much swifter because these tools leverage the data lakes and all the rest. They don't need it to be all in one place, but it helps if it's not all in 77 different places. So I would say leadership, evaluation, and some foundations like good data and good data investments, those things will help.
C
I wouldn't want to concede that one sector is better than the other. The private sector may feel an urgency to change more quickly just because, you know, the customers who have a choice of cereal are a little bit different than the customers who have to move out of Boston in order to improve services.
A
Then here's one last thing to add, if you want to harness competition. In an age of AI, nation states are going to be pitted against each other, I think, more than ever before. And so for governments to be successful and resilient, and for their populace's systems to be supported, is going to be more important than ever. So governments are going to need to move swiftly. They're not without competition. They distinctly have it.
C
Great answer. And then, Santi, if somebody listening to the podcast has an extra billion, we should invest in MCP servers that compare performance across large cities so residents can create a little internal competition. We'll do that next. This has been great. You two are terribly insightful and great contributors to improving the quality of public services. My thanks to Mitch Weiss and Santi Garces for your time today.
A
Always great to see you, Steve. Santi, likewise.
If you like this podcast, please visit us at datasmartcities.org and find us on iTunes, Spotify, or wherever you get your podcasts. This podcast was hosted by Stephen Goldsmith and produced by me, Betsy Gardner. Thanks for listening.
Title: Agentic AI Comes to City Hall
Date: March 11, 2026
Host: Stephen Goldsmith (Bloomberg Center for Cities, Harvard University)
Guests: Santi Garces (CIO, City of Boston) and Mitch Weiss (Professor, Harvard Business School)
This episode explores the arrival and impact of agentic AI in municipal government, focusing on tangible experiments and philosophical shifts needed to unlock AI’s full potential in cities. The discussion ranges from successful (and less successful) pilot projects in Boston, to broader questions about adoption, organizational capacity, technical architecture, and the interplay between the public and private sectors.
Experimentation is Essential
“Now that we're in the world of AI, which is built on the world of machine learning, experimentation is absolutely key.” (01:45 – Mitch Weiss)
Thinking at Scale, Not Just Pilots
“The AI moment in cities is one where...we don’t go, how do we, like, help 10 more people get their permits done a little bit faster? But we go, like, how do we help everybody get their permits done instantly?” (02:45 – Mitch Weiss)
AI Semantic Search at Boston.gov
“There’s not a lot of things that make government work 3.4 times better. Let’s just switch it.” (05:18 – Santi Garces)
Scaling Beyond Pilots
The "Growing Gap": Tech Capacity vs. Organizational Absorption
“Why do we have this technology capacity but only this organizational absorption is like the riddle of our era.” (08:33 – Mitch Weiss)
Addressing Concerns—Openness, Safety, and Experimentation
“We’ve over indexed on training...and we’ve under indexed on putting the tools in their hands to invite them to good use.” (09:43 – Mitch Weiss)
Defining Agentic AI
“Let’s not imagine we have to create some archetypal agent, but just say take what we're doing and make it more agentic.” (10:59 – Mitch Weiss)
MCP Servers: Unlocking Consistency and Security
“An MCP server allows...the behavior of the tool to be more reliable and to follow a particular pattern…more consistency, more secure.” (12:09 – Santi Garces)
“We actually have these ways now of making Santi available or Santi's...knowledge...available to all the other CIOs in the world.” (16:36 – Mitch Weiss)
From Agents to Co-Workers
“Some of the AI companies are trying to get us to do this…OpenAI calls their agents at work...co-workers.” (19:09 – Mitch Weiss)
Agentic AI for Stakeholder Representation
Democratization of Coding
“What happens if everybody in city hall could become a coder?...Our role becomes how do we enable for people to be able to deploy these things?” (24:48 – Santi Garces)
Resiliency as a Hidden Superpower
“Think about what it means now that...anybody in any city, if they need a piece of technology...they can spin it up quickly for resiliency.” (26:20 – Mitch Weiss)
AI Procurement Risks
“The biggest risk if you're going with some new vendor you don't know enough about is what's happening with your data.” (27:32 – Mitch Weiss)
Advice for Other Cities
“There's a type 2 error in government. There are things that we should be doing that we're too afraid to try. We should be trying to do things.” (29:54 – Santi Garces)
Public vs. Private Sector Lessons
“Those that were data ready in the first place have been able to take advantage much swifter...” (32:09 – Mitch Weiss)
On Agentic AI Philosophy:
“Let’s actually take the AI stuff we've been doing and say can we make it more agentic?”
(10:57 – Mitch Weiss)
On Unexpected Impacts of AI Pilots:
“We’ve made a search experience that is better for constituents. But...we started to hear a lot from our colleagues in City Hall...the search terms that I was used to searching are not working.”
(05:38 – Santi Garces)
On Democratizing Development:
“What happens if everybody in city hall could become a coder?...Our role becomes how do we enable for people to be able to deploy these things?”
(24:48 – Santi Garces)
On Leadership and Urgency:
“You do actually need people from the top who beat the drum that, like, we have got to change.”
(32:13 – Mitch Weiss)
On Using AI to Anticipate Unintended Consequences:
“You could just have your unintended consequences agent, like a thing on your shoulder...at every meeting go here, you know, have you thought about the consequences?”
(21:09 – Mitch Weiss)
This episode offers a rich, candid exploration of how agentic AI is transforming city governance, both in pragmatic detail and philosophical orientation. The conversation moves from tangible advances in Boston to broader lessons on risk, leadership, experimentation, and the future architecture of public services—emphasizing that the true power of AI lies in scaling benefits to all residents, not just incremental departmental wins.