Summary8 min read

Podcast Summary

Podcast: Latent Space: The AI Engineer Podcast
Episode: ⚡️ Ship AI recap: Agents, Workflows, and Python — w/ Vercel CTO Malte Ubl
Host: Latent.Space
Guest: Malte Ubl, CTO of Vercel
Date: October 31, 2025
Theme: A deep dive into Vercel’s innovations from the Ship AI conference, focusing on AI agents, workflows, open-source frameworks, Python support, and the evolving role of AI engineers.

Episode Overview

This episode recaps Vercel's Ship AI event, with CTO Malte Ubl offering a comprehensive look at Vercel’s latest moves in AI engineering—particularly around AI agents, workflow abstractions, product strategy, and their approach to open source and Python. The discussion explores how Vercel builds for AI engineers, the practical realities of durable workflows, agent-powered developer tools, the challenges and philosophies of framework design, and lessons learned on leadership and organizational transformation in the AI era.

Key Discussion Points & Insights

1. Vercel’s Vision for AI Engineering

[01:16] – [03:19]

Malte Ubl outlines Vercel’s commitment to AI engineering, not just by riding hype, but by building concrete abstractions and products ("We’re the biggest fan of the AI engineering movement... being very concrete about, like, agents are very exciting and you can actually build them." – Malte, 01:16).
The Ship AI conference centered on making agent and workflow development easier with durable, composable abstractions—anchored in building real apps as a way to ground product decisions.

2. The Rise and Rationale of Durable Workflows

[03:19] – [08:06]

The conversation delves into why workflows matter and how they're underappreciated and under-taught, despite their ubiquity in production systems.
- "You don't really learn about this in CS classes. ... It's not really a unit of compute and storage that is taught—it's emergent from Uber and Stripe and everyone else." – Host, 03:19
- Malte details the historical roots and how every serious transactional system (from '70s banks to modern tech giants) reinvents some form of workflow abstraction (“…there either is an explicit abstraction for workflows or someone made one ad hoc because otherwise the thing just doesn’t work.” – Malte, 04:48)
Vercel’s new Workflow Development Kit lets developers express long-running, resumable tasks in idiomatic code, handle human-in-the-loop steps with webhooks, and do this efficiently, open-source and cost-free.
- "You can run compute for infinite amount of time... Automatically retry them, stuff like that. That makes things more reliable." – Malte, 05:17

3. Philosophy of Open Source and Product Design

[06:47] – [08:01]

Malte positions Vercel's open-source strategy as maximizing the collective benefit and reach: "Our strategy is to grow the pie while what we actually see is that our pie piece is a relatively constant size in proportion to the pie... it’s a business model that has more winners than the alternatives." (06:47)
Production Considerations: Even when most users won’t self-host, open-source code gives teams a sense of control, auditability, and optionality (“you want to check the checkbox 100%... I actually do think people will run it themselves and that's great.” – Malte, 08:01).

4. Patterns & Principles Behind Vercel’s AI SDK and Frameworks

[09:26] – [17:47]

AI SDK’s Success: Attributed to humility in abstraction—keeping things low-level and letting patterns emerge from real usage rather than assuming what users want.
- "If you put a very thick abstraction, then it's probably going to be the wrong abstraction. So you have to be humble..." – Malte, 11:44
Contrast with Big Labs: While some AI labs push for high-level, “let-the-model-drive-everything” architectures, Vercel prefers practical, human-first tooling: "We are at the jQuery era. We don’t know what we want yet. We're building all these tools to make the smallest possible things easier." – Host, 13:22
Dogfooding Principle: Vercel never ships an abstraction they haven’t used themselves, ensuring their frameworks are battle-tested (“Dogfooding is ultimately the thing. ...There's this constant feedback loop where if you don't have that...framework builders are usually not application builders.” – Malte, 17:47).

5. Agent Use Cases — Product and Internal

[18:33] – [29:44]

Distinguishing Agent Types: Vercel has agents as product (Agent-as-a-Service, e.g., Vercel Agent for DevOps and observability) and custom internal agents for their own ops/sales/support (“We distinguish between agent as a service and stuff we run internally. They are not the same thing.” – Malte, 18:49).
DevOps/Observability Agent: Anomaly detection triggers the agent, which uses observability queries and log analysis to diagnose issues, acting as a "coworker that has no sleeping problems in the loop" (21:57). This enables aggressive anomaly detection and smarter escalation—“It’s just so much easier than doing it yourself.”
Where Agents Shine & Don’t:
- Agents excel at tedious, repetitive, judgment-light tasks (“Ask your company: what do you hate most about your job? ...These problems are probably easy enough for a current generation agent to handle.” – Malte, 26:15)
- Use cases open-sourced: sales lead qualification, abuse analysis (pre-work for human review), and a structured data analyst agent.
- Agents aren’t yet safe for high-risk actions (e.g., firewall changes, DNS migrations: "I would not let the agent do that yet. Right. It's just too dangerous." – Malte, 25:47)

6. Forward-Deployed AI Engineering & Customer Empowerment

[29:55] – [31:41]

Agent on Every Desk: Vercel’s program offers direct engineering support to help customers build their first AI agents—accelerating adoption and collecting valuable product insights.
- “As a startup, I just want to see the open source project...as a large company, it’s daunting to ship the first agent; so something like a forward deployed engineer does help.” – Malte, 29:55

7. Python Support, Fluid Compute, and Language Agnosticism

[32:13] – [35:26]

Expanding Beyond TypeScript: Vercel now supports Python zero-config for prominent frameworks (Flask, FastAPI), released a Python SDK, and is making investments to ensure parity in experience ("It is also on a Fluid Compute program... you only pay when you have compute." – Malte, 33:25).
No Language Wars: Vercel doesn’t take sides, but recognizes supporting both Python and JavaScript/TypeScript is table stakes for serious AI platforms (“Honestly, obviously I don't really care. I think both communities are very relevant, very large, and we are investing in supporting them.” – Malte, 34:27).

8. Leadership, Org Change, and Secure Agent-Native Infra

[35:26] – [41:05]

CTO Reflections: Malte shares lessons on evolving Vercel through the AI revolution, stressing the importance of "playing to your company's strengths" and building native-feeling AI products (“You have to be honest with yourself. What product...does kind of extend naturally from what I’m doing, rather than ...something that’s entirely different.” – Malte, 36:04)
IC vs. Management Track: Inspired by Google’s promotion ladder, Vercel keeps strong contributors on the IC path rather than forcing management (“You have to be willing to live in this world where you don't make your strongest engineers have the choice of not making more money or becoming a potentially very bad manager.” – Malte, 38:19)
AI for All Builders, Not Just Developers: Designing agent-native infra that’s secure “even if the developer is incompetent” and separates sensitive logic (auth, data-level access) from the AI-generated/modified apps—preparing for a world of code built by designers, PMs and agents (“Assume the developer doesn't know what they're doing and ...they’re using an AI that also doesn't know what they're doing. ...A way to build an app that is secure even if the developer is incompetent.” – Malte, 39:32)

Notable Quotes

On practical AI progress:
“Agents are both extraordinarily effective and still very ineffective. ...You have to find the right problems. And then when you find the right problems, they are super magical..."
– Malte Ubl, 25:57
On workflows and abstraction:
"When I run a bank transaction system in 1975, then I invent this, right? ...There's a version of [workflows] at every single company that has been doing anything in computer since 1950."
– Malte Ubl, 03:46
On humility in product design:
"We know absolutely nothing and we still know absolutely nothing. ...If you put a very thick abstraction, then it's probably going to be the wrong abstraction."
– Malte Ubl, 11:44
On open sourcing critical infra:
"Our strategy is to grow the pie...it’s a business model that has more winners than the alternatives."
– Malte Ubl, 06:47
On AI agent limitations:
"I would not let the agent do that yet....It's just too dangerous."
– Malte Ubl, 25:47
On secure app platforms for the AI era:
"We are very deeply working on a way to build apps that follows the threat model that assumes the developer doesn't know what they're doing and also they're using an AI that also doesn't know what they're doing."
– Malte Ubl, 39:32

Timestamps for Important Segments

| Timestamp | Topic | |-------------|------------------------------------------------| | 01:16 | Vision for Vercel & AI Engineering | | 03:19 | Hidden history and import of workflow systems | | 05:17 | Infinite/lazy compute in Vercel workflows | | 06:47 | Vercel's open source/product philosophy | | 09:26 | Why Vercel’s AI SDK succeeded | | 13:22 | Agent abstraction differences (labs vs frameworks) | | 18:49 | Agent as service vs Internal agent use-cases | | 21:57 | DevOps/Observability Agent explained | | 25:57 | Agents: “Extraordinarily effective, still ineffective” | | 26:15 | Where agents excel (tedious/boring tasks) | | 29:55 | Agent On Every Desk: Forward deployed engineering | | 32:13 | Python zero-config & SDK, Fluid Compute | | 35:26 | CTO/leadership evolution in AI orgs | | 39:32 | Agent-native, secure-by-default app design | | 41:43 | Wrap-up and closing reflections |

Memorable Moments

Dogfooding frameworks: Vercel insists on building real products with their own abstractions, ensuring frameworks aren’t “ivory towers.” (17:47)
Practical AI agent boundaries: Malte recounts giving agents aggressive anomaly detection because “they don’t sleep,” but draws the line at automating DNS migrations (“That would be AGI.” – Host, 25:50)
Embracing AI coding for all: The future is apps (securely) “vibe-coded” by designers, PMs, and agents—not just traditional engineers. (39:32)

Conclusion

This discussion provides a masterclass in pragmatic AI engineering, open-source strategy, and product-driven abstractions. Malte Ubl’s insights illuminate the messy, emergent, and exciting path from “hype” to real agent-powered workflows—grounded in real usage—and lay out how Vercel is evolving to empower all builders in the era of AI.

For more resources, open source code, and detailed show notes, visit latent.space.

Loading summary

Transcript84 lines

[00:03]
A
All right, we are here in the remote studio. Thanks again to f.in for lending us this space with Malta Ubo, who is CTO of Vercel.
[00:11]
B
Welcome. Hey, how's it going? Glad to be here.
[00:12]
A
Did I get it right? Ubo? I've actually never pronounced it out loud until like just now.
[00:15]
B
Yeah, it was completely perfect. It rhymes with Google.
[00:17]
A
Ah, okay. So perfect that you worked on search at Google and Amp and Wiz, which I think still people don't know enough about Wiz.
[00:26]
B
It is like no longer a secret. But yeah, you can't use it. So like unless you work at Google, in which case you probably know what it is. Otherwise there's no reason to really know.
[00:35]
A
Anyway, suffice to say that you are responsible for a lot of the web as it is today. So thank you for spending some time with us. You're also obviously now building the next web, as we say, with Vercel. And we can cover framework defined infrastructure. I think you probably saw I have a lot of interest in self provisioning runtimes. We can cover v0, but here really this part is recorded right after you did ship AI, which we're trying to sort of recap, right, for the general lanes space audience who may not be watching Vercel as closely as I do or you do. So basically just generally what I guess is your message to the broader AI engineer audience on what Vercel is doing with AI.
[01:16]
B
Yeah, I think the super high level view is that what we're really trying to do is we're the biggest fan of the AI engineering movement and we are also fans of. We're not just going super hard on hype and the big ideas and talking about things, but like being very concrete about like, you know, agents are very exciting and you can actually build them. Right. And so like I think our entire conference was about both making that easier, right. And discovering the right abstractions as we're kind of figuring out what people actually want to do. Right. Which is emerging as we speak. Then the way Vercel always does these things is by building things ourselves, right? And so that is both in terms of products, so agents that are products that you can purchase from Vercel and stuff that we do basically in our back office to make our own operations more efficient. And so this kind of building of apps lets us ground kind of what we do in that reality and then, you know, kind of extract the abstractions that we feel are really helpful to then put that on the road. And I think the probably most talked about Thing that we shipped at the conference was our new workflow development kit, which really, really is just a way to make writing like workflows like very idiomatic as something that just becomes kind of first class as something you do every day. You think about it, you know, write 15 design docs just because you want one of them. It's just something you do literally every day. I think like since your audience also more generally like in the thing, probably like listening to what people talk about. I think that's been a lot of talk about our virtual development kit. But also like more generally, what is, what are workflows, what are agents, how are they related? Do you use one or the other? I would actually love to talk about that as well. But like obviously like in our, in our conference, we basically introduce just like what we hope is by far the easiest way to make your, you know, make your agents something that is easily embeddable into complex workflows and to make those workflows durable, zoomable, streamable and so forth.
[03:20]
A
Yeah, I mean, as listeners might know, I have a long history of workflows at temporal. And I think what's weird is a lot of people are discover this for the first time. You don't really learn about this in CS classes. You don't really learn about this in like bootcamps or anything like that. Because it's not really a unit of compute and storage that is taught. It's kind of like emergent from Uber and Stripe and everyone else. I don't know if there's a version of this at Google.
[03:46]
B
I mean there is, there's a version of this at every single company that has been doing anything in computer since 1950. Yeah, but what's not necessarily the case that it has been abstracted in any way, right? But like when I run a bank transaction system in 1975, then I invent this, right? Maybe I'm in pure batch processing world and I kind of avoided it, but the reality is that I built this, right? And so either I use something very productized, which obviously temporal innovated on that being a thing, or I use something that's ad, hoc, right? So I go and say, well, I need to, I need some kind of queue that tracks the work and I need a database to store the state at any given point. And I don't know, maybe I write a con job that makes sure that the stuff on the queue doesn't get stuck, right? And so I think what's extremely common in essentially every transaction processing system that has ever created is that there either is an explicit abstraction for workflows in it or someone made one ad hoc because otherwise the thing just doesn't work.
[04:49]
A
Yeah, yeah, totally. So the headline thing for people who maybe haven't dived into workflows enough is that you can sort of wait and resume code. So it's as though the serverless function is kind of indefinitely long running. Like literally you can run an infinite loop inside of your serverless code and that breaks a lot of people mental model if they don't really understand that the code pauses and resumes and you can wait multiple days and it doesn't matter, it doesn't cost anything. Actually, I don't know if it doesn't cost anything, maybe I don't know if you charge.
[05:17]
B
No, it literally does not cost anything. So yeah, you can run compute for infinite amount of time. You can then also whenever like one of these steps fails, automatically retry them, stuff like that. That makes things more reliable.
[05:32]
A
Yeah, and so like it. It has a lot of parallels to long running orchestration problems for agents. If you want to do human in the loop as well, it's a simple task of waiting for. What's this API that you guys have? It's not, I want to say signals, but you have something like resolve webhook or something.
[05:48]
B
Yeah, I think we. It's a little bit. It's similar to a single but like the idea is that you basically you make a webhook like an ephemeral one which is just a URL that you can ping. And so the realistic flow would be you reach that step where you want human approval. You know, let's say you get the webhook UL and you write it to some database and let's say the user now they log into their computer, two hours later they have a queue of things they need to approve. That's from the database, they click on one, they say approve. Now what just happens is that that system now calls that webhook and then from the perspective of the workflow that you originally implemented, you were now able to await that webhook and now it resolves and you can just proceed with the program.
[06:28]
A
Yeah, it's very elegant. I would say that it eliminates some complexity that we introduce at temporal and that's probably for the better. And the other thing, I think just obviously as someone who is in this space a lot and I've seen all the solutions, you made it open source, which is another above and beyond thing. You could have made it proprietary, but.
[06:47]
B
You didn't Yeah, I think the way we think about Vercelli, I don't want to go too deep into that, that's a tangent. But I think about open source as having essentially three business model. The first one is red Hat, where you just self support, you think it's open source. The second one is open core, where you are the only one that gets to monetize it, but everyone else gets to run it if they want. Right. And Vercel maybe has not invented this, but certainly kind of is the most successful at a model where you say, okay, I have this software library and it's truly open source, everyone can run it. It comes with adapters for every place on the planet and that makes it really popular. And then we get a piece of the pie. And so our strategy is to grow the pie while what we actually see is that our pie piece is a relatively constant size in proportion to the pie. Right. And so we can drive the open source project. And so that's why like, you know, I don't want to say we're like, you know, we're, we're in it for the, for the business model, but like, I think it's a business model that has more winners than the alternatives.
[07:41]
A
I think also something that if people are seriously evaluating for production workloads they care about because, and I ran into this in temporal, like these are going to be extreme, extremely valuable workloads that you're going to put on workflows. And so you want some ownership, you want some auditability in practice, like who's going to actually run it themselves? Probably not, but you want the option, you want the check, you want to.
[08:02]
B
Check the checkbox 100%. Yeah. And I actually do think people will run it themselves and that's great.
[08:07]
A
Awesome. So that's workflows, by the way. I think the most disgusting is the user directives. You know, use cache, use no memo, use nemo, use whatever. So fun. So fun. What's your. I don't know if you ever take on direct in general.
[08:19]
B
I don't. To be honest, I don't feel super strongly. I do find particularly inside of the workflow dev kit. I find the use pretty elegant. I could imagine other ways of doing it. I think we did post a blog post about all the alternatives we considered because there are some that are. That you think about after five minutes and after two hours of thinking about it, you realize, yeah, maybe this isn't such a good way to do it, but there could be other ways of doing it. I think we're working with CC39 to bring decorators into more places, which would kind of make this literally the same. The same thing would happen above the.
[08:55]
A
Function instead of below the function.
[08:57]
B
So it's not a big difference, but it would become, you know, for example, you could make it part like typescript. Be aware of it without a Typescript plugin, which you already provide. Right. And so literally I was wipe coding this thing on Sunday and I said, okay, Claude, you have no idea what news workflow is because it came out on Thursday, but here's the docs. And by the way, you know, put it on my side and yeah, so it did. It installed a typescript plugin for me so I had like the perfect ex and it just worked from scratch. So that, yeah, was great.
[09:26]
A
Okay, awesome. So we can come back to a workflow anytime you want. But I just wanted to keep moving on all the stuff you announced. We should probably also just touch on the isdk. I know you're not as closely involved to that team, but obviously one of the most successful open source projects. I mean, obviously Vercel is very good at frameworks, but I think it was not a given that AI SDK would be a winner because of LangChain, because of Mastra, because of everyone else trying to get its spot. Except that you guys have the perfect package name. So I think that helps a lot.
[09:57]
B
I actually don't. I'm not sure how well how much that helps, but it's great. Like there's a fun background from what people thought AI was for 10 years ago. But yeah, I think, you know, we announced version 6 beta and I think the big, I mean it's not really news because these things are open source and you can follow them very closely. Right. But what it does introduce as a stable feature, because it's already as kind of experimental in ASDK5 is a direct agent abstraction which so far wasn't there. Right. People would build agents with aisdk, but they would have to do it in Bitmo baboons fashion. What I do want to mention is actually because you mentioned it's very successful, which is true. And I think the, the reason why it's successful is because we constrained ourselves to be humble about what we know our users might want to do. The example I like to give, when you build a new web framework in 2025, you know exactly what people are going to do. It's such a well explored space as the person doing it probably has done it three, four times in their life. And Failed and, and learn from that and try the other things. Right. It's so mature. Like, it's the most mature thing. Even 10 years ago. That was also true. Right. That's why Next JS is so good. Because when Guillermo started building it, he knew exactly what to do. He knew exactly what the app would be. Almost nothing has changed. So the AI app space is the absolute opposite. Like, we know absolutely nothing and we still know absolutely nothing. Things are emerging but, like, but it's, but we're so early and so if you put a very thick abstraction, then it's probably going to be the wrong abstraction. So you have to be humble and say, okay, I need to stay low level so that this can be flexibly used as trends emerge.
[11:44]
A
Right.
[11:44]
B
And so that's why we didn't have to rewrite a SDK when everyone went from writing chatbots to writing agents, because we stayed at a level where that, you know, almost looked the same. Just stuff people would on top. And I think that's why, that's why it's successful, because it doesn't, you know, we didn't say, okay, we know what the apps are going to look like. And we go to do this like, Hollywood principle. Don't call us. We call you style framework, where you just have to fill in the blanks. It's super structured, you'll be happy. Right. We didn't do that. Even though that was so in our DNA, we did have to really restrain ourselves. But that's why it's successful, because it's so low level. And so that's why, on the other hand, that's why we don't have an agent abstraction yet. Every other competing library leads with that. Right?
[12:28]
A
Yeah. Like OpenAI's SDK day one, it was like, exactly.
[12:32]
B
Master, et cetera. And you know, I mean, I'm not saying it's bad, but. And obviously that's more accessible right now. You know, you have to understand the agent's tool for the loop. What do I do? I use the stream text function and give it tools.
[12:43]
A
Okay.
[12:44]
B
We added all kinds of like, control already in a SDK version 5, where you can prepare the step, you can select the tools on every loop, you can like do all these things in a pretty advanced fashion. Half of those other frameworks are built on top of AI SDK anyway. Right. Like, so that it forms the basis. And so what we're doing now is we're building, bringing what is emerging as the patterns that people built over and over again as abstraction into the library as the user just kind of are solidifying.
[13:13]
A
Yeah. I have interviewed enough agent framework builders and model and big lab model people that I actually find that I can push back on you.
[13:21]
B
Okay, go ahead.
[13:22]
A
It's really interesting because I think you are saying, basically you're saying we will be at the jQuery where kind of at the jQuery era. Right. Like, we don't know what we want yet. We're building all these tools to make the smallest possible things easier and then it composes up and we're just starting to emerge of agents. I would say that the big lab people are obviously the opposite, but they're not. They're coming at it not from like a DX point of view. They are. They're very big model pills. They want everything to go through the model. The reason they want the Hollywood principle of we'll call you is because they want the model to control the tool, calls the reasoning, what have you. And I feel like there's a mentality, obviously you can have frameworks that do both, but there's a mentality in the big labs, if you work the big labs, that you always want to give the wheel to the model. And then for you guys as framework developers and people who are software builders, it's more comfortable to build like the smallest possible thing instead of like the sort of AGI thing, if that makes sense.
[14:22]
B
Yeah. I actually don't think about it in those dimensions. I think the like, as I 100% agree that people, I think, have to be willing to let go and let the tool. Sorry, let the model kind of take control. Right. And to get emergent behavior. And certainly on coding agents, that works incredibly well. But that I'm totally on point with. Right. And that AI SDK does this very well today. Like, all the agents that I've personally built work like this. That's not the saying. But the other thing is like, how do I now embed this into an application? Right. Like the model apps couldn't care less because they're not really building applications. And so that's something that a company like Vercel thinks about a lot. Like, again, what does the developer actually want to express and how do we let them do it? And so I think one of the key things that people wanted and still want is they want streaming because these models are slow. And so suddenly this almost obscure sub genre of programming where people are like, it's 500 milliseconds, I'm just going to not ship it. Right. Suddenly it's 30 seconds and becomes absolutely Important. And so we gave people the tools to build streaming applications in a way that feels intuitive. Right. And I think that unlocked a lot of value there because that was genuinely hard and we made it easy. And so those are kind of the things that we're looking for that are not obvious when you're kind of mostly concerned about the AI part.
[15:46]
A
Fair enough. And I think the design space has more dimensions than what I try to simplify it down to just one more thing on AI SDK and then we can move on to the other agent stuff. And obviously you guys announce so much. It's so hard to cover. So Vercel is a house of frameworks. Right. You have so many framework authors, all of them legends in their own. Right. What's one philosophy that you're also applying from all your years and all your people who work on frameworks. Right. That is informing you? I have one. And feel free to counter propose, which is what Sebastian Markboger, who is obviously the tech lead of React, used to say, which is have a small API surface area. I feel like that has maybe been not as important or there are other overwhelming priorities. But I just want to get a sense of what governing principles really resonate with you.
[16:36]
B
Yeah. Sep and I are talking about this a lot. I think I'm often representing the kind of enterprise side where it's like, no, but I actually want to just control this.
[16:46]
A
Give me one API for this. Just one more, bro.
[16:49]
B
I want to be in control and I want to be able to configure it. And I want to define the defaults the way I see it. But it's good to have tension around these things. I think the thing that coming down from Guillermo is just the absolute founding principle of Vercel is that we never give you an abstraction that we haven't used ourselves. Dog footing is ultimately the thing. Right. AI SDK was extracted from V0, and then we built it and we kind of diverged a little bit. And then we took on the substantial work to bring back V0 actually fully hosted an AI SDK. And we learned back and we made sure that migration isn't too hard, which the users appreciate as well. And so. And so forth. Right. And so there's this constant feedback loop where if you don't have that, which is like. This sounds like so obvious. Right. But the reality is that framework builders are usually not application builders.
[17:47]
A
Yeah.
[17:48]
B
And so they build ivory towers that when they're hypergenious or they get lucky, they happen to be good. But if you want to do this in a reproducible fashion with a high hit rate, then the only thing you can do is you have to try that stuff out yourself. And that's what we do every day.
[18:05]
A
I really like that principle. Obviously a good idea. It's just very hard to practice in real life, obviously because when you are a maintainer of a framework, a lot of bugs come to you and they pile up and you have to spend some time working on framework level issues. Happy to move on to Vercel agent and maybe the agent on every desk program which, you know, I think you're kind of also championing. So like, yeah, let's talk about the use cases. Like you guys use internal agents within Vercel and what emerged.
[18:34]
B
Yeah, let's structure this two ways because I do think there's a difference between kind of the agents that we're building internally versus the stuff that's, you know, Vercel product. Right, yeah. And which you can, we can use today. We.
[18:48]
A
I thought they were the same thing.
[18:49]
B
No, they're not the same thing. That's actually, I think that's quite, quite important. Like we also, we distinguishing between like agent as a service. Right. And so the Vercel agent, that's what it is. Right. It's ultimately an agent as a service product similar to, you know, codecs in the cloud or the cursor agent. Like not, not as in like it's the same product. Right. But as in like these are things where you go somewhere and you say I would like to use this agent and then I don't know, maybe you give them a credit card and it works. Right. Which is different from the stuff run it early. But let's talk about the Vercel agent for a second. Like we, we've been basically, I think our strategy overall is to have an agent that helps you build applications on Vercel. This is, you know, there is some overlap with coding agents. But like I think the, the thing that's unique about the Vercel situation is that because we have, we have your runtime data, we have, we see your error logs, we know where the preview deployments are, that they always will exist. We know how to start the dev server. We already have the secrets. So there can be like a quite, quite integrated solution for something that otherwise can be quite hard. Right. If you ever onboarded, for example, I mean you have onboarded Devon a few times, but you probably have been in that situation where it feels like onboarding a junior employee. Right. And so some of these things, like if you're within the Vercel ecosystem become much more simple. And so in that world we've been chipping away on different things. Right. A while ago shipped a core review agent which I think is really good and well integrated. And the thing that we announced last week is our broadly DevOps agent, which is actually tied to our anomaly detection system. So whenever there's an anomaly that we detect on your production site, it kicks off the agent and the agent does an investigation of what's going on. From a technical point of view, what this agent has is, has several tools. It can make any observability query against your project. So it's a query builder, it can execute the queries, it has a way to read logs obviously in other with Kepharis as well. So what's really magical is that it's just very good at this. By the time you click on the anomaly it will almost all the time just very precisely tell you what happened. It shows you all the graphs that I looked at. I don't know man, it's just so much easier than doing it yourself. It takes away certainly minutes of work. But I think what I'm actually very excited about and I think this is an overall pattern that, that we see with agents is that there in many situations is this what we insert call recall precision problem. And that also happened with anomaly detection. With anomaly detection you have to tune it, right? And you either tune it to be very aggressive and then it fires and worst case it pages you in the middle of the night and nothing was wrong, right? Just a team in Asia sent a newsletter and so the traffic went up, right? Or you tune it not like aggressive enough. And so you miss events. And with an agent you can just say, okay, I'm actually going to have this tuned very aggressively and I'm not waking anyone up, I'm telling the agent. And the agent can take two minutes to run. I'm actually fine with that because no one would have reacted in that amount of time in a very reliable fashion. And now it can actually look at a time series, can look what happened, it can look at the IP addresses that are making the request, it can look at the type of error messages, right? And I can make a call whether to escalate to on call and wake someone up or to say okay, this is completely fine for someone maybe to take a look next day. Which is I think the perfect decision for agents to make. And so that's something I'm very excited about that you have this essentially coworker that has no sleeping problems in the loop, and they get woken up instead of you.
[22:30]
A
Yeah. I think the dream of AI Sre has been a long time coming and I'm actually on the record. At the start of this year, I made a podcast saying, like, oh, I don't think anyone's going to do it, AI Sre. So I'm very excited. I haven't tried it out personally. I have seen you tweet about it and I think, yes, obviously that is the goal. That is the dream we should have make Brian Johnson happy and have good sleep. But, you know, we're not exactly there yet. And I think, like, the question is really fold, which is time series analysis is not exactly within a distribution for language models. There have been a lot of people doing, like, time series models. There's a lot. There's a deep feel of anomaly detection, which is basically what you're doing. And, you know, there's a question about, like, is this a solved problem or how much can we trust it? And then I think the other one is aligning the human preferences. Right. Sometimes I don't know until I've seen a few examples of like, oh, yeah, this one you should wake me up, the other one you should not. And then pretty much when I solve the problem, it goes away. So the next problem, you're always fighting the last war. In sre. I give you a bunch of things. You can take whatever you want.
[23:34]
B
Yeah, I think that you have to try the product. It works really well. So we don't do the anomaly detection in the LLM. Right. The anomaly detection is a separate part of the.
[23:46]
A
For cell products.
[23:47]
B
It's a completely separate part. It's a pipeline that works on our time series database. Right. And launched independently of this. But once you have this. So our experience is that if you give these agents a tool that does queries, it's really good actually, as digging into individual parts of the time series. And the other thing it has access to is logs. And logs are actually. They're just text. Right. So if you see from the time series, like, what happened, can I somehow figure out what this is? And then you head over to logs and do a deeper dive. Now you're kind of more in the world where the model is comfortable. One thing that we don't do today, but we'll do in the future is that we also give the model X that that particular agent access to your source code so that it can first of all figure out what does the error message mean. Right. And every time you can actually make A PR to just fix it, but every so often that'll be possible. And so it would then also do that.
[24:46]
A
Which is why I think people like Datadog and Sentry are trying to do that obviously because they're observability platforms, but they never own the code. And so they're always limited in what they can do 100%.
[24:58]
B
But also what I want to qualify where I'm so happy about how it works, it's a small part of the overall problem. Right. Like we're actually not here to like build the AI SRE that replaces that job function. Like that's another part of what I feel pretty passionate about, that it's at this moment their like agents are both extraordinarily effective and still very ineffective. And you have to find the right problems. And then when you find the right problems, they are super magical and that if you wander beyond then they don't work. And so that's the magic. Right. And what we see is that getting triggered on an increased error rate works well. And certainly making the decision what to change in the firewall, I would not let the agent do that yet. Right. It's just too dangerous.
[25:47]
A
Or do DNS migrations.
[25:51]
B
Yeah, exactly.
[25:54]
A
That would be AGI. Yeah. It's interesting on all that stuff. Yeah. I think that there's this growing consensus of where agents are doing well and where agents are not. I think for me, meeting notes are solved, simple UI changes are solved. I guess. What else in that list of things that are solved and are reliable every day? What do you put in that bucket?
[26:16]
B
Yeah, I think I had a section in my keynote about this and it boils down to this question. Basically the idea is you go around your company and you ask people what do you hate most about your job? And I really think it finds the sweet spot because it finds problems that are, they're boring because they're tedious and repetitive, but they're, you know, they would have already been automated if they were automatable without an agent. In many cases they often like do require some kind of text like mini judgment, et cetera. Right. So like people, people do these things and so that, that's, I think that the, that question, that, that yields a sweet spot where like these problems are probably easy enough for, for a current generation agent to handle. And they're also often very high business impact because this is actually pretty substantial part of people's jobs. Again, that's why they hate it because it's like takes so long. And so we ended up at our conference Talking about three agents that we built internally, two of which we open sourced. Again so people have a starting point because these are custom agents. Like they're not software as a service things that you just install. So these are custom agents but the first one is one that handles processing of our incoming contact, sales requests, lead qualification. And there are obviously a lot of startups in that space. Right. So I think that's very much in the soft case where you give it a tool for LinkedIn, you give it a more generic tool for Google, give it a bit of an objective and de qualify. What do you care about? Right. You give it a way to analyze. Oh, this is really a support request. Okay, Hand it over to the support team. Right. Like there's a few cases like that, it's not so complicated. So that one is I think is perfect and we open sourced that so people can make their own. The other one that we fall into a similar category is abuse analysis. So we get abuser like reports. And so in this case it's really the agent essentially doing the pre work. So we still have a human person like look at the pre work and then make the decision what happens in the end. But what were they going to do? They were going to go to the reported website, right. They were going to go look at the account and figure out what the age of the account is and they were going to see if they paid their bills and you know, whatever. Right. Like there's a list of things they will do. And so what you can do is you can just make it so that when they eventually look at the tickets it already has all this information. And if the page looked like a Facebook login page, then current day LLM is also able to do that like make the adjustment call and then you just quickly check if it's okay and you move forward.
[28:58]
A
Yeah, amazing. And I think there's one more data analyst agent.
[29:02]
B
Yeah, yeah, exactly. I mean this is also something we just wanted for ourselves, right.
[29:05]
A
Is to do we have one too internally. Yeah.
[29:10]
B
And I think that one's also open source. I think the idea is that, that yeah, you want to ask questions against your data warehouse. And we were very unsatisfied with the current solutions because they ultimately didn't have access to enough information about the data model. And like we are not promising that we have the magical tool that you give you your prompt and it spits out SQL, just access your schema. But we essentially developed just a structured way to document the semantics of your data so that then the agent is Good enough. Right. And so we've been using that internally quite successfully.
[29:44]
A
Amazing. So yeah, for those who don't know know Vercel also has an agent on every desk program where you can sort of reach out. And is it like a forward deployed engineering situation where like you have like a SWAT team that comes in and helps people?
[29:56]
B
Yes, I think that's. But it's also not appropriate for every company. Right. So I think the like my take is that it's. If I'm a large company I have a lot of efficiency to gain but it's also quite daunting to ship my first agent. And so something like a forward deployed engineers which we are doing indeed like does help quite a bit it in that scenario. I think like as a startup I don't want a forward deployed engineer in my office. I just want to see the open source project and feed it to cloud code and then give it my own problem and say build me something like that. But like here's what I want different and that should also be successful. Right. So I think we are kind of with this particular program really going for just unblocking people who feel that they just don't know what to do. They hear the highest. Right. They don't know how to pick the right project. We talked about this like how do you actually find the project that's going to be both successful and high impact? And then secondarily, okay, now that I have the project identified, how do I do it? And I think that their forward deployment is effective. It's something that I, as a framework engineer I've thought about all my life. Where you need to have someone kind of guide you the first time you do something and then the second time maybe you build an agent yourself. So we don't want to stay there. Right. We basically sign contracts with companies saying okay, you have to commit to building three agents and if you do we are going to help you. Like we want to build the first one for you and then the second one we are going to be there essentially by your site. Maybe not literally but like on a phone rotation. Right. And for the third one the assumption is that you actually don't need any help anymore and you can still reach out obviously. But if everything went well. Now this is a company that's empowered to build its own custom agents.
[31:41]
A
Yeah. You're going to be helping your biggest customers and obviously that's going to be leading to a lot of good products ideas. Right. It's kind of dogfooding at scale with the people in the Vercel ecosystem and.
[31:53]
B
Not just Vercel alone, 100%. You just discover things that probably the 500 person startup would not have discovered.
[32:01]
A
I think, I think one last thing just to leave off the whole topic and we can add in anything on the ship AI side. Actually, anything else on ship AI that you really want to cover, get us soapbox about. I don't know if we covered everything.
[32:14]
B
I think one point that we haven't talked about is that Vercel as a company just has been investing in Python quite a while. I think maybe the audience of this podcast might also be excited about this AI SDK, obviously currently is a pure play typescript system. We do find Python really interesting. What we have done over the last weeks is we have shipped like 0 config support on Vercel for all the popular Python framework like Flask and fastapi. Zero config means that you kind of get the Vercel experience where you throw your stuff over the fence and we're going to run it for you, no questions asked. And then just as another thing, like we have for example also shipped a Python SDK for our API just to kind of again show that we are engaging with that ecosystem. It's very obviously something that is in a way new for Vercel, but we have been making hires and infrastructure investments to make Python really well supported on Vercel. It is also on a Fluid Compute program which gets you, for example, active CPU pricing. So you get to run Python in production and you only pay when you have Compute and otherwise it's free, which is very nice if Your backend takes 30 seconds to respond because it's an AI model.
[33:26]
A
I was also going to make the observation that workflows are very nicely meshed. It's almost like there's this fate that you're driving towards workflows that you needed Fluid Compute in order to do all these fancy things or at least make it easy to ship this kind of stuff.
[33:41]
B
Right? I think there's some overlap, right. I think the fluid with workflows, literally nothing's running between steps, right? So it's literally free. That's also why it's free, because there's literally just nothing happening versus Fluid Compute, which does have the same property, except it's more agile. Right? Like usually in a workflow, I don't know, you're doing the FFMPEG thing, then you're doing the AI model, right? Like you're not talking milliseconds and latency. There's some Overhead on each step versus flow compute being like the VM being literally on, which does operate on a different kind of level.
[34:11]
A
Yeah, excellent. And then on the Python thing, what happened to always bet on JavaScript? You know, like, isn't that the Brendan Icke line? Basically, I think, I guess the broader, the non cheeky question is, are we like 5050 python JavaScript now? Is that the future? There's no particular language that will win or do we not have an opinion?
[34:28]
B
I don't have an opinion. I saw today that that like TypeScript is now, as of today is the biggest language on GitHub. Right. Which last year you had to still cope that TypeScript and JavaScript is like drastically bigger than Python. Honestly, obviously I don't really care. I think both communities are very relevant, very large and we are investing in supporting them. Like the way Vercel's infrastructure works. Because we essentially just run VMs, it is not actually hard for us to support Python. We will eventually do PHP and Ruby, but we also I think care a lot about the details. And so essentially the reason why we haven't done it yet is because we do invest substantial amount of time to make the DX actually good and feel native to the ecosystem. And so that's why it's like technically easy but like in practice actually very, very difficult for us to support these things. So we're taking it careful but like we are, you know, there's no technical restriction on our system where we couldn't support all of these different ways of running code.
[35:26]
A
Yeah, totally. I mean that's a very fair response. And the fact is that I think when you're a serious AI cloud, you have to support Python. So there's no way around it. Okay, I wanted to zoom out. One other thing that we care about in AI engineering is AI leadership, which is leadership of AI engineers. And obviously your role as CTO has changed a lot since you joined. What are some, I guess, leadership principles that you've had to, I guess create from first principles? Obviously you're in a very unusual type CTO role role where you're in an infra company, you have frameworks, you have apps. What comes to mind when I say how has the CTO role changed for you?
[36:05]
B
Yeah, so I joined Vercel like a little bit less than three, four years ago and that was definitely, that was before ChatGPT. I mean I came directly from Google and I think I had some insights there of what was happening. But Vercel certainly wasn't living in this world that was kind of the tail end of the last really big crypto wave and, you know, and everything else. Right. And so then suddenly the AI revolution happened. And the thing that I definitely had to work through is like, how do I transform the company into something quite different? And I think the solution that we have come to, like, feels really good. And I think there's a lesson there to be learned for companies that haven't done this transition yet is that you have to do something that feels native to your company. And so the two big bets that we made early on on, one being V0 and one being AI SDK, I think they felt like native to Vercel in the sense that V0 was especially originally designed as a tool for making web pages. Like the full stack stuff came later with the Sonnet models. And originally it was very much a web development tool and AI SDK was a framework for building AI apps. And because we're a framework company, it felt really native. I think you have to make that. You have to be honest with yourself. What product, even if I do have to like change to building something else, does kind of extend naturally from what I'm doing, rather than being very, kind of like being something that's entirely different that no one believes me, that I would be the right place to buy that from.
[37:34]
A
I think that's just generally timeless advice. Have manager to IC ratios gone up? Do you know what I mean?
[37:41]
B
That's a really good question. I don't track that very closely. Think that it's a thesis of. I. Yeah, I would be supportive of it going up. It's a big question. Like the role of the engineering manager. I have two reports.
[37:58]
A
Nice. Nice.
[37:59]
B
Right. So that's the way we're organized.
[38:02]
A
It's a privilege of the CTO that you don't. You know, VPE is usually the people. People manager.
[38:07]
B
Yeah, like, but. But that's actually not. Not the point. The point is that it's like the basic. The really. The one thing that Google really got right is that it has very strong ICs. It has IC levels all the way up.
[38:19]
A
Right.
[38:20]
B
There's level 11. And so we are doing the same thing. And so we had to have someone as an IC at the top level. Right. And so that's why I think this is, I think another thing that you have to be willing to as a company to live in this world where you don't make your strongest engineers have the choice of not making more money or becoming a potentially very bad manager. Right. Because there's little correlation from being the best engineer to being the right manager. Right. You could be. Sometimes you have these moments of oh my God, glad we made this change. But it's essentially crapshoot whether that happens.
[39:00]
A
Yeah, that's the other thing that I find. Obviously Peter principle applies. It's just the principle of promoting people to the level of incompetence. I think the other interesting thing that people are finding is that PMs and designers are now starting to contribute more to code because they feel they can just vibe code something and maybe that's good, maybe that's bad. I don't know if there's standards around this that have been established inside of Vercel where there used to be clear code owners. And now because we feel that we are coding agents, we feel we can do a lot more. And maybe that's causes some politics somewhere.
[39:33]
B
100%. Like we want to be at the avant garde of doing this. Right. Like as makers of V0, we highly encourage all our employees to like contribute code. I think one thing we haven't talked about and we might have time to go into, but like, we are very deeply working on a way to build apps that follows the threat model that assume that the developer doesn't know what they're doing and also they're using an AI that also doesn't know what they're doing. And so I want to be able to build an app that is secure even if the developer is incompetent. Right. But today that's not the case. Right. Today I assume my developer with it, full trust. Right. Like, and, and so, and so we have, I think, very strong progress there. And we're working with large, lots of large companies that have data, for example, where you have the idea that you say, well, Auth, Auth cannot be part of the app because they're, they're not going to get that. Right. Right. So Auth has to be extracted from the app, in fact, which data you can see, they also cannot be under control of the app because again, you're going to get it wrong. Right. So we, we're building these systems that, that try to have a minimum AM independent of the quality of the app. And I think that is part of the future of how people will build stuff. Okay, probably a good idea.
[40:49]
A
Anyway, this is Vercel level, not so much v0 or next JS level. Like an infra.
[40:55]
B
Yeah, in a way, in an integrated fashion. Right. Because I'm going to build the app in V0 and now I want to deploy it to my fellow coworkers, only the right people should be able to access it. And when they access it, they only should see the right data.
[41:05]
A
Yeah, I think this is very exciting. I think like the closest people have come to this is work OS basically. And yeah, I think there should be more agent native, if we can call it that infrastructure that lets people build and vibe code safely which we all want to enable them. They just cannot be trusted. This is very insightful actually. I'm very excited to see that. Yeah, ping me when it comes out. But otherwise thank you for spending the time to recap ship recap all Vercel's stuff. I cannot imagine a better person to talk to. You're so generous and friendly and engaged. I definitely don't feel like all CTOs are as engaged with regular developers as you are.
[41:44]
B
I'm trying my best and I'm spending too much time on X. But yeah, this is super fun. Thank you so much for having me.