
Loading summary
A
All right, we are here for a very Special edition of LaneSpace with my buddy Jared Palmer. SVP at GitHub and VP at Core AI at Microsoft.
B
Correct dual title.
A
Yeah, Twice the fun. Is it weird to have two jobs? No, I'm only on.
B
I'm only full disclaimer. I'm only on day 13, I think. Yeah. So early days. So. So far so good.
A
So far so good. We've been trying to get you on the podcast for two years, I think.
B
I think so, yeah.
A
You. Yeah, you're a busy guy.
B
We let her do it in person.
A
So we have to do in person. Yeah, exactly. It's way better. I should also plug that. You have if Jared Palmer fans should dig into your previous podcasts with Ken Miller. How about that? From the interim. Okay, so shout out to Ken. Shout out to Ken. Before that you were building, I guess like V0 and AI SDK and you were just sort of VP of AI at Vercel.
B
Yes. All AI initiatives and vibes. Yeah.
A
And I feel like basically you went from like you sort of building one coding agent to now being the home for all coding agents. Is that like the general vibe of Agent hq?
B
I think that's right, yeah. So backing up, I spent the last sort of two years or so building V0 at Vercel and AISDK. And then the summer took time off and now joined GitHub and today we launched Agent HQ, among other things, here at Universe. And yeah, it's going to be the home, we hope, of not only agents, but also developers. And it seems like the gravity well of this new collaboration space that we're trying to build.
A
Yeah. What do you think? Like, basically that GitHub can do that you couldn't do at v0?
B
GitHub is an enormous platform. Right. So these are 183 million.
A
It's some amount for.
B
Yeah, it's under 8 million developers. Um, it's just the scale is immense. Right. And V0 was focused on not only one language, but one framework. Right. And a specific problem space with a built in renderer. And you know, you know, for those who are not aware of V0, it's like a bolt or lovable, but it's built by Vercel and it's focused on building Next JS apps, specifically Next JS apps. That constraint was rather liberating for the team at the time and lets us really like laser focus. You're well done it, I hope. Thank you. I hope so. And obviously at GitHub, you know, we're the Home of all languages and, you know, and frameworks and developers and so the scope is broadened and yeah, it's just a different part of the map, if you will. Right, yeah.
A
How I've seen. So you've been basically covering the entire journey of coding agents, you know, from the start. Like, what do you think? What's your personal journey through coding agents?
B
Right.
A
Like we started out with copilot. Obviously GitHub started the co pilot trend.
B
Sure, sure.
A
When tell us about the origin story of V0 and then how that developed and maybe where, where like what you want to see next with history.
B
It's funny you asked that. As I've told this story multiple times, I feel like I've unlocked different parts of it in my brain by going back, you know, like, so maybe we'll have to figure out how retrieval never works.
A
By the way, interesting how memory works for agents.
B
Totally. That's why I brought it up. Like kind of thing as you sometimes you discover new paths.
A
Right, yeah.
B
Anyway, the story goes like this. So when ChatGPT first came out, obviously it was incredible, right? Like world changing immediately, faster growing product ever. I looked back at like the timeline and dates and we were very early, like in when I was at Purcell jumping into AI stuff. But the journey kind of went like this. So at the time, actually there was no AI division, there was no AI group. I. I was actually the director of engineering for all of Vercel frameworks and I was helping next JS Svelte. Svelte, right? Yeah. So next JS Svelte TurboRepo, TurboPack, Webpack, and all internal dev tools at Vercel. And I was helping the Next JS team dog Food and test the initial implementation of server actions. And instead of building a to do app, Guillermo, the CEO of Vercel, was like, why don't you build like a playground? And I was like, okay, cool. So that led to the AI playground, which is now just part of AI SDK. We'll get there in a second.
A
Which by the way, iconic for like side by side, but also the.
B
Right. So NAT C. So I'll dev. NAT Dev. So G told me that nat day of I got a dm. It was like, I remember because I was at a bachelor party and Guillermo, totally online, sends me a note, like, nat.dev is launching on Monday, you have to ship. And I had been working on it previously and so it was like, same idea. Yeah. So he got wind of it, I guess. So I. I definitely had to like jump into motion and I didn't think we didn't even ship. Chat first sends me this, this DM over the weekend, I'm at a master party and he's like, nat.dev this sidebuy. He sends me the link and I play with it. I'm like, okay. So I spring into gear. Ship the AI Playground. What was cool about the AI Playground was it forced me to go through every single model provider's API docs and figure out their, their quirks of their nuanced streaming. Because at the time, it wasn't like everybody used OpenAI. It was like, oh, little quirks. Some of them kind of were compatible. So that was my first Foraya and then launched AI Playground. That shot to the top of hacker news. And I remember we didn't. I didn't even implement Chat because that chat wasn't actually like a. Like wasn't as important. It was just like confusion completions. So eventually we back to chat and that, that project. Out of that came aisdk. Because I had already looked at all the model providers and all the combinations. I was like, okay, here's that chunk of streaming code you need. And then AISTK found that sort of niche of like, how do we focus on the part that we're going to be good at, which is like that UI aspect of it, but then also not get in your way. So that we shipped AI Playground, then AI SDK launched, and then, you know, we're always about demos and having great starter templates at Vercel. And I remember writing Guillermo, I was like, you know what? Be cool. This guy, Shad cn, oh man, he seems like, amazing. And his UI library is doing great. Why don't we team up and ship a ChatGPT clone open source. And we did. We shipped this, this, this awesome template, which is now called Chat SDK, but it's great. And what that did, though, at Vercel was it set us up for, like, rapid experimentation because we had this really good, like, Pretty Full featured ChatGPT Ready to Rock with all the latest features. So when it came to like, rapid prototyping that summer, now we're summer 23, it was so great. It was like liberating. So I remember at that point I had gained some momentum internally and pivoted almost entirely to AI. And I had shuding if you're friends with. And Max Leiter and Shad CN now were cooking. And I think at that point, code execution had just come out. I think that's my timeline.
A
Yeah, the cool sandbox interpreter, Interco interpreter.
B
That's what they call it at the time. And I had a very. As soon as I saw this, I had a very ambitious idea and proposal to present to Guillermo, which was like, what if there was some. And mind you, tool calls don't exist. The context window is 4,000 tokens. So like, there's not much here. What if we had this thing where like you could prompt and sometimes we would do code interpretation and then maybe we could sort of. Sometimes we would do code interpretation, but then other times it would choose to render like UI or then it would render sometimes like a document or in.
A
Line in the chat.
B
Yeah. And like it would just have different.
A
Sort of generative ui.
B
Yeah. And maybe you could pipe them together so like the output of one could pipe into. So if we did code interpretation, we course it's always emit like tabular data. Maybe we could pass that to another prompt that would like a UI and just like some idea there. It's kind of crazy, but if it sounds like these are just tool calls, that's exactly what these are, right?
A
It was, yeah.
B
So it became pretty obvious that like the. So I came to Jeremiah and we, at the time, we just had a secure. We had a sort of security debate, like, should we. Code interpretation with the ability to fetch data, like giving it Internet access was like kind of. Now they're like, fine, whatever. Whatever you. Whatever you want. Wow. Wow, Wes. But at the time it was like a little scary. So we kind of said, okay, no to the code interpretation, but this UI thing, it's pretty neat. And so that this like prompt to UI that was like the aha moment of V0, but the models were not very good. Right. Or relative to where they are now.
A
It was four GPT4 era.
B
This is just into the GPT4 era. And now we're probably at a 16,000 token context window. So you can't really do chat. So we had to kind of invent this kind of new paradigm of like fake it with completion. But that forced us to do sort of the click. The. The initial V0 which launched, I think in September 23rd, it would look more mid journey. In fact, if you go back to the original tweet, it was like mid journey for react because it was all very visual and you could click on different components and elements and reprompt. But it was again, we were kind of hacking this because we didn't have chat and we didn't have tool calls. And then fast forward again, you know, that launches and then probably like nine months later it took us like nine months to get to like a million ARR this little team. But then the models progressed and from you know, GPT4 GPT4 32K the big boy. We never really got GPT4 turbo working. I don't know why should never happened. And then switched to other frontier models and, and then started doing our own models and stuff like that. But Fast forward another 10 months or nine months or so and then we rebased towards chat and now the models finally could do chat and the artifact pattern had evolved so it was time to rewrite. When we launched V0, the chat version or the new V0 or whatever you want to call it, it's like 14 days, another million MRR. 14 days another million MRR. It was like a rocket ship after that. And that just proceeded and we just kept cooking and so that's been the journey. We just kept perfecting and what was really liberating for us was actually the focus on just one stack or one framework. When everybody else was trying to do general purpose coding agent we were like no, we're just going to focus on Next JS front end and Chad cn And that was really. That allowed the team to focus. So that's the story arc.
A
I mean to be fair, like Bolt Love Ball, like because Next JS is so dominant basically everyone has to be good at Next js.
B
Right?
A
But being focused on like even right down to the UI library and component stuff like that, that actually helps a lot.
B
We also started working with all the frontier model labs to help because it was in our Vercel's best interest to have them be created Next js. Yeah, and also because of the post trained models and you can read about this on the Vercel blog. Like we can the post training like harness that we created, we already started sharing and stuff with other model labs and stuff like that and we had all our data in a very hygienic state work with them.
A
So did you ever debate internally and because you know I've I my from my seat at cognition I can also see this where you should pick the best qualities of every model and string them together in the V0 or you have the model selector and you let customers choose.
B
We went back and forth and I think yeah, you know, we launched, we went back and forth on all this. I think at the end of the day there's pros and cons. One of the benefits of having your own branded model or synthetic or composite is that you can stitch these things together. Yeah, there's like at higher levels of. And now it's a little different with. With this agentic flow. But even look at what you like what you guys launched recently with rep. Right. So search is going to be a different model than what generation? You know, the Genesis. But like. Because it's sort of. But like search and Genesis are two different like entire subsystems. Right. So you can have search evals that are gonna be totally different. And so how do you. So where we ended up now we. Where it probably ended up now is like for a long time we didn't have Model selector and then we had our own models which were composites which we talked about and then like. And that would allow us to you know, mix match and I think that's probably if what. It's also nice because you get as a. This is like the product hat you get to brand it.
A
Yes.
B
Right. And you can decouple it from the launch of the Frontier Lab.
A
Yes.
B
How do you guys. How does Cognition even bill for it? Like apus, some synthetic unit. Right.
A
Yeah.
B
So it gets a little wonky.
A
Yeah.
B
We can go on for pricing this. This stuff. It gets challenging. But the nice thing about having the like brand name model is that like you get to co launch with the provider and they'll hype you up. So. But your billing needs to then is capped at whatever retail is. Right. Or some.
A
Right, right, right. Or you can't really charge too much of a premium. You can but.
B
Right. But then people are API key. Right. And it's like well then how do we charge you for sweet rep or something right afterwards.
A
And so like I think it's some part of it is the cynical like you want to create a sustainable business and independence from the model labs. But. But the other part is genuinely you actually do get better performance. You like string together all these things.
B
Yes. And so it's tough. I think. Well, we've switching gears to GitHub. Like we are all about model choice now and making sure that. And what's cool is that we also have Copilot which is our harness and Copilot cli. But we also have third party partners like Cloud Code and Codex and Cognition now like in Agent hq. So you kind of get the best of both worlds and like I think that's going to be awesome. And ultimately what people want.
A
Yeah. I think also the model layer is not the right abstraction to do the switcher anymore. Which is weird because that's where you started with the isdk.
B
Yeah, exactly.
A
But now it's like the model and the agent have to be strictly tied together, like very, very strongly balanced. You can't loosely bound it and just do a generic interface because then you're just going to have the lowest common denominator of all the models.
B
If you're in agent world, which may just be better than chat world in.
A
General, Agent world is a much better.
B
I'm calling it Agent World. But I mean by like a loop with maybe compute runtime and like files.
A
So that's that your definition of agent. You're dropping your official definition here?
B
No, don't put me on stuff. But maybe. No. So my, my initial definition of agent for, for AI because like I actually, I was dying on this hill because aisdk everyone else was an agent framework and I think maybe they, they actually went to like this. I don't know what it says on the front page now, but like, who knows? But like six, an agent is, you know, an agent is orchestrating, you know, an API request with and for loop. Okay. But a coding agent now has meant so much more. There's like these, you know, coding agent SDKs and you've got sandboxing and file systems and tool calls. And. And I do think that is a uniquely. I'll call that Agent World and chargeback coding agents here. And yeah, I think that that seems to be where things are going. And even I believe the, the Claude Excel agent is basically. I was talking to. To Mike Krieger backstage. Like, I think it's like related to Claude code.
A
It could be. I actually don't know how it should implement it under.
B
Oh, it is.
A
Yeah.
B
It wouldn't surprise me if it was.
A
Yeah. Yeah. They seem very all in on Skills, which is kind of like an interesting.
B
What do you think of Skills?
A
It's kind of dxt, which is like the sort of bundled version of mcps. Okay. Yeah. The reason you don't know about it is because it wasn't very popular.
B
Okay.
A
So Skills is kind of like the second shot that is very LLM pilled. It's like just read my markdown and just read this directory of files and go nuts. As long as I can understand that you have the capability to run code, to read files, you're good. And actually that is the universal interface, which is a file system.
B
Right back to Agent files.
A
Yeah. Which is kind of cool. Yeah. So I mean, I think what you're hitting at is this philosophy of our understanding of what coding agents. The minimum bar is over the last two years. Yeah. Right. Like you've lived this journey and like now you're basically kind of like the king maker or like the. I don't know about that. You run Agent HQ and I mean I imagine you have other projects too but like Agent SQ is like the big one that we're talking about here. Like what are you seeing from the different agents? Like what do you want to this to become?
B
Such a good question. I think that Agent HQ and GitHub itself needs to co evolve and one of the things that Microsoft has done really well is by putting things that are alike closer together. And so you think about the new core AI organization got Visual Studio, Visual Studio Code, GitHub and parts of Azure all in one and obviously the GitHub team and the VS code team have been working closely together for a long time but now we're really close together. And I think for me one of the cooler things that AgentHQ can sort of offer is this seamlessness, the spluidity with your workflow. Right. So if you saw in the demo today, we saw a demonstration of you use Agent hq, you fire off a task and it creates a pr, but you can also open that PR up in VS code in one click and that's awesome. And I think the vision for GitHub as it evolves is to look at those touch points where AI can be sprinkled in Salt Bay style into the native workflow. Whether you're assigning an issue or maybe some new stuff that I think we should focus on. Maybe it could be like how do we resolve a merge conflict? Oh my God, right, Like how do we maybe pop open an action or like get in. Right. And so like I think this is.
A
My definition of AGI totally.
B
But like you get that error on an action and you're like we've all been in that, we've all been in that sort of flow where like actions kind of don't work locally versus that tool act if you are trying. I don't have that set up on my machine. I haven't done this in a while. So you're pushing up and you're, you're this like okay, what if we could just put like you know, comment or kick off a task dev, solve this for you or throw things there. I think what I'm trying to describe is this like this workflow where it's just like seamless and fluid and you can stay in a flow state across whether you're like across all devices, mobile web on GitHub.com or in your local editor. And I think that's where, you know, my focus is going to be in the next six months or so.
A
Yeah, yeah. Just a side tangent on this. So one of the things that Microsoft also owns, I don't know if It's Microsoft or GitHub, is Dev Containers. And I think, like, very important concepts for sandboxing environments. Whatever you call it, it is kind of a light version of what Docker containers are kind of.
B
Right.
A
Do you see that as a standard that we should invest in as a thing? Because it's supported in VS code? I don't think it's just that popular outside of VS code.
B
Yeah, it's used internally at GitHub too, for, like development. At GitHub?
A
Oh, yeah, yeah.
B
Which is cool. Yeah. I think they were so far ahead, almost like it was. But now there's like sandboxes. There's so many of these days. Right. So I think cloudflare just launched theirs. There's Daytona there, Vercel Modal, which I think lovable uses.
A
I have no idea.
B
Yeah, I mean, you probably have your own. I don't know. What do you. What do you guys.
A
Just some Kubernetes pods.
B
Okay. You guys are rolling it yourself.
A
Very cool.
B
I think that's maybe the runtime, but there's work and discussion about what that runtime should be, even internally at Microsoft. And we've got a couple different competing things, so we'll figure it out in the next, you know, in the next cycle here. But there's a great point, like there's a lot of cool stuff that is in a dev container. You got VS code loaded, you've got a file system, you've got a sandbox, you've got the security protocol, but also like wired into GitHub Enterprise and ready to be packaged. So there's lots of goodness there.
A
Yeah, I see the number one pain points that Cognition has, but also Codex also presumably the other guys is repo setup, which is effectively what dev containers and a Docker file does for you is like, run this thing, then that thing, set this up, do that thing. Why is it so hard? Like, why haven't we solved it?
B
I don't know. I think it's hard because you can't predict what's in the repo. Right. So it's like. And you don't know when they bundled FFmpeg, you just don't know.
A
Right. It's nice. And like, if it's just Next js you just run PNPM install.
B
Correct. Correct. You, you can like do special. Like there's obviously through constraints you can make optimizations and I think the general purpose container is just like challenging. That being said though, I think there's probably some work to do on you know, auto detection and preempting and stuff that could be done there. But it's just a bigger, it's a broader problem space. Right.
A
Yeah.
B
And yeah, so, so fun fact.
A
When I was at netlify, I actually wanted to reach out to Vercel to do like a standardized open source auto detection thing of frameworks.
B
Oh yeah.
A
And like we, we never, we never really got internal momentum on that. It was an idea. I was like, shouldn't this be open source? You know. Yeah, like auto detection is a, is a common utility that everyone needs.
B
Yes, yes, I remember that. I'm having a flashback.
A
That's like half air. Yeah. Everyone builds their.
B
Yes.
A
Then yeah, probably we shouldn't all build it.
B
Right. No, it's. And then also like what are your defaults? They're not exactly the same. Which would be better to like just having even the same like preference stack of defaults is the right. Would be great because then we can move the whole ecosystem together from like to pmp. Right.
A
Okay. So are there other movements or protocols or standards that you're interested in? Like MCP was a big winner this year. There's other like, I don't know, AC, A2A ACP. All this, all these background.
B
Interesting. I'm not as familiar with ACP.
A
The payments or one or the. Is that red one?
B
Oh no, the Z1.
A
The one. Okay.
B
And then the payment, that was a Stripe or coinbase. Stripe. Yeah. That's very cool.
A
We've had them on the pod.
B
Okay. Yeah, that's very cool. It'd be interesting to see if that takes off.
A
I mean it's stripe.
B
Yeah. But it still needs to be adopted by like the clients. Right. And yeah, and, and I think that's. That's fascinating. The MCP is huge. It seems like it is the way that a lot of the. Especially when it comes to like digital transformation or some of our enterprise customers, it's where they're going to. Able to be at or where they are able to add context. In addition to that, we also have custom agents that we announced today too. So like you can work with prompts and stuff within your, within your agent HQ and customize these agents for different tasks and Those can have MCPs and such and I think that's really powerful from like a platform perspective. Yeah, gets me excited. That's what I think Is like shipping now and in the next, but we're always on the lookout for like the next thing. And I don't know what's on your, what's, what's top of mind for you.
A
For like standards or standard Standards. Container. Is container a container? Look, I think Dev container just as a PR problem. It's a great idea.
B
Right, Right.
A
Just no one makes it interesting. I think you can do it. Basically this one.
B
Okay.
A
Added to my list. But before that probably you have a bunch of other stuff that I do want to get to. But just staying on the AI stuff, like I think we're actively exploring computer use as a thing because it kind of got going a little bit. People were very excited and then they found out it was slow and bad and inaccurate.
B
It is computationally intensive in my understanding.
A
It's getting better. Yeah. Especially with open vision models like Deep Seq, OCR and OMO ocr. Just give it a few more turns of the scaling.
B
It seems like it's like you need that edge case and primary. Just see how it's like a modality worth pursuing.
A
Yeah, I think a lot of people are on the code gen side of code agent side. A lot of people are trying to think about. All right. We had this evolution from copilot to a more energetic cloud code situation is where I think that the status is like what's next?
B
Right.
A
What's the obvious next step?
B
Making them good.
A
Making them good. Yeah. You don't like look.
B
No, no, I know it's more just like, you know, the devil's in the details. Like going from 90% going to like hill climbing, it gets steeper in my opinion. It gets steeper. And so going from 90% success to 95 to 98 to 99 to nines of success. Yeah. I mean really hard.
A
Paying Mercore a lot of money for expert programmers of open source maintainers to.
B
Like, you know, then you realize along the way maybe the users aren't that good at it.
A
But like.
B
Yeah, no, but I just think there's a lot of work to do to finish the swing and there is a big difference between 98% and 99% correct. And that's like noticeable. And this used to hit, you know, if you're working on AI product, you probably don't realize how you've probably seen this. Like most people are blind, like living in la la land about how poor quality their, their AI product lately is unless they're really measuring like the number of error free sessions, like how many errors are coming from the info infra providers like you know how many requests are dropped, how fast those fast these things are. And so that's something that we cared about it for sale quite a bit and like we'll care about.
A
Do you have like a daily review of do a dashboard?
B
I don't know.
A
Daily. Daily slow. Yeah. Okay. I thought you were going to say D too much days.
B
No. And even thinking us like every three hours roll up of. Of key metrics and stats.
A
Yeah.
B
And like one of them was like era free sessions and other things like that that was like really important because you know especially now with agents which are like multi turn. I have a tweet about this that was like in 2024 which is that like agents will really only work when we get to not only like the more intelligent models but better reliability of the infrastructure providers. Right. These aren't, these are not inference is not like a database like update uptime. Right. So there's still differences between providers, there's still differences between performance and difference uptimes. And that's why you see things like open router being very successful and different gateway products because like reliability you need to switch, they go down all the time. So long story short. Yeah we would, we would do like you know it was almost like video game style. Like we'd have like all the data coming in all the time.
A
Yeah.
B
And that allowed I used to joke is by mood ring like good day, bad day. So it was very successful for us. I think other teams should adopt that like data driven approach.
A
I think one thing that's surprising is the lack, the relative lack I still see on data analyst agents where you can sort of chat it like add a slack bot for the precise analytics that you want to generate. Because I think we're still in the BI era.
B
Yeah.
A
Isn't that weird?
B
Yeah, I totally agree. It's interesting that that space hasn't been like captured as much like I guess maybe now actually I'm interested in this like shift to a way into. In like into knowledge work tasks with coding agents.
A
Okay.
B
I wonder using coding agents for non coding tests. Correct.
A
Do you, do you do that personally?
B
Yeah, yeah.
A
What do you do?
B
Well like I was this summer I was doing like I was helping. I was trying to automate some my dad's workflows and stuff like that and just like some of his he's got some Excel spreadsheets and for like accounting like like financial accounting or managerial accounting I guess. And yeah just like point cloud code of that stuff and see what happens. And like it's. It ends up doing Python and generating some scripts and it kind of got off down on the hairs, but it was like even he saw that it was better at it than the chat client.
A
That makes ass super obvious.
B
It became kind of obvious. Yeah, it felt better.
A
I wonder if he can try Cloud for Excel and see if.
B
Yeah, yeah, you got me. Right.
A
The.
B
And then of course, you got the browser. The. I don't even want not browser. Browser space agents, but not. Not computer use, but browsers with agents. Agent browsers. Agent browsers.
A
More flexity.
B
Everybody's coming, right? That guy. And it's like, is that the better? If that's true, then maybe the general purpose injection point is there. Yeah. What do you. Which. Have you tried any of the agent browsers?
A
All of them.
B
All of which I am very.
A
I'm currently maining Atlas mostly because I just want to give ChatGPT a fair go.
B
Okay.
A
But I'm very stuck to ARC and like the vertical tabs.
B
Oh, okay.
A
I think any pro user. Like, I have multiple businesses.
B
How many tabs you.
A
I'm context switching.
B
Right.
A
Hundreds of tabs open. I made an open source tool called Chrome Dump. You can find it on my GitHub where it literally dumps all the tabs open. It summarizes them, and I can close it by deleting them on Markdown.
B
That's pretty cool. So you just go on like, you just go on like a. Like a. Like a. A bender and then you just dump it.
A
Yeah. It should be as easy to close as Markdown and Chrome is. Isn't that good at the performance side of things yet.
B
And you were working on like some browser comparisons.
A
I was, but so I tried to build it in Tori. Okay. Tori explicitly doesn't want you to build a browser and I. I care to fight it too much.
B
I see. Yeah. I think. Very cool.
A
So just to wrap things up and we're around. About time. There are other side projects, tasks and things that you've announced here. First of all, redesigned GitHub homepage, which a lot of people don't even know. GitHub has a homepage.
B
I'm legitimately one of the tweet, which is Riz's tweet, like printed out. Like, you see the tweet. Like there's a tweet from Riz and it's like no one uses. All of this stuff is totally useless. I'll pull it up. Like I quoted today because we were like, when we launch. Let me get it right. Because I got to do it. Right. Hold on.
A
Okay.
B
Was incredible how pretty much the entire GitHub homepage is useless and is 1.3 million views and 19,000 likes. And this was 5-2-25. So perhaps the team. The team made improvements and today they launched a new get up homepage.
A
Yeah.
B
Which I'm very proud of and they should be really proud of. It's got tasks at this top, it's got recent PRs. Some stuff is still there, like your recent repositories. I still. I think there's more work to do, but it's like really overhauled and they did an amazing job with it, so they nailed it. But, you know, more work to do, never done. And like, hopefully we can keep iterating with the community and everyone and keep going.
A
The last thing I want to hit you on is Stack Divs.
B
Oh yeah.
A
You asked everyone when you joined, what should I work on or something.
B
Yeah.
A
What happened? I don't know if this is your job specifically.
B
It wasn't.
A
But why do people want stacked dips so much? I think you have some history there. Yes. Anyone who's interacted with anyone at Facebook knows about Fabricator.
B
We're just about it. Yeah.
A
So can you explain why it's been so, like, what it is? Why is it so hard?
B
Okay, so this, this concept of a pull request, which we're all familiar with, you write some commits, you open a PR and then you merge the PR and you go about your day. So as you scale like larger organizations and like you look at your history and, and, and there are people who are like very, I'll say, like, have like near religious beliefs about how to get. How did you get right. Rebase versus rebase versus merch. There's. There's a crowd that wants to fast forward the repository so to preserve all the history. And then there's a crowd that wants to squash and, and, and merge into the.
A
I mean, squash.
B
Right. Anyway, Facebook and I've. I've never worked Facebook, but in my previous startup before Vercel turborepo, I did a lot of research on build systems and at Facebook they have. Not only they have their custom build tools called Buck, they also have a custom file system and they don't use git, they use Mercurial, which. And then now it's sort of custom and it's all wired together. And at Facebook they don't use pull requests. They have a different sort of philosophy. You can. It's sort of like the best way to think about this is like Imagine every PR just had one commit in it. You could branch them and re. And the critical thing is you can restack them and then if you restack or make a change later, like earlier in the stack than later in the stack. And these stacks are just diffs, right? The commits are just dis. And that's the term stack diffs. You can then collapse them and merge the last one and merge them towards it. And it just gives you a little bit of a nicer workflow. And it's what people. If you work on a monorepo or you work on a very, very large code base, it's a really, really nice way to work. Especially you've got a system that will re. That will automatically restack. And then if you think even more deeply about it like, and get really deeper into the weeds, you can decide which diffs in the stack CI should run against. If you get fancy, okay, may not.
A
Like start a commit message or something.
B
They're just always. Yeah, you could decide like maybe this one doesn't need it or skip that one or whatever. And you end up getting these like sort of these groups, these, these stacks. And it's really nice from a code review perspective because when you go to update or you can update different part of the stack, it just makes it a little bit more fluid and so it's what people want. There are a couple tools out there in the market that do this kind of behavior. One's called Graphite. There are a couple others who made tools called Graphite.
A
You see the agent of the graphite right here?
B
Yeah, yeah. And it's a dream workflow. And so it's been the top pull request or. Sorry, the top feature request. Feature request. Thank you. GitHub for years from, from the community's perspective. I don't know at GitHub but. And then. So as soon as I joined the first. The first thing I did, I was going to look this up and well, that's the first thing I asked. How should we make it a better. And it was the top feature request, right? And then I went to go like, okay, investigate like any good product person would. And there's been multiple attempts at this internally going back to like 2020.
A
Okay.
B
And there was one very, very, very polished attempt to. In 2022 and it just. I don't have all the context so. But it was, it was. There was a pretty good implementation. All of the work was done on the client and it reintroduced this new concept called stacks outside the pull request into GitHub and it was a little too risky. It was sort of deemed too risky, too big of a change. That's just what I was told. So anyway, we're, we had a couple of meetings internally already and we're trying to weave it into planning and the roadmap and so hopefully we'll be able to share more updates soon. But like, it's a top of the list known feature again.
A
Heard.
B
Yeah. And so. And like we're working on it obviously like something the size of GitHub to move to like support this kind of new. This new feature is like not just like a walk in the park because of the size of GitHub's and GitHub's Git implementation, but it's something that we're actively exploring.
A
Yeah. Well, I think just to wrap all that up, I think it's really nice for someone who's so deeply engaged and coming from one of us, literally, that you now run things at GitHub and we can just add you. You can do that. And I think Ajay Karpathy the other day was saying every company needs one of these where you can just like, hey, this really should exist at GitHub. We love GitHub, we use GitHub, but come on.
B
Well, yeah. Feature request. Welcome. My DMs are always open.
A
Oh, air. I don't know hundred any developers.
B
I, I like whatever. I, I am of the philosophy that like all feedback is a gift, like it's all a signal.
A
Yeah.
B
And the more signal we can collect, the better decisions we can make and truly build this really, really useful website and company, like together. And that's going to be the future. And if we focus just on that, we're going to be okay.
A
Yeah, we're going to be okay. All right, well, thanks so much.
B
Yes, it's a pleasure. Awesome catching up. Yep, likewise.
A
Congrats.
Episode: ⚡ Inside GitHub’s AI Revolution: Jared Palmer Reveals Agent HQ & The Future of Coding Agents
Guest: Jared Palmer (SVP at GitHub, VP Core AI at Microsoft)
Date: November 10, 2025
This episode features a candid, wide-ranging interview with Jared Palmer, just 13 days into his dual leadership roles as SVP at GitHub and VP at Core AI, Microsoft. The discussion centers around the announcement of Agent HQ—GitHub’s ambitious new home for coding agents—exploring the evolution of AI-powered development tools, code generation models, and the emerging “Agent World.” The conversation also spans software workflows, model interoperability, developer pain points, and the importance of community-driven product iteration. Jared brings an insider’s perspective shaped by his journeys at Vercel, his work on V0, and his new mandate at GitHub.
| Timestamp | Segment | |--------------|---------------------------------------------------------------| | 00:00–01:41 | Introduction; Palmer’s background/transitions | | 02:34–08:27 | The origin story of coding agents, Vercel, and V0 | | 08:27–10:21 | V0’s focus, early model limitations, and rapid growth | | 11:12–13:36 | Model choice, composites, business/product strategy | | 13:36–15:11 | Agent architecture: chat vs. agents, tool orchestration | | 16:24–18:29 | Agent HQ, seamless workflow, product vision | | 18:29–21:10 | Dev Containers, repo setup pain, standards discussion | | 21:36–24:03 | Container standards, CODING agent evolution | | 24:03–26:31 | Reliability, observability, measuring model effectiveness | | 26:31–28:00 | Data analyst agents, coding agents for non-code tasks | | 28:00–29:55 | Browser-based agents, pros/cons, developer workflows | | 29:04–30:18 | GitHub homepage redesign | | 30:20–34:43 | Stacked diffs: what/why, history at GitHub, next steps | | 34:43–35:40 | Philosophy on feedback, closing remarks, open DMs |
“At GitHub, we’re the home of all languages and frameworks and developers, so the scope is broadened… it’s just a different part of the map, if you will.”
— Jared Palmer [01:46]
“One of the benefits of having your own branded model or synthetic or composite is that you can stitch these things together… you get to brand it. And you can decouple it from the launch of the Frontier Lab.”
— Palmer [12:18]
“All feedback is a gift, like it’s all a signal. And the more signal we can collect, the better decisions we can make and truly build this really, really useful website and company, like together.”
— Palmer [35:15]
“I think this is… this workflow where it’s just like seamless and fluid and you can stay in a flow state across… all devices, mobile web on GitHub.com, or in your local editor.”
— Palmer [17:48]
“We would do… every three hours rollup of key metrics and stats… like error-free sessions… especially now with agents which are multi-turn.”
— Palmer [25:22]
Jared Palmer's energy, deep experience, and community orientation permeate this episode. The GitHub Agent HQ launch signals a new phase where agent-based coding, model composability, and seamless developer workflows are converging at an unprecedented scale. Technical, business, and cultural tradeoffs are all on the table—as are the feedback loops that drive user-driven innovation. The AI engineer’s toolbox is expanding, and listeners are invited not just to adopt, but to participate in shaping what comes next.
For full show notes and resources: latent.space