
Loading summary
Saad
Foreign.
Wix
Welcome to the Lead in Space podcast. This is Celestio, partner and CTO at Decibel and I'm joined by my co host, Wix, founder of Small AI.
Celestio
Welcome, welcome. And today in the studio we have a nice two guests from Klein, Pash and Saud.
Saad
That's right. Yes.
Wix
You nailed it. Let's go.
Celestio
I think that Klein has a decent fan base, but not everyone has heard of it. Maybe we should just get like an upfront, like what is Klein maybe from you? And then you can modify that as well.
Saad
Yeah, Klein's an open source coding agent. It's a VS code extension right now, but it's coming to JetBrains and Neovim and CLI. You give Klein a task and he just goes off and does it. He can take over your terminal, your editor, your browser, connect to all sorts of MCP services and essentially take over your entire developer workflow. And it becomes this point of contact for you to get your entire job done, essentially.
Celestio
Beautiful. Pash, what would you modify? Or what's another way to look at client that you think is also valuable?
Pash
Yeah, I think Klein is the kind of infrastructure layer for agents, for all open source agents. People building on top of this like agentic infrastructure. Klein is a fully modular system. That's the way we envision it and we're trying to make it more modularized so that you can build any agents on top of it.
Celestio
Yeah.
Pash
So with the CLI and with the SDK that we're rolling out, you're going to be able to build fully agentic systems for anything, not just coding.
Celestio
Oh, okay. That, that is a different perspective on client that I had. So, okay, let's, let's talk about coding first and then we'll talk about the broader stuff. You also are similar to adir. I don't know who comes first in that. You use the plan and act paradigm quite a bit. I'm not sure how well known this is like to me, I'm relatively up to speed on it, but again, maybe you guys want to explain why. Different models for different things.
Saad
Yeah, I'm going to take the cred for coming up with plan act first. Client was the first to come up with this concept of having two modes for the developer to engage with. So just in talking to our users and seeing how they use client where it was really only an input field, we found a lot of them starting off working with the agent, coming up with a markdown file where they ask the agent to put together some kind of architecture or plan. For the work that they want the agent to go on to do. And so we would find that people just came up with this workflow for themselves just organically. And so we thought about how we might translate that into the product. So it's a little bit more intuitive for new users who don't have to kind of pick up that pattern for themselves and can kind of direct and put in guardrails for the agent to adhere to these different modes whenever the user switches between them. So, for example, in plan mode, the agent's directed to be more exploratory, read more files, get sort of understanding, and fill up its context with any sort of relevant information to come up with a plan of attack for whatever the task is the user wants to accomplish. And then when they switch to act mode, that's when the agent gets this directive to look at the plan and start executing on it, running commands, editing files. And it just makes working with agents a little bit easier, especially with something like Client, where a lot of the times people's engagement with it is mostly in the plan mode, where there's a lot of back and forth, there's a lot of extracting context from the developer, you know, asking questions, you know, what do you want the theme to look like, what pages do you want on the website? Just trying to extract any sort of information that the user might not have put into their initial prompt. Once the user feels like, okay, I'm ready to let the agent go off and work on this, they switch to act mode, check auto approve, and just kick their feet up and, you know, get coffee or whatever and let the agent get the job done. So, yeah, most of the engagement happens in the plan mode and then act mode. They kind of just have a peripheral vision into what's going on, mostly to course correct whenever it goes in the wrong direction. But for the most part, they can just rely on the model to get it done.
Wix
And was this the first shape of the product, or did you get to the plan act iteratively and maybe was this the first idea of the company itself, or were you exploring other stuff?
Saad
It was a lot of, especially in the early days of Client, it was a lot of experimenting and talking to our users and seeing what kind of workflows came up that they found that were useful for them and translating them into the product. So plan and act was really a byproduct of just talking to people in our discord, just asking them what would be useful to them, what kind of prompt shortcuts we could add into the ui, I mean that's really all plan and act mode is. It's is essentially a shortcut for the user to save them the trouble of having to type out, you know, I want you to ask me questions and put together a plan the way that you might have to and some of the other tools you have to be explicit about. I want you to come up with a plan before acting on it or editing files. Incorporating that into the UI just saves the user the trouble of having to type that out themselves.
Wix
But you started right away as a coding product and then this was part of okay, how do we get better UX basically.
Saad
Exactly, yeah.
Wix
What was the model evaluation at the time? So I'm sure part of like the we need planning act is like maybe the models are not able to do it end to end. When you started working on that pairing, what were the model limitations? What were the best models? And then how has that evolved over time?
Saad
Yeah, when I first started working on Klein, this was I think 10 days after Cloud 3 or 5 Sonic came out. I was reading Anthropic's model card addendum and there was this section about agentic coding and how it was so much better at this step by step accomplishing tasks. And they talked about running this internal test where they let the model run in this loop where it could call tools. And it was obvious to me that okay, they have some version, they have some application internally that's really different from how the other things at the time were things like copilot and cursor and ator. They didn't do this sort of like step by step reasoning and accomplishing tasks. They were more suited for the Q and A and and one shot prompting paradigm. At the time, I think it was June 2024, anthropic was doing a build with Cloud Hackathon. So I thought, okay, this is a really cool new capability that none of the models have really been capable of doing before. And I think being able to create something from the ground up and take advantage of kind of like the nuances of how much the models improved in that point in time. So for example, Cloud3.5 was also really good at this test called Needle in a Haystack, where if it has a lot of context in its context window, for example, you know, 90% of its 200K context window is filled up. It's really good at picking out granular details in that context. Whereas before Cloud 3.5 it really pay a lot more attention to whatever was at the beginning or the end of the Context. So just taking advantage of kind of the nuances of it being better at understanding longer context and it being better at task by task, sorry, step by step, accomplishing tasks and building a product from the ground up just kind of let me create something that just felt a little bit different than anything else that was around at the time. And some of the core principles in building the first version of the product was just keep it really simple. Just let the developer feel like they can kind of use it however they want. So make it as general as possible and kind of let them come up with whatever workflows works well for them. People use it for all sorts of things outside of coding. Our product marketing guy, Nick Bauman, he uses it to connect to a Reddit MCP server, scrape content, connect it to an X MCP server, and post tweets, essentially. Even though it's a VS code extension and a coding agent, MCP kind of lets it function as this everything agent where it can connect to whatever services and things like that. And that's really a side effect of, of having very general prompts just in the product and not sort of limiting it to just coding tasks.
Pash
I was at a conference in Amsterdam and I built my whole presentation, my whole slide deck using this library. It's like a JavaScript library called slide Dev. And I just asked Klein, like, hey, like here's like my style guidelines. I wrote like a big Klein rules document explaining like how I want to style the presentation in Slide Dev. I told Klein, like, the agenda. I kind of recorded using this other app called Limitless, like transcribe my voice into text about, like my thoughts, just like stream of consciousness about what I was going to talk about for this conference, for my talk. And Klein just went in and built the whole, the whole deck for me. So, you know, Klein really can do anything in JavaScript. In JavaScript. Yeah.
Celestio
Yeah. So it's, it's kind of a coding use case.
Pash
It was kind of a coding use case, but then making a presentation out of it. But it can also like run scripts like do like data analysis for you and then put that into a deck, kind of combine things.
Saad
And being a VS code extension is kind of this, like, it gives you these interesting capabilities where you have access to the user's os, you have access to the user's terminal, and you can read and edit files. Being an extension, it reduces a lot of the onboarding friction for a lot of developers, or they don't have to install a whole new application or have to go through whatever internal Jumping through hoops to try to get something approved to use within their organizations. So the Marketplace gave us a ton of really great distribution and is sort of like the perfect conduit for something that needs access to files on your desktop or to be able to run things on your terminal, to be able to edit code and to take advantage of VS Code's really nice UI and show you like Diff Views, for example, before and after it makes changes to to files.
Celestio
Weren't you tempted to force VS Code though? I mean, you know, you could be sitting on $3 billion right now.
Saad
Well, no, I actually like pity anybody that has to fork VS Code because Microsoft makes it notoriously difficult to maintain these forks. So a lot of resources and efforts go into just maintaining keeping your fork up to date with all the updates that VS Code is making.
Celestio
I see. Is that because they have a private repo and just sync it, there's no like.
Saad
Exactly, exactly.
Celestio
And there's one of those kinds of open source projects.
Pash
Right.
Saad
And VS Code's moving so quickly where I'm sure they run into all sorts of issues, not just in, you know, things like merge conflicts, but also in the back end. They're always making improvements and changes to, for example, their VS Marketplace API. And to have to like reverse engineer that and figure out kind of how to make sure that your users don't run into issues using things like that is I'm sure like a huge headache for anybody that has to maintain VS Code fork. And it also being an extension also gives us a lot more distribution. It's not that you have to use us or somebody else. You can use Client in Cursor or in Windsurf or in VS Code. And I think Klein compliments all these things really well in that we get the opportunity to figure out and work really closely with our users to figure out what the best agentic experience is. Whereas Cursor and Windsurf and Copilot have to think about the entire developer experience, the inline code, edits, the Q and A sort of all the other bells and whistles that go into writing code. We get to just focus on what I think is the future of programming, which is this agentic paradigm. And as the models get better, people are going to find themselves using natural language, working with an agent more and more and less being in the weeds and editing code and tab autocomplete.
Pash
Yeah, just like imagine how many resources you would have to spend maintaining a fork of VS Code where we can just kind of stay focused on the core agentic Loop optimizing for different model families as they come out supporting them. You know, there's so much work that goes into all this that maintaining a fork on the side would just be such a massive distraction for us that I don't think it's really worth it.
Wix
I feel like when you talk I hear this distinction between we want to be the best thing for the future of programming and then also this is also great for non programming. Is this something that has been recent for you where you're seeing more and more people use the MCP servers especially to do less technical thing and that's an interesting area, or do you feel like programming is still the highest economic value thing to be selling today? I'm curious if you can share more.
Saad
In terms of economic value. Programming is definitely the highest cost of benefit for language models right now. I think we're seeing a lot of, you know, model labs recognize that OpenAI anthropic are taking coding a lot more seriously than I think they did a year ago. What we've seen is while yes, like the MCP ecosystem is growing and a lot of people are using it for things outside of programming, the majority use case is mostly developer work. There was an article on Hacker News a couple weeks ago about how a developer deployed a buggy Cloudflare worker and used a sentry MCP server to pull a stack trace and ask client to sort of fix the bug using the stack trace information, connect to a GitHub MCP server to close the issue and deploy the fix to cloudflare. All right, within client using natural language, never having to leave VS code and it sort of interacts with all these services that otherwise the developer would have had to have the cognitive overload of having to, you know, figure out for himself and leave his developer environment to. To essentially do what the agent could have done just all in the background, just using natural language. So I think that's kind of like where things are headed is the application layer being connected to sort of all the different services that you might have had to interact with before manually and it being this sort of single point of contact for you to interact with using natural language and you being less and less in the code and more and more a high level understanding of what the agent's doing and being able to course correct. I think that's another part of what's important to us and what's allowed us to kind of cut through the noise in this like incredibly noisy space is I think a lot of, a lot of people have really grand Ideas for, you know, where things are heading. But we've been really maniacal about what's useful to people today. And a large part of that is understanding sort of the limitations of these models, what they're not so good at, and giving enough insight into those sorts of things to the end developer so that they know how to course correct. They know how to give feedback when things don't go right. So, for example, Klein is really good about, you know, giving you a lot of insight into the prompts going into the model, into when there's an error, why the error happened, into the tools that the model's calling. We try to give as much insight into what exactly the model is doing at each step in accomplishing a task. So when things don't go wrong or it starts to go off in the wrong direction, you can give it feedback and course correct. I think the course correcting part is so incredibly important in getting work done. I think much more quickly than if you were to kind of give a background agent work. You come back a couple hours later and it's just totally wrong and it didn't do anything that you expected it to do. And you kind of have to retry a couple times before it gets it right.
Wix
I think the Century example is great because I feel in a way the mcps are like cannibalizing the products themselves. Like, I started using the Sentry MCP and then Sentry release here, which is like their issue resolution agent. And it was free at the start, so I turned it on in Sentry. I was using it. It's great. And then they started charging money for it. And I'm like, I can use the MCP for free. Put the data in my coding agent and it's going to fix the issue for free and send it back. I'm curious to see, especially in coding where you can kind of have this closed loop where, okay, are these. MCP is going to become the paid AI offering so that then you can plug it in and is client going to have kind of like a MCP subscription where, like, you're kind of fractionalizing all these costs? To me today, it feels like it doesn't make a lot of sense the way they're structured.
Pash
Well, yeah, we were like, very early on. We've been bullish on MCP from the very beginning and. And were you a launch partner?
Saad
Funny story about mcp, I think.
Celestio
Sorry to interrupt.
Pash
Yeah, no worries.
Saad
I think when Anthropic first launched MCP and they made this big announcement about this new protocol, that they've been working on and open sourcing it. Nobody really understood what it meant. And it took me some time really digging into their documentation about how it works and why this is important. I think they kind of took this bet on the open source community contributing to an ecosystem in order for it to really take off. And so I wanted to try to help with that effort as much as possible. So for a long time most of client's system prompt was how does MCP work? Because it was so new at the time that the models didn't know anything about it and how to make MCP servers. So if the developer wanted to make something like that, it'd be really good at it. And I'd like to think that client had something to do with how much the MCP ecosystem has grown since then and just getting developers more insight and sort of awareness about how it works under the hood, which I think is incredibly important in using it, let alone just developing these things. And so yeah, when we launched MCP Incline, I remember our discord users just trying to wrap their heads around it. And in seeing clients build MCP servers from the ground up, they were like, okay, they started to connect the dots. This is how it works under the hood, this is why it's useful. This is how agents connect to these tools and services and these APIs and sort of saved me a lot of the trouble of having to do this sort of stuff myself.
Pash
Those were like the early days of MCP when people were still trying to wrap their heads around it and there was like a big problem with discoverability. So back in like February, we launched the MCP Marketplace where you could actually go through and have like this one click install process where client would actually go through looking at a readme and that's like linked to a GitHub, install the whole MCP server from scratch and just get it running immediately. And that was like, I think around that time that's when MCP really started taking off with the launch of the marketplace where people were able to discover MCPs contribute to the MCP marketplace. We've listed over 150 MCP servers since then. And the top MCPs in our marketplace have over hundreds of thousands of downloads, people using them. And you know, there's like really notable examples where you mentioned like how are people? Like it's like kind of eating existing products. But at the same time we're starting to see like this ecosystem evolve where people are monetizing MCPS. Like a notable example of this is 21st Dev Magic MCP server, where it injects some taste into this coding agent, into the LLM, where they have this library of beautiful components and, and they just inject relevant examples so that Klein can go in and implement beautiful UIs. And the way they monetize that was like a standard API key. So we're starting to see developers really, like, take mcps, build them in, have distribution platforms like the MCP Marketplace incline and monetize their whole business around that. So now it's like almost like you're selling tools to agents, which is a really interesting topic.
Wix
And you can do that because you're in VS code, so you have the terminal so you can do npx, run the different servers. Have you thought about doing remote MCP hosting? Or do you feel like that's not something you should take over?
Pash
Yeah, we haven't really hosted any ourselves. We think that's, we're looking into it. I think it's all very nascent right now, the remote mcps. But we're definitely interested in supporting remote mcps and listing them on our marketplace.
Saad
And another part, I think with sort of local MCP servers and remote MCPs is most of the remote MCPs are only useful to connect to different APIs. But that's only a, you know, that's only a small use case for mcps. A lot of mcps help you connect to different applications on your computer. For example, there's like a Unity MCP server that helps you create, you know, 3D objects right from within VS code. There's an Ableton MCP server so you can like make songs using something like Client or whatever else uses mcps. We won't see a world where these MCP servers are only hosted remotely. There will always be some mix of local MCP servers and remote MCP servers. I think the remote MCP servers do make the installation process a little bit easier with, you know, with something like an OAuth flow and just authenticating a little bit. Not as painful as having to manage API keys yourself, but for the most part, I think the MCP ecosystem is really in its earlier days. We're still trying to figure out this good balance of security, but also convenience for the end developer so that it's not a pain to have to set these things up. And I think we're still in this very much experimental phase about how useful it is to people. And I think now that it is seeing this level of market fit and people are coming out with these sorts of articles and workflows about how it's totally changing their jobs. I think there's going to be a lot more of resources and efforts that go into the ecosystem and just building out the protocol, which I think there's a lot on Anthropic's roadmap and I. I think the community in general just has a lot of ideas and our marketplace in particular has, has given us insights into some ways that we could improve it. Things that, you know, developers have asked for from it. That where we're kind of thinking about how do we, you know, what is the ISP marketplace of the future look like? And for us it's going to be a combination of. Well, a lot of our users are very security conscious and there's a lot of ways that MCP servers can be pretty dangerous to use if you don't trust the end developer of these things. And so we're trying to figure out what does a future look like where you have some level of confidence in the MCP servers you're installing. I think right now it's just, it's too early and there's a lot of trust in the community that I don't think a lot of enterprise developers or organizations are quite willing to do yet. So that's something that's top of mind for us.
Celestio
There's an interesting tension between the Anthropic and the community here. You basically kind of have a model MCP registry internally, right? Honestly, I think you should expose it. I was looking for it on your website and you don't have it. Like the only way to access it is to install client. But there's others like Smithery and all the other guys. Right. But then Anthropic has also said they'll launch a model registry at some point.
Saad
Or MCP registry at some point, Some point.
Celestio
If Anthropic launched the official one, would they just win by default?
Wix
Right.
Celestio
Because would you just use them?
Saad
I think so. I think the entire ecosystem will just converge around whatever they do. They just have such good distribution and.
Celestio
They came up with it.
Saad
Yeah, exactly.
Celestio
Cool. And I noticed that you had some really downloaded mcps. I was going by most installs. I'm just going to read it off. You can stop me anytime to comment on them. So top is file system MCP makes sense. Browser tools from Agent Desk AI. Don't know what that is. Sequential thinking. That one came out with the original MCP release. Context 7. I don't know that one.
Pash
That's a big one.
Celestio
What is that?
Pash
Context 7 kind of helps you pull in documentation from anywhere and it has this big index of all of the popular libraries and documentation for them. Okay. And you can. Your agent can kind of submit like a natural language query and search for any document.
Celestio
Everyone's docs.
Pash
Yes.
Celestio
And apparently Upstash did that, which is also unusual because Upstash is just normally redis. Get tools. That one came out originally. Fetch. Browser use. Browser use, I imagine, competes with browser tools. Right. I guess. And then below that, play competition. Playwright. Right. So there's a lot of like, let's automate the browser and let's do stuff. I assume for debugging Fire Crawl Puppeteer. Uh, Figma. Here's one for you. Perplexity Research. Is that yours?
Pash
Um, well, yeah, I forked that one and listed it. But yeah, that's, you know, that's another very popular one where you can research anything.
Celestio
People want to emulate the automate the browser. I'm just trying to learn lessons from what people are doing. Right. They want to automate the browser, they want to access git and file system, they want to access docs and search. Anything else that you think like is notable.
Pash
There's all kinds of stuff where it's like, there's the Slack MCP where you can send. That's actually one workflow that I have set up where you can automate repetitive tasks in client. So I tell a client, okay, pull down this pr, use the GH command line tool, which I already have installed using the terminal to pull the pr, get the description of the pr, the discussion on it, and get the full diff as a single command, non interactive command. Pull in all that context, read the files around the diff, review it, ask a question like, hey, do you want me to approve this or not with this comment? And if I say yes, approve it and then send a message in Slack to my team using the Slack MCP for example.
Celestio
Oh, use it to write.
Pash
Yes.
Celestio
I would only use it to read.
Pash
Yeah, no, it's, you know, people like, I love it. You know, I love being able to just like send an automated message in Slack or whatever. You can also like set it up, like set up your workflow however you want. Where it's like, okay, Klein, please ask me before doing anything. You know, just make sure you're asking me to like approve before you send a message or something like that.
Celestio
Yeah. Okay. Just. Just to close out MCP side. Anything else interesting going on in MCP Universe that we should talk about? MCP Auth was recently ratified.
Pash
I think monetization is a big question right now for The MCP ecosystem, we've been talking a lot with Stripe. They're very bullish on MCP and they're trying to figure out like a monetization layer for it, but it's all so early that it's kind of hard to really even envision where it's going to go.
Celestio
Let me just put up a strawman and then you can tell me what's wrong with it. Like, how is this different from API monetization? Right. Like you sign up here, make an account, I give you a token back and then you use token, I charge you against your usage.
Pash
No, like, like, I think that's how it is right now. That's how like the, the magic MCP, the 21st dev guys did it. But we're kind of envisioning a world where agents can pay themselves for these MCP tools that they're using and pay for each tool call. And you can't deal with like a million different API keys from different products and like signing up for all this. There needs to be like a unified kind of payment layer. Some people talk about like stablecoins, how like those are coming out now that agents can natively use those. Stripe is. They're considering this like abstraction around the MCP protocol for payments, but like I said, it's kind of hard to really tell where it's going to. How that's going to manifest.
Celestio
I would say, like I covered when they launched their agent toolkit last year. A few months ago it seemed like that was enough. Like it. You didn't seem to need stablecoins except for the fact that they take like 30 cents every transaction.
Pash
Yeah.
Wix
Have you seen people use the X 402 thing by Coinbase to make. It's basically like the. You can do a HTTP request that includes payment in it.
Celestio
What?
Pash
Yeah, yeah, it's a. It's been around forever. The 402 error, that's like payment not accepted or something. Right. So yeah, we've seen some people talking about that, like more like natively building that in. But yeah, no one's really doing that right now.
Celestio
Anything you're seeing on like are people like making MCP startups that are interesting.
Wix
Mostly around rehosting local ones and do remote and then basically do. Instead of setting up 10 MCPs, you have like a canonical URL that you put in all of your tools and then expose all the tools from all the servers. Yeah, there's like MCP run some of these tools. But I think it kind of has the same issues of how do you incentivize people to make better MCPs?
Celestio
You know, will it be mostly first party or will it be third party? Like your perplexity MCP was the four tone. What was wrong with the perplexity one?
Pash
With MCPs and installing them locally on your device, there's always a massive risk associated with that. And when an MCP is created by someone that we have no idea who they are, at any point, they might update the GitHub to introduce some kind of malicious stuff. So even if you verified it when you were listing it, you might change it. So I ended up having to fork a few of those to make sure that we lock that version down.
Celestio
Okay, so this is just like you're just forking it so that you don't change. It's interesting. These are all the problems of a registry, Right. That you need to ensure security and all that. Cool. I'm happy to move on. I would say, like, the last thing that's kind of curious is like, if Anthropic hadn't come along and made mcp, what would have happened? What's the alternative history? Would you have come with mcp?
Saad
So we saw some of our competitors who have kind of working on their own version of plug and play tools into these agents. They kind of had to natively create these tools and integrations themselves directly into their product. And so I think anybody in the space would have had to just do the laborious work of having to recreate these tools and integrations for. So I think Anthropic just saved us all a lot of trouble and tapped into the power of open source and community driven development and allowed individual contributors to make an MCP for anything people could think of and really take advantage of people's imagination in a way that I think is necessary right now for us to really tap into full potential of this sort of thing.
Wix
We've had, I think, a dozen episodes with different coding products.
Celestio
And this, by the way, this episode came directly after he tweeted about Claude the Clock episode, where they're sitting right where you're sitting.
Wix
Thanks for sharing that.
Pash
On a rag. Yeah.
Wix
Can you give people maybe the matrix of the market of you have fully agentic. No ide. You have agentic plus ide, which is kind of yours. You have ide with some copiloting. How should people think about the different tools and what you guys are best at, or maybe what you don't think you're best at?
Saad
I think what we're best at and like our ethos since the Beginning is just meet the developers where they're at today. I think there is a little bit of insight and handholding these models need right now. And the IDE is sort of the perfect conduit for something like that. You can see the edits it's making, you can see the commands that it's running, you can see the tools that it's calling. It gives you the perfect UX for you to have the level of insight and control and be able to course correct the way that you need to to work with limitations of these models today. But I think it's pretty obvious that as the models get better, you'll be doing less and less than that, less and less of that, and more and more of the initial planning and prompting and sort of have the trust and confidence that, you know, the model will be able to get the job done pretty much exactly how you want it to. I think there will always be a little bit of a gap in that these models will never be able to read our minds. So there will have to be a little bit of, you know, making sure that you give it the most comprehensive and sort of like all the details of what you want from it. So if you're a lazy prompter, you can expect a ton of friction and back and forth before you really get what you want. But I think we're all learning for ourselves as we work with these things. Kind of the right way to prompt these things and to be explicit about what it is that we want and kind of how they hallucinate the gaps that they might need to fill to get to the end result and how we might want to avoid something like that. So what's interesting about cloud code is there isn't really a lot of insight into what the agent's doing. Kind of gives you this, like, checklist of what it's doing holistically at a high level. I don't think that really would have worked well if the models weren't good enough to actually produce work that people were generally happy with. We're kind of there and I think the space has to catch up to. Okay, maybe people don't need as much insight into these sorts of things anymore. And they are okay with letting an agent kind of get the job done. And really all you need to see is sort of the end result and tweak it a little bit before it's really perfect. And I think there is going to be different tools for different jobs. I think something like totally autonomous agent that you don't have a lot of Insight into is great for maybe scaffolding new projects, but for kind of the serious, more complex sorts of things where you know, you do need a certain level of insight or you do need to kind of have like more engagement, you might want to use something that does give you some more insight. So I think these sorts of tools complement each other. So for example, writing tests or spinning off 10 agents to try to fix the same bug, you know, might be useful for a tool that doesn't require too much engagement from you. Whereas something that requires a little bit more creativity or imagination or extracting context from your brain requires a little bit more of insight into what the model is doing. And a back and forth that I think Klein is a little better suitable.
Pash
Like visibility into what the agent is doing. That's like one axis and then another is autonomy, like how, how automated it is. And we have a category of companies that are focusing more on the use case of people that don't even want to look at code, which is like, you know, the lovables, the replits, where it's like you go in, you build an app, you might not even be technical and you're just happy with the result. And then you have kind of stuff that's kind of like a hybrid where it's, you know, for engineers, it's built for engineers, but you don't really have a lot of visibility into what's going on under the hood. This is like for like the vibe coders where they're, you know, fully, you know, letting, letting the AI take the wheel and building stuff very rapidly. Lots of open source fans and you know, people that our hobbyists enjoy coding in this, in this manner and it is really fun. And then you get to like serious engineering teams where they can't really give everything over to the AI, at least not yet. And they need to have high visibility into what's going on every step of the way and make sure that they actually understand what's happening with their code. You're kind of handing off your production code base to this non deterministic system and then hoping that you catch it in review if anything goes wrong. Whereas personally the way I use AI, the way I use Klein is I like to be there every step of the way and kind of guide it in the right direction. So I know every step of the way, like as every file is being edited, I prove every single thing and make sure that things are going in the right direction. And I have a good understanding as things are being developed, where it's going so, like, this kind of hybrid workflow really works for me personally. But sometimes if I want to go full YOLO mode, I go ahead and just auto approve everything and just step out for a cup of coffee and then come back and review the work.
Wix
My issue with this, as an engineer myself, is that we all want to believe that we work on the complex things. How have you guys seen the line of complex change over time? I mean, if we sat down having this discussion 12 months ago, Complex was much easier than today for the models. Do you feel like that's evolving quickly enough, that in 18 months it's like you should probably just do, follow gentec for like 75% of work, 80% of work, or do you feel like it's not moving as quickly as you thought?
Saad
I think what was complex a couple years ago is totally different to what is complex today now. I think what we need to be more intentional about are the architectural decisions we make really early on and how the model kind of builds. On top of that, if you have kind of a clear direction of where things are headed and what you want, you kind of have a good idea to about how you might want to lay the foundation for the code base that you're producing. And I think what we might have considered complex a few years ago, algorithmic challenges, that's pretty trivial for models today and stuff that we don't really necessarily have to think too much about anymore. We kind of give it a certain expectation or unit test about what we want, and it kind of goes off and puts together the perfect solution. So I think there's a lot more thought that has to go into tasteful architectural decisions. That really comes down to you having experience with what works and what doesn't work, having a clear idea for the direction of where you want to take the project and your vision for the code base. Those are all decisions that I think is hard to rely on a model for because of its limited context and, and its inability to kind of see your vision for things and really have a good understanding of what you're trying to accomplish without you putting together a massive prompt of everything that you want from it. I think what we spent most of our time working on a couple years ago has totally changed and I think for the better. I think architectural decisions are a lot more fun to think about than putting together algorithms.
Pash
It kind of frees up the senior software engineers to think more architecturally, and then once they have a really good understanding of what the current state of the repository is, what the current state of the architecture is. And when they're introducing something new, they're really thinking at an architectural level and they articulate that decline. And that's also. There's like some skill involved there and some of that can be mitigated with like asking follow up questions, being proactive about clarifying things on the agent side. But ultimately you need to articulate this new architecture to the agent and then the agent can go down and down into the mines and implement everything for you. And it is more fun working that way. Like, personally, like, I find it a lot more engaging to just think on a more architectural level. And for junior engineers, it's a really good paradigm to learn about the code base. It's kind of like having a senior engineer in your back pocket where you're asking Klein like, hey, can you explain the repository for me? If I wanted to implement something like this, what files would I look at? How does this work? It's great for that as well.
Wix
If you're moving on from competition, I have one last question. Competition?
Celestio
Yeah.
Wix
So there's Twitter beef with root code. I just want to know where the backstory is because you tweeted yesterday, somebody asked root code to add Gemini CLI support. And then you guys responded, just copy it from us again. And they said, thank you, we'll make sure to give credit. Is it a real beef? No, a friendly beef.
Pash
Uh, I think we're all just having fun on the timeline. Um, there's, there's a lot of forks that.
Saad
Like 6,000 forks.
Pash
Yeah, there's like, if you search Klein in the, in the VS Code marketplace, it's like the, the entire page just like forks of Klein and there's like even forks of forks that, you know, came out and raised like a whole bunch of money and it's. Yeah, the top three crazy.
Saad
The top three apps in Open Router are all client and then client fork. Client fork. Yeah, it's funny.
Pash
Yeah. Billions of tokens getting sent through like all these forks. Um, there's like, there's like fork wars and 10,000 forks and all you need is a knife, you know, so. No, it's, it's exciting. I think they're all really cool people. We got people in Europe forking us. We got people in China making like a little fork of us. I think Samsung recently came out with like a, was a Wall Street Journal article where they're using Klein, but they're using like their own little fork of Klein that's kind of isolated. You know, we encourage it.
Wix
Do you have any regrets about being.
Saad
Open source or not at all. I think Klein started off as this, like really good foundation for what a coding agent looks like. And people just had a lot of their own really interesting ideas and spinoffs and concepts about, you know, what they thought, you know, they. That they wanted to build on top of it was. And just being able to see that and see the excitement around just in the space in general has just been, I think, inspirational and has helped us kind of glean insights into what works and what doesn't work and incorporate that into our own product. And for the most part, I think for the Samsungs and all the organizations where there is a lot of friction in being able to use software like this on their code bases, it reduces that barrier to entry, which I think is incredibly important when you want to get your feet wet with this whole new agentic coding paradigm that's going to completely upend the way that we've written software for decades. So in the grand scheme of things, I think it's a net positive for the world and for the space. So no regrets.
Pash
In a lot of ways it's, you know, it's us and the forks. We were kind of there originally when we were like the only ones with this like, philosophy of keeping things simple, keeping things down to like the model, letting the model do everything, not cutting on, not trying to make money off of inference, going context heavy, reading files into context very aggressively, and kind of going back to cloud code. I was actually like, it was really nice to see that they, they came out and they validated our whole philosophy of like keeping things as simple as possible. And that kind of goes in with like the whole RAG thing, which is like rag was this early thing in like 2022, you started getting these vector database companies context windows were very small. This was like a way of people called it, like, oh, you can give your AI infinite memory. It's not really that, but that was like the marketing that was sold to the venture backers that were like investing in all these companies. And it became this narrative that really stuck around. And even now we get potential enterprise perspective. They're going through the procurement process and it's almost like they're going through a checklist asking, hey, do you guys do indexing of the code base and doing rag? And I'm like, well, why do you want to do this? I think Boris said it very well on this exact podcast where we tried rag. And it doesn't really work very well, especially for coding is like the way RAG works is you have to chunk all these files across your entire repository and chop them up into small little pieces and then throw them into this hyperdimensional vector space and then pull out these random chunks when you're searching for relevant code snippets. And it's like fundamentally it's so schizo. And I think it actually distracts the model and you get worse performance than just doing what like a senior software engineer does when they first they're introduced to a new repository where it's like you look at the folder structure, you look through the files, oh, this file imports from this other file. Let's go take a look at that. And you kind of agentically explore the repository. That's like we found that works so much better. And there's like similar things where it's like the simplicity always wins. Like this bitter lesson where fast supply is another example. So Cursor came out with this fast apply like they call the instant apply back in July of 2024 where the idea was models at the time were not very good at editing files. And the way editing files works in kind of the context of an agent is you have a search block and then a replace block where you have to match the search block exactly to what you're trying to replace and then a replace block just swaps that out. And at the time models were not very good. It was like, I forget, like GPT they were using under the hood at the time wasn't very good at formulating these search blocks perfectly and it would fail oftentimes. So they came up with this clever workaround to fine tune this fast apply model where they let these frontier models at the time, they let them be vague, they let them output those like lazy code snippets that we're all very familiar with, where it's like rest of the file here, like rest of the imports here. And then fed that into this fine tuned fast supply model that was probably like a Quinn 7B or something quantized very small, dinky little model. And they fed this lazy code snippet into this smaller model and the smaller model we fine tuned to output the entire file with the code changes applied. And one of the founders of Ader said this really well in very early GitHub discussions where he said like, well now instead of worrying about one model messing things up, now you have to worry about two models messing things up. And what's worse is the other model that you're giving that you're handing your production code to this like fastify model. It's like, it's a tiny model. Its reasoning is not very good. It's maximum output tokens. It might be 8,000 tokens, 16,000 tokens. Now they're training like 32,000 tokens maybe. And a lot of the coding files, like we have a file in our repository that's like 42,000 tokens long and that's longer than the maximum token output length of one of these smaller Fast supply models. So what do you do then? Then you have to build workarounds around that. Then you have to build all this infrastructure to like, pass things off. And then it's making mistakes. It's like very subtle mistakes too, where it's like, it looks like it's working, but it's not actually what the original Frontier model suggested. And it's like, slightly different. And it introduces all of these subtle bugs into your code. And what we're starting to see is as AI gets better, the application layer is reducing. You're not going to need all these clever workarounds. You're not going to have to maintain these systems. So it's really liberating to not be bogged down with RAG or with FAST Apply and just focus on this core agentic loop and maximizing Diff edit failures. Like in our own internal benchmarks, Cloudsonnet 4 recently hit a sub 5% or like around actually 4% diff at a failure rate. When Fast supply came out, that was way higher. That was like in the 20s and the 30s. Now we're down to 4%, right. And in six months, how does it go to zero? Well, it's going to zero. Like, as we speak, it's going to zero every day, you know, And I was actually talking with the founders of some of these companies that do Fast supply. They were trying to kind of work with us. Their whole bread and butter is fine tuning these fast supply models and, you know, like, relays and morph. And I had like a very candid conversation with these guys where I was like, well, there's a window of time where Fast Supply was relevant. Cursor started this window of time back in July. How much time do you think we have left until they're no longer relevant? Do you think it's an infinite time window? They're like, no, it's definitely finite. Like this. This era of fast appliance models is definitely coming to an end. And I was like, well, how long do you guys think they were? Like, maybe three months, maybe less. So I still think there's some cases where RAG is useful. You know, if you have a lot of human readable documents, a large knowledge base of documents where you don't really care about like inherent logic within them, like sure, index it, chunk it, do retrieval on it or FAST applies, like maybe if your organization you're forced into using like a very small model that's not very good at search and replace, like a deep SEQ or something, you know, maybe use a fast apply model.
Saad
I think Rag and Fast Apply were these just tools in a toolkit for when models weren't the greatest at large context or search and replace diff editing. But now they are extra ingredients that could make things go wrong that you just don't need anymore. There was an interesting article from Cognition Labs about you know, multi agent orchestration.
Celestio
And getting right into it. It's like you're on, you're on autopilot for us.
Saad
That's cool.
Pash
Yeah, I mean it's a great article by the way.
Saad
Yeah, it was great. They talked about how, you know, when you start working with different models, different agents, there's a lot that gets lost in the details and you know, the doubler in the details, that's those are the most important things and making sure that it doesn't, you don't have the agents running in loops and running to the same issues again and have sort of like all the right context. And so I think being close to the model, throwing all the context you need at it, not taking these cost optimized approach to pulling in relevant context using something like Rag or a cheaper model to apply edits to a file. I think ultimately yes, it's more expensive asking a model like Claude Sonnet to do sort of all these sorts of things to grep an entire code base and to fill up its entire context, but you kind of get what you pay for. And I think that's been another benefit of being open source is that our developers, they can peek under the kimono, they can see where their requests are being sent, what prompts are going into these things. And that creates a certain level of trust where when they spend 10, 20, $100 a day, they know kind of where their data is being sent, what model is being sent to, what prompts are going into these things. And so they get comfortable with the idea of spending that much money, get the job done.
Pash
Yeah, it's like not making money off of inference. I think the incentives are so they're so relevant in this discussion because you know, if you're incentivized to, you know, if you're charging you know, $20 per month and you're trying to make money on that. You, you're going to be offloading all kinds of important work to smaller models or optimizing for cost with rag like retrieval with rag not reading the entire file but maybe reading like a small snippet of it. Whereas if you're not making money off inference and you're just going direct, you know, users can bring their own API keys, well then all of a sudden you're, you're not incentivized to cut down on cost. You're actually incentivized just to build the best possible agent. And we're starting to see this trend of the whole industry is moving in that direction. Right? You're starting to see everyone open up to pay as you go models or pay directly for inference. And I think that is the future.
Wix
What's the client pricing business model right.
Saad
Now it's bringer an API key essentially just whatever pre commitment you might have to whatever inference provider, whatever model you think works best for your type of work. You just plug in Your anthropic or OpenAI or open router, whatever it is, API key into client and it connects directly to whatever model you select. And I think that level of transparency, that level of we're building the best product. We're not focused on sort of capturing margin on you know, the price obfuscation and clever tricks and model orchestration to you know, keep costs low for us and optimize for higher profits. I think that's put us in this like unique position to really push these models to their full potential. And I think that's shown, you know, I think that's, that's you get what you pay for, throw a task in client and, and it gets expensive.
Pash
But that's the cost of intelligence.
Saad
The cost of intelligence.
Wix
Yeah.
Saad
So yeah, the business model right now is you get to choose kind of where it's open source, you can fork it, you can choose where your data gets and you can choose who you want to pay. A lot of organizations we've talked to get some, you know, a certain level of volume based discounts with, with these providers and so they could, they can take advantage of that through client, which is helpful because Klang can get pretty expensive and.
Celestio
Yeah, wait, so I mean I'm still not hearing how you make money like you said, you don't. Huh? Why, why make money? Yeah, because you have to pay your salaries.
Pash
No, that's the, that's the, A lot of people ask us that and I always just Throw the why at them. But it's.
Celestio
You sound like the party fool, guys.
Pash
Party fool is like, the real answer is enterprise. So we can say.
Celestio
Because you're, you know, we release this when you launch it.
Pash
Yeah, yeah. So you want to talk about enterprise?
Saad
Yeah. I think being open source and bringing an API key has given us a lot of easy adoption in these organizations where things like data privacy and control and security are top of mind. And it's hard to commit to sending their code in plain text to God knows what servers, training their data to do. Training their data on models that might output their IP to random users. I think people are a lot more conscious about where their data is getting sent and what's being used to it. And so it's given us this opportunity to say, okay, nothing passes through our own servers. You have total control over the entire application where your data gets sent. And that's given organizations that we've been talking to over the course of the last couple of months this sort of like, easy adoption and I think this opportunity for us to work more closely with them and say, what are all the things that we can do to help with adoption in the rest of your organization? Essentially, how can we pour gasoline on sort of the evangelism that people have for Klein and these organizations and spread the usage of agenta coding, I think, at an enterprise level?
Pash
Well, yeah, what's. What's crazy is so we. We had. We open source Klein people really liked it. Developers were using it within their organizations. Their organizations were kind of like, reluctantly okay with it because they saw, like, we're open source and we're not sending our data, their data anywhere they could use their existing API keys. And then we launched, like, on our website, like a contact form for enterprise. Like, if you're interested in an enterprise offering, hit us up. And we had no real enterprise product at the time. And it turned out like we just got this massive influx of big enterprises reaching out to us. And we had a Fortune 5 company come up to us and they were like, hey, we have hundreds of engineers using Klein within our organization. And this is a massive problem for us. This is like a fire that we need to put out because we have no idea what API keys they're using, how much they're spending, where they're sending their data. Please just like, let us give you money to make an enterprise product. So the product kind of just evolved out of that.
Saad
Right? Right. I mean, it's. It really just comes down to more of listening to our users. So right after we put out this page. We just had a lot of demand for sort of like the table stake enterprise features, the security guardrails and governance and insights that sort of like the admins in these organizations need to. To reliably use something like Klein.
Wix
Yeah.
Saad
We've gotten a lot of people wanting us to sort of give them two things, invoices just to help with like all the budgeting and spending, the, you know, thousands of dollars.
Pash
All the Europeans. Yeah.
Saad
Just the other thing which I thought was a little bit surprising was some level of insight into the benefit that clients providing them. So it could be our sage or lines of code written because it allows these sort of like AI forward drivers for adopting these sorts of tools in these organizations to take that as a proof point and go to the rest of their teams and say, this is how much client's helping me. You need to start adopting this so we can keep up with the rest of the industry.
Celestio
This is for like internal champions to prove the roi.
Saad
Exactly.
Celestio
Okay.
Saad
Use as sort of evidence for this, you know, to justify the spend.
Celestio
Yeah.
Saad
But also to promote the product in.
Celestio
These organizations we can do this afterwards, but we'd like to talk to those and actually feature some of them what they're saying to their bosses on the podcast so that we can get a sense. Because like oftentimes we here we only talk to founders and builders of like the dev tool, but like not the end consumer. And actually we want to hear from them. Right. Like about how they're thinking about it, what they need. Kind of cool. One thing I wanted to ask to double click on is the relationship between open router and then like your. Your enterprise offering. Right. So my understanding is currently everything runs through open router.
Saad
Not everything. So you can bring API keys to OpenAI anthropic bedrock and then you have.
Celestio
A direct connection there, if the user.
Saad
Has a direct connection there.
Celestio
But everything else would run through open router. And so basically the enterprise version of client would be you have your own open router that you would provide visibility and control to that enterprise.
Pash
Yeah, that's for the self hosted option. Right. There's a lot of enterprises where they're okay with not self hosting, but as long as they're using their own Bedrock API keys and stuff like that. Whereas the ones that are really interested in like self hosting or like that want to be able to manage their teams, there would be like this internal router going on.
Celestio
The curious thing here is like, what if. What does model cost? Just go to zero. Like Gemini code. Just comes out and it's like, yeah guys, it's free.
Saad
Well yeah, that'd be great for us. So our, our thesis is inference is not the business.
Celestio
You would just never make money on inference.
Saad
Yeah, we want to give the end user total transparency into price into which I think is like incredibly important to even get comfortable with the idea of spending as much money as you do. I think the price obfuscation in this space has given developers this reluctance to opt into usage based plans. And we're seeing a lot of people kind of converge on this concept of okay, maybe have a base plan just to use the product, but sort of get out of the way of the inference and respect the end developer enough to give them the level of insight into not just the cost but the models being used and give them more confidence in spending however much it takes to get the work done. I think you can use tricks like Rag and Fast Apply and things like that to keep costs low, but for the most part there's enough ROI on coding agents where people are willing to spend yeah money to, to get the job done.
Pash
And for a truly like good coding agent, the ROI is almost hard to even calculate because there's so many things that I would have never even bothered doing. But then I now I have client and I could just like do this weird experiment or do this side project or you know, fix this random bug that I would have never even thought about. So like how do you measure that? Right.
Celestio
One variant of this problem we're about to move on to context engineering and memory and all the other stuff. One variant of this I wanted to touch on a little bit was just background agents and multi agents. So the instantiations of this now I would say are background agents would be codecs, for example, like spinning up one PR per minute or Devin or Cognition. So would you ever go there? That's one concrete question I can ask you. Would there be client on the server, whatever. And then the other version is still on the laptop but more sort of parallel agents like kind of. The Kanban is currently very hyped right now. People are making like Kanban interfaces for cursor and also for cloud code. Just anything like in the parallel or background side of things.
Pash
We're releasing a CLI version of Klein and using the CLI version of Klein. It's fully modular so you can ask Klein to run the CLI to spin up more clients or you could run client in some kind of cloud process, in a GitHub action, whatever you want. So the CLI is Really the form factor for these kind of fully autonomous agents. And it's also nice to be able to tap into an existing client CLI running on your computer and be able to take over and steer it in the right direction. So that's also possible. But what do you think, Saad?
Saad
I don't think it's an either or. I think all these different modalities complement each other really well. So the Codex, the Devins cursor's background agent, I think they all sort of accomplish the same thing. If we were to come out with our own version of it, I'd say that it would be the foundation for how other developers could build on top of it. So Nick's older brother Andre, he's sort of thinking 10 years ahead and it always kind of blows my mind a little bit about some of the ideas that he has about where the space is going. But we recently had a discussion about building this open source framework for coding agents for any sort of platform, building the SDK and the tool necessary to bring client to Chrome as an extension to the cli, to Jetbrains, to Jupyter notebooks, to your smart car, whatever it is, but to build your fridge. Your fridge, exactly.
Pash
Microwave maybe?
Saad
Yeah, exactly. I mean this is what we saw kind of like with the 6,000 forks on top of client is we sort of like put together this foundation for how this community of developers we sort of put together this foundation that this community of developers could build on top of and sort of take advantage of their experiments and imagination and their creativity about where the space is headed. And I think looking forward, building an open source foundation and the building blocks for how we bring something like client to things that go outside the scope of software development or VS code extension, I think that'll open up the door to things that ultimately complement each other really well. But it'll never be sort of this either or thing. I think background agents are good for certain kinds of work and parallel Kanban multi agents might be good for when you want to experiment and iterate on five different versions of how a landing page might look. And then something like a back and forth with a single agent like Klein works really well for when you want to, you know, pull context and put together a really complicated plan for a really complex task. And I think all these different tools will ultimately end up complementing each other and people will kind of develop a taste and an understanding for what works best for what kind of work. But I think something just looking 10 years ahead, we at the very least want to sort of be at the frontier of providing sort of the building blocks for what the next thing is. After background agents or you know, multi agents.
Celestio
I was going to go into context engineering. Kind of like topic du jour. I think that this is kind of similar ish in the thread to RAG and how RAG is a mind virus, which I love by the way that the way that you phrased it. Yeah. You have in your docs context management. You also have a section on memory bank which is kind of cool. I think a lot of people are trying to figure out memory. Let's just start at the high level and then we'll go into memory later. What does context engineering mean to you?
Saad
Context engineering mean to me means prompt engineering. Yeah.
Celestio
Right. So I think there is a lot of art to what goes in there. I think that really is like the 8020 of building a really good agent is like figuring out what goes into the context. I think interplay between MCP and your system client recommended prompts I think is what is ultimately making a good agent.
Pash
Yeah, I think context management is like one part of it is what you load in to context. The other part of it is how do you clean things up when you're reaching the context window. Right. How do you curate that whole life cycle from zero to maximum context window? And the way that I think about it is there's so many options on the table and there's so many risks to misdirecting the agents or distracting the agents. There's ideas about, you know, RAG or other kinds of forms of retrieval. That's. That's one idea. There's the agentic exploration. That's another idea that we found works much better. And it seems like the trend is generally for loading things into context. It's giving the model the tools that it can use to pull things into context. Letting the model decide what exactly to pull into context as well as some hints along the way. Kind of like a. Like a map of what's going on. Like ASTs, abstract syntax trees potentially what tabs they have open in VS Code. That was actually in our internal kind of benchmarking that turned out to work very, very well. It's almost like it's reading your mind when you have like a few tabs.
Celestio
Open it me out because like sometimes then I'm like, I have like unrelated tabs open and I have to go close them before I take off the thing.
Pash
I wouldn't think too much about especially when you're using Klein. Klein does a pretty good job of just Navigating that. But I definitely. There are edge cases, right? There's edge cases for everything. And it's kind of like, okay, what's like the majority use case is like, you know, when are you starting a brand new task and you don't have a single tab open that's relevant to it. Obviously in the CLI you might. You don't have that little indicator. So there you have to think outside the box for that. So that's like for reading things into context and then for context management is when you're approaching the full capacity of the context window is how do you condense that? And we've played around with this kind of naive truncation very early on where we just like throw out the first half of the conversation.
Celestio
That's common.
Pash
And there is problems with that, obviously, because it's like kind of like your halfway through a book and you're like, you start reading halfway through, right? You don't know anything that happened beforehand. And we like to think a lot about like, narrative integrity is like every task in Client is kind of like a story. It might be a boring story where it's like this lonely coding agent that's just, you know, determined to help you solve, you know, whatever it is. Like the child, like the big thing that the protagonist needs to overcome is like the resolution of the task. Right. But how do we maintain that narrative integrity where every step of the way the agent can kind of predict the next token, which is like predict the next part of the story to reach that conclusion. So we played around with things like cleaning up duplicate file reads. That works pretty well. But ultimately this is another case where it's like, well, what if you just give the model, like, what if you just ask the model, like, what do you think belongs in context? Another form of this is summarization, which is like, hey, summarize all the relevant details and then we'll swap that in. And that works really, really well.
Celestio
Yep. Double clicking on the AST mention. That's very verbose. When do you use that?
Saad
Right now it's a tool. The way that it works is when client wants. When client's doing sort of the agentic exploration of trying to pull in relevant context. And it wants to sort of get an idea of what's going on in a certain directory. For example, there's a tool that lets it pull in all the sort of language from a directory. So it could be the names of classes, the names of functions, and that gives it some idea of, okay, here's what's going on in this folder. And if it seems relevant to whatever the task is trying to accomplish is then it sort of like zooms in and starts to actually read those entire files into context. So it's essentially a way to help it kind of figure out how to navigate through large code bases.
Pash
Yeah, we've seen some companies working on. It's like an interesting idea, it's like an ast, but it's also a knowledge graph. And you can run these discrete deterministic, almost like actions on this knowledge graph where you could say like, hey, find me all the functions that. Find me all the functions in the code base and find me all the functions that aren't being used and delete all of them. And the agent can kind of reason in this, almost like SQL, like language working with this knowledge graph to do these kinds of global operations. Like right now, if you ask a coding agent to go through and remove all unused functions or do like some kind of large refactoring work, in some cases it might work, but very oftentimes it's just going to struggle a lot, burn a lot of tokens and fail ultimately. Whereas with these kinds of tools it can actually operate on the entire repository with these kinds of query, like short little query statements. I think there is a lot of potential in something like this where it's like the next level beyond the AST and it's like a language for querying this, this kind of knowledge graph. But like we've seen with, with like the Cloud 4 release is these frontier model shops, they tend to train on their own application layer and you might come up with like a very clever tool that in theory would work, work really well. But then it doesn't work well with Claude 4 because Cloud 4 is trained to grep, right? So that's another interesting phenomenon where it's like you're expecting these frontier models to become more generalized over time, but instead they're becoming more specialized and you have to support these different model families just.
Wix
To wrap on the memory side. Memory is almost the artifact of summarization. So you summarize the context and then you kind of extract some sides. Any interesting learnings from there, like things that are maybe not as intuitive, especially for code. I think people grasp the memory about humans, but what are memories about code bases and things look like?
Saad
I think memories right now for the large part are mostly useless. I think the kinds of memories that you might want the coding agent to hold onto are specific quirks about how your team works and the project or certain Rules only use Camel case, for example. It's better to place those sorts of things in a general guideline or rules file, for example. But I found that this idea of asking the agent, at least coding agents to hold onto certain memories about the project or how you work or things like that, you mostly have to force it to store those things into memory. And, and, and I don't think people, they don't want to have to think about those sorts of things. So it's something we're, we're thinking about is, is how can we hold on to the tribal knowledge that these agents learn along the way that people aren't documenting or putting into rules files without the user having to go out of their way to sort of force them to store these things into a memory database, for example.
Pash
Those are like kind of like workspace rules or tribal knowledge, like general patterns that you use as a team. But then there's like in our, we ran this like internal experiment where we built this to do list tool where it was only one tool where you could just write the to do and every time you could like rewrite the to do from scratch. And we would passively, as part of every, like, not every message but like every once in a while we would pass in this context of what the latest state of this to do list is. And we found that that actually keeps the agent on track after multiple rounds of context summarization and compaction. And it could all of a sudden build like an entire complex kind of task from scratch over, you know, 10x the context window length. And in internal testing this is like very, very promising. So we're trying to flesh that out and I think something like that. We had earlier versions of the memory bank which actually are like. Nick, Nick Bauman, our marketing guy, came up with this memory bank concept where it was like this Klein rules where he would tell Klein like, hey, whenever you're working, have the scratch pad of what you're working on. And this is like a more built in way of doing that. And I think that also might be very, very, very helpful for the agents to just have like a little scratch pad of like hey, what have I done so far? What's left? Specific app file mentions like what kind of code we're working on, general context and passing that off between sessions. Yeah.
Wix
Any thought on CloudMD versus AgentsMD versus AgentMD? I built an open source tool called Agents927, like the xkcd that just copy paste this across all the different file names so all of them have access to it. Do you think there should be a single file? Like, there's also, like, the ID rules versus the agent rules. There's kind of, like, a lot of issues.
Saad
I actually think it's fine that each of these different tools have their own specific instructions, because I find myself using a cursor rules and a client rules separately. When I want Klein, the agent, I want him to work a certain way. That's different than how I might want cursor to interact with my code base. So I think each tool is specific to the kind of work that I do, and I have different instructions for how I want these things to operate. So I think I've seen, like, a lot of people complain about it, and I get that it could make code bases look a little bit ugly, but for me, it's been, like, incredibly helpful for them to be separated.
Celestio
I noticed that you said him. Does Klein have Klein's Theater?
Saad
Yeah.
Celestio
Okay. Does he have a whole backstory personality?
Saad
So Klein is a play on CLI and Editor because it used to be.
Celestio
Claude Dev and now it's Client.
Saad
Yeah. I feel like Klein kind of stands out in the space for having. For being a little more humanized than something like, you know, a cursor agent or a co pilot or a cascade.
Celestio
And I think there's Devin, which is a real name, you know.
Pash
Well, yeah, Claude is a real name, I think.
Saad
Yeah. Yes, I've been. I've been. I think we've all been intentional about just sort of humanizing it because it, at least in working with. Kind of gives you more confidence in it and that I could, like, lean on it a little bit more. There's. There's kind of a. Of a trust building with. I think, with an agent. And the humanizing aspect of it, I think, has been helpful to me personally.
Pash
And this goes back to, like, the narrative integrity, which is. It's actually really important, I think, to anthropomorphize agents in general, because everything they do is like a little story. And without having a distinct kind of identity, you get worse results. And when you're developing these agents, that's kind of how we need to think about them. Right. We need to think that we're, like, crafting these stories. We're almost like Hollywood directors, right? We're. We're putting all the right pieces in place for the story to unfold. And yeah, having an identity around that is really, really important. And Klein, you know, he's a cool little guy. He's, you know, he's just a chill guy. He's a chill guy. He's helping us out. You know, he's always, like, happy to help, or he told him to not be happy. He could be very grumpy, you know, so that's great.
Celestio
Awesome. I know you're hiring. You have. You're 20 people now. You are aiming to 100. You have a beautiful new office. What's your best pitch for working a client?
Saad
A lot of our hiring right now is so far it's been just friends of friends, people in our network, people that we've worked with before, that we've trusted and that we know can show up for this incredibly hard thing that we're working on. And there's a lot of challenges ahead, and I think the problem space is probably the most exciting thing to be working on right now. Engineers in general love working on things that make their own lives easier, and so I couldn't imagine working on something more exciting than a coding agent. And that's a little bit biased, but I think a large part of it is it's an exciting problem space. We're looking for really motivated people that want to work on challenges like figuring out what the next 10 years looks like and building kind of the foundation for, you know, what comes next after background agents or multi agents and really help in sort of defining how all this shapes up. We have this, like, really excited community of users and developers. I think being open source has also created a lot of goodwill with us, where a lot of the feedback we get is, like, incredibly constructive and helpful in shaping our roadmap and the product that we're building and working with. A community like that is, like, one of the most fulfilling things ever. Right now, we're. We're kind of in between offices, but, you know, doing things like go karting and kayaking and things like that. So it's. It's a lot of hard work, but, you know, we make sure to. To have fun along the way.
Pash
So, yeah, no, like, Klein is a. It's a unique company because it really does feel like we're all just like, friends building something cool. And we work really, really hard. And the space is. It's not just competitive, it's like hyper competitive. There's, like, capital is flowing into all, every single possible competitors. We have forks of forks, like I said, raising tens of millions of dollars. And we're growing very rapidly. We're at 20 people now. We're aiming to be at 100 people by the end of the year. And being open source, it has its own challenges. It's like people, we do all this research, we do all this benchmarking work to make sure our diff editing algorithm is robust. The way we're working with these models to optimize for the lowest possible diff edit failures. And then we open source that, and then we post it on Twitter and someone's like, oh, thanks so much for open sourcing that I'm going to go and, like, raise a bunch of money with, like, our own product with it. But the way that I see it is like, this is, you know, let them copy. We're the leaders in the space. We're, we're kind of showing the way for the entire industry. And being an engineer and building all this stuff is super exciting. So working with all these people is just amazing.
Wix
Okay, awesome. Thank you guys for coming on.
Pash
Yeah, thank you so much.
Saad
So much fun.
Release Date: July 16, 2025
Host: Latent.Space (Celestio and Wix)
Guests: Saad and Pash (Cline)
This episode features the founders of Cline, an open-source coding agent that has quickly become popular among developers for its modularity and transparent approach. Cline is discussed in the context of the rapidly evolving AI agent ecosystem, open-source dynamics, agentic coding paradigms, and the economics of modern developer tools. The conversation explores Cline's philosophy, technical implementation, impact on developer workflows, integration with MCP servers, open source challenges, and vision for the future of software engineering with AI.
Built in the Age of Foundation Models:
Long Context Handling: Leveraged Claude 3.5's improved context window to let agents reason over large codebases.
Flexible Use Cases:
The episode maintains an open and pragmatic tone, candidly addressing technical obstacles, product philosophy, and the realities of running and monetizing open-source AI tools. The founders are transparent, critical of industry trends when warranted, and optimistic about open ecosystems and the future of agentic software engineering.