Transcript
A (0:01)
Today on the AI Daily Brief, how to build your agentic operating system. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. Alright friends, we are back with another operators bonus episode. Today we are once again joined by Nufar Gaspar, who is here to introduce the latest free AIDB training program, Agent OS. Now those of you who have done AIDB New Year or maybe Claw Camp will be familiar with this style. It's a self-directed, build-based program that's going to give you the tools you need to build a complete agentic operating system. And for those of you who are wondering what the difference between this and Claw Camp is, in many ways they share a lot of foundations. But whereas Claw Camp is of course focused exclusively on OpenClaw, this is an updated program that is meant to be platform, model, and harness neutral. Whether you're working with Claude Code or Codex or some other set of tools, Agent OS is about building dynamic, extensible, and adaptable agentic systems that you can continue to evolve over time. But instead of me yapping about it, let's turn it over to Nufar to introduce the program and get you building your agentic operating system. All right, Nufar, welcome back to the show. We got a fun one today.
B (1:25)
Yes, we do.
A (1:27)
So this started kind of emerging around the time that I had started hacking around on Claw Camp. We're going to talk about this Agent OS idea today, but before we dive in, give us a little bit of background on where it came from.
B (1:40)
Like most other people, I've been hacking around with these systems for a few months, realizing that different tools, as we will explore in a minute, are converging, and that I'm still not getting optimal results unless I'm being very deliberate about what I'm putting underneath. So basically this is the accumulation of all my thinking around how you can make any agentic tool work much better for you. It's a framework that will not give you any one novel concept, but the overall accumulation of everything creates a much better system for any tool, whether it's OpenClaw or any other agentic tool.
A (2:18)
Kind of moving from agents to agent systems, almost. And especially as, I mean literally as we're recording this, OpenAI just dropped their workspace agents. The innovation is going to keep coming. I think the idea of having systems underneath that are extensible to the new platforms and keep you organized is going to be increasingly valuable. So I'm excited to dig into this with you.
B (2:38)
Excited to share with everyone my best methods, as of this morning at least. In today's episode, as mentioned, I will go over the full system that everybody who wants to get more out of the new wave of agentic tools should build for themselves. By way of background, what's happening today is that basically every agentic tool is becoming every agentic tool. You said it in one of the shows over the last few weeks, and you're correct. Cursor just added agents and automations. Claude Code added new memory systems and also allows you to communicate with it from other channels. OpenClaw and Codex run in the background. And Hermes, the up-and-coming open source agent from Nous, has a similar architecture. Whether it's Windsurf or Antigravity or any of the numerous tools out there, they're all converging on the same set of capabilities. Which means that the tool you pick matters less and less, and what matters much more is the system that you build underneath it. So you want to build a system that captures how you work, what you know, and what you need the AI tool to do for you.
One important framing before we go any further: most of the discourse around agentic tools focuses on coding, and today I want to focus primarily on knowledge work. So whether it's strategy, communication, operations, decision making, research, management, or anything else that the knowledge worker does, that is where most professionals live, and that is where the Agent OS, or agentic operating system, makes the biggest difference. And if some of these tool names are new to you, please don't disconnect. Everything that we're going to cover today boils down to human-readable text files and configurations. So if you can write a document, you can build an agentic operating system. And I'm seeing people building incredible things with the new wave of these tools.
Many of you are using them seriously and getting real results. But here's what clearly differentiates how well any tool works for you, and that is the underlying system: whether you've carefully built it, maintain it, and improve it over time. That's the biggest unlock, and most people haven't done it. It's not because they can't, but because they are building on instinct or best effort, and they end up getting suboptimal results from tools that are capable of so much more.
The proof that the underlying system is the thing that matters is that every one of these agentic tools is basically doing the same thing under the hood. They are reading text files that define who you are, what you know, what you can do, what you remember, and what you can reach. So almost everything that we're going to build today is a text file, and that's that. Which means that the work you do to build your system is portable. And while the tool owners and the tool companies might not want you to think so, that's the reality of every tool becoming every tool. When you switch tools or add a new one, all you have to do is point the tool to the same folder and it reads the same files. No migration, no rebuild. So the tool choice is becoming the least important decision, in my opinion, that you need to make. What does make a world of difference is what you build underneath.
The underlying system has a name and, as mentioned, we're calling it the agentic operating system. That's why we built a free program to help everybody build one for themselves, regardless of which tool they use. More on that at the end. Here is what we'll cover today. First, why the Agent OS is the most important and most overlooked concept in practical AI. Then we will quickly go over the seven layers and what actually goes in each one. And we'll talk about how to build the first version of this, this week, using the running example of a chief of staff agent.
So your agentic operating system has seven layers, and you can see them on screen. All of your agents will run on top of these layers, and each agent basically inherits the whole foundation. So the OS is what makes agents effective, regardless of which tool you use and regardless of how many agents you add over time. You build the OS once and you maintain it, and then every agent you add gets better because the foundation is there. We're going to walk through each layer so you will have a better sense of what I mean by that.
Before I go into the layers, just to make it more concrete, I want to build one specific agent as we go through them. Let's focus on a chief of staff agent. That will be the agent that reviews your inbox. Perhaps you let it prepare you for a meeting, track every commitment you make across calls, maybe flag any blind spot that you have. This agent can perhaps draft your weekly updates, know your people, know your priorities, and so on. This is one of the first agents, by the way, that I built for myself. I built Chloe. She runs on OpenClaw and she's the front door to my entire system, which I will briefly share later on. So of all the agents that you can build, the chief of staff is probably the one that helps you the most in the day to day. Whether you're an individual contributor just getting started in your career, or a seasoned executive already used to managing a team of assistants, everybody would benefit from having a chief of staff. And eventually your chief of staff can become the agent that manages the other agents. So that's what we will build today, layer by layer.
The first layer is identity, and it answers one question: who are you, and what rules do you want enforced every single time the agent talks to you? This is the file that your tool reads first. It will read it before any question you type and before memory. It's the first thing. In OpenClaw, it's called SOUL.md.
In Cursor it's called AGENTS.md. In Claude Code it's CLAUDE.md. In GitHub Copilot, it's copilot-instructions.md, and so on. Different names, but the same idea: a text file that tells the tool who it is working for. If you've never proactively written this file, your agent starts from zero, or from whatever it was able to collect randomly along the way. And if you don't have a high-intention, regularly updated identity file with a sufficient amount of information in it, you are missing a huge opportunity to get so much more out of the agentic tool, or out of the agent that you will build on top of it. So you write it once and you enforce it forever. Or rather, the tool does.
In terms of what goes into a good identity file, it has to include who you are and how you communicate, whether it's direct or diplomatic, bullets or prose, short or thorough, and so on. It also needs to include what you value: whether you prefer concise versus lengthy, whether you prefer that it challenge your thinking versus execute what you say, show its reasoning or just give you the answer, and so on. And it should include any specific rules you have, things the AI should never do. For example: never send external email without showing me a draft, never flatter me, or always tell me what I'm not seeing. So that will be the identity.
And here's the key to actually building a good identity file. You don't write the file from scratch yourself. You will hate it and you will quit. Instead, I encourage you to brain dump to an AI and let the AI interview you. So open any AI tool, ideally one that already has sufficient memory of you from ongoing work, and say: I'm building my AI identity file, please ask me 15 questions about how I work, what I want, what I don't want, what frustrates me about AI today, and what rules I want enforced. You answer, ideally out loud, because it's much easier to speak to an AI tool.
And then the AI will draft, you will edit, and you will ship a first version that is, let's say, about 70% right. You can patch it over the next three weeks as you notice the gaps. And basically this is the methodology for every layer that we will cover today. You brain dump, you let the AI interview you, you draft, you ship a minimum viable version of the file, and then you improve. For our chief of staff, identity will capture your communication style, your pet peeves, your non-negotiables. For example: never let me walk into a meeting without a pre-read, always tell me who I still owe a reply to, or flag when I'm overcommitting for the next week.
And then we have the context. The context is what you know. And what you know is the single biggest predictor of whether AI gives you generic output or something genuinely useful for your actual situation. Generic AI advice is just one Google search away. What you cannot get from the public internet is your situation: your roadmap, your org chart, your customer segments, your priorities, and so on. And unlike other layers, this is not something that will be solved by better models. Your specific context will always be yours, and no model improvement will ever know what you're shipping next quarter or who your key stakeholders are unless you tell it. Context files are the documents in your workspace that your agent reads on demand. They are not part of the prompt you type. They are the library that the agent can reach for when the task needs it.
Now there is a big trap here. If you try to do all your context engineering in one session, you will produce a 40-page document that you never update. That's not context. It's just a novel that quickly goes stale. What actually works here is to have three to five focused files, each a single page. Each covers one thing, whether it's my team, my product, my customers, my quarter, my stakeholders, and so on.
Make each one dated and fresh, and update it when things change. We often call it context curation. It's not a project that has a beginning and an end, it's a practice. Every time you catch yourself re-explaining something about your situation to AI, that thing should have been in a context file. Write it down, add it to the library, and then move on. For example, for our chief of staff, we will at the very minimum need a stakeholders file: who reports to you, who you report to, key cross-functional partners, what each one cares about, and so on. A strategy and priorities file: what you're trying to achieve this year and what the organization is focused on. And an operating principles file: how decisions get made, what you push back on, what you escalate, and so on. I believe that context curation is the single fastest path to AI value. The moment anyone gets this, whether it's you or people around you, the conversation changes. They stop asking what AI tool should I use and start asking what knowledge do I have that isn't written down anywhere? And of course, if you want to go deeper on this layer, NLW did an amazing episode a few days ago called How to Build a Personal Context Portfolio and MCP Server, and it walks you through it hands-on, including templates and companion apps. He did such a good job that I'm going to let NLW take it from here.
So identity is who you are, context is what you know, and skills are how you work. A skill is a reusable instruction set for the AI, for a workflow that you do repeatedly. It can be the weekly status update, the meeting prep, stakeholder emails, and so on. It can be decisions and memos, you name it. I believe that every knowledge worker easily has 20 or 30 of these patterns, and each one can be written as: when I say some kind of trigger, do some kind of process, using the following sources, and produce output in this format.
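As a rough sketch of that trigger / process / sources / output pattern, a skill can be rendered as a small text file. The field names, the file layout, and the example pre-read skill below are illustrative assumptions, not a format that any particular tool requires.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    """One reusable workflow: trigger -> process -> sources -> output format."""
    name: str
    trigger: str
    process: list[str]
    sources: list[str]
    output_format: str

    def to_text(self) -> str:
        """Render the skill as a plain, human-readable text file body."""
        lines = [f"# Skill: {self.name}", "", f"When I say: {self.trigger}", "", "Do:"]
        lines += [f"{i}. {step}" for i, step in enumerate(self.process, start=1)]
        lines += ["", "Using these sources:"]
        lines += [f"- {src}" for src in self.sources]
        lines += ["", f"Produce output in this format: {self.output_format}"]
        return "\n".join(lines) + "\n"


# Hypothetical example skill; every value here is a placeholder to edit.
pre_read = Skill(
    name="pre-read",
    trigger="prep me for <meeting>",
    process=[
        "Find the meeting on my calendar and list the attendees.",
        "Summarize the last related thread and any open commitments.",
        "List the decisions I need to make in the meeting.",
    ],
    sources=["calendar (read-only)", "inbox (read-only)", "stakeholders context file"],
    output_format="one page, bullets, decisions first",
)

skill_text = pre_read.to_text()
print(skill_text)
```

Because the result is plain text, the same file can be dropped into whichever folder your tool reads skills from and patched by hand as you notice gaps.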
Without a skill, you're basically re-explaining the format every time, pasting the same sources every time, and complaining every time that AI writes in a weird voice, because you never bothered to teach it your voice. A skill fixes that, and if you write it once and write it well, it fires forever. Again, I want to encourage an MVP skill, not a perfect skill, to begin with. The first version is never perfect and is usually kind of long, but you use it for a week and you notice when it's off and what it needs to improve. You patch it, and a few weeks in, that skill is writing a better first draft than you'd ever get from starting over each time. For our beloved chief of staff agent, a few skills that you can consider: a pre-read skill that produces a one-page pre-read for any meeting. A daily brief that scans everything that is open in my inbox, Slack, or calendar and gives me what is on my plate for today. Voice match, which helps AI write like you. A commitment tracker. I can think of many others, and I'm sure you can too. And of course, if you want to go deeper on skills, we did a skills masterclass episode on this show a few weeks ago.
Memory is the next layer, and this is where every agentic tool company is probably investing extensively right now. That is happening for a very good reason, because it is clearly one of the biggest unlocks. It's the core part of what makes OpenClaw feel like magic. Claude Code recently added auto memory, Cursor has project-level memory, and things are changing on a daily basis on the memory front. Because it's so important, every tool is racing to improve, and they frequently copy each other's breakthroughs. So what's currently a limitation in one tool will probably be solved by the next time you look at a new release. So the question is, if so many experts are working on memory, what does it mean for you in practical terms? So first the good news.
You can just lean on the existing memory in your tool, and it will keep getting better for sure. But at least as of now, no matter which tool you're using and how well its memory is working, there are a few things that you should do. At the very minimum, I want you to understand how your tool's memory works. You can ask it directly: explain how your memory system works. What do you remember between sessions? What do you forget? You need to know what you're working with before you can improve it. You also need to know the limitations. Most tools still have gaps in cross-session memory, in what they retain versus discard, and in how the context window interacts with the stored memory. By understanding the limitations, you will get significantly better results.
For the more advanced users, I would encourage you to add specialized memory for your work contexts. Some people maintain a running log, others create structured memory files, and some people use dedicated memory tools or MCP servers. What I want you to be is deliberate about what gets remembered. Your agent does remember things on its own, but it doesn't always pick up the right things. A major decision, a change in priorities, or the end of a very long session might not be picked up the way you would expect, so you should deliberately get the agent to remember it. These are things that you should make into a habit, or even create a dedicated skill for, to help your agent remember the way you'd like it to. There is also a lot of variance between the tools, and that's something that you need to know. So it's a layer that you can deepen over time. You don't need to solve it on day one, but the better the memory works, the better the overall experience becomes, because memory is what makes every other layer stick across sessions.
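One minimal way to be deliberate about what gets remembered is a dated, append-only log kept as a text file in your workspace. The sketch below assumes a hypothetical `memory/log.md` path and a made-up entry format; the `remember` helper, the dates, and the notes are placeholders, and your tool may well have its own memory files that you would write to instead.

```python
from datetime import date
from pathlib import Path
from typing import Optional
import tempfile


def remember(log_path: Path, kind: str, note: str, today: Optional[date] = None) -> str:
    """Append one dated entry to a plain-text memory log and return that entry."""
    stamp = (today or date.today()).isoformat()
    entry = f"- {stamp} [{kind}] {note.strip()}"
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as f:
        f.write(entry + "\n")
    return entry


# Hypothetical workspace and entries, written to a temp folder for the demo.
workspace = Path(tempfile.mkdtemp())
log = workspace / "memory" / "log.md"

remember(log, "decision", "Chose plan B for the launch; plan A was too slow.", date(2025, 1, 15))
remember(log, "priority", "Hiring moves ahead of the website refresh this quarter.", date(2025, 1, 16))

memory_text = log.read_text(encoding="utf-8")
print(memory_text)
```

The point of the design is that entries are only ever appended and always dated, so a stale note is easy to spot during a later audit.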
Going back to our chief of staff: beyond the general memory, you might want to create dedicated memory for decision logs, for example what was decided, why, what alternatives we were contemplating, and so on. You might also want to include a dedicated memory for learnings about the working process, because you want the chief of staff to own ongoing improvement as well. And perhaps you want it to remember relationship context: how a conversation with a specific stakeholder went, what they reacted well to, and so on. This is more structured than the generic memory that any tool will provide, and it will really help your chief of staff agent be a much better companion and helper for you.
So everything so far has made your AI smarter about you. We move to the next layer, which is connections, and that makes it capable of acting in the real world. Connections are how your agent reaches real systems, whether it's email, calendar, Slack, Jira, Salesforce, or your databases. There are various ways to get there. We have MCP, the Model Context Protocol, which is the open standard that many tools support. Recently we're seeing CLI tools that give your agent more judgment to decide on its own how to interact with the external system. And there's always the option of direct API access or scripting to get connected. You don't have to go deep on that to be able to connect your agent, because many tools, whether it's Claude Desktop or the Cursor Marketplace, are making connections progressively easier to set up out of the box. And of course with OpenClaw, there are many connections already enabled as well. One thing that I want to say about connections is that I encourage you to start, as much as possible, with read-only access before you let your agents write back into systems. Let the agents only read your calendar or only read your inbox. Don't let them send emails, add calendar events, and so on.
Write access should be added after you have watched the agent behave for a few weeks and you have enough trust. And it doesn't matter if it's OpenClaw or one of the commercial tools. The reason I'm saying that is that the risk scales with the capability. The more your agents can do in real systems, the more you need to think about permissions and security. And this is real. We're already seeing incidents on a daily basis. It's not just data leaks in the traditional sense. You can imagine an agent that has access to your company Slack and a very loose set of permissions. Someone on your team starts chatting with it, and now the agent is happily sharing your private notes, your opinions about colleagues, your draft feedback. It's not a hypothetical risk. Incidents like that are already happening. And agents that gossip, while very funny, also pose a very big risk to employee privacy. So use least-privilege connections, talk to your IT team if you are connecting any work systems, and don't be the one creating the cautionary tale for others in your company. Specifically for your chief of staff, at the very minimum, give it read access to the calendar and inbox. Even better, you can give it read and write access to a personal task list. If you want to be extra good, you can give it permission to post a draft on Slack or DM it to yourself for approval before sending.
Moving on to the next layer. The worst thing that happens with your Agent OS isn't that it fails. It's that it works confidently and wrongly, and you ship the output before you notice. Verification is knowing what to check, and every agent job has its own very quick test. If you draft emails, you need to check that the tone matches, that the facts are correct, and so on. If you're doing data analysis, you have to check the numbers. So it's very specific to the different tasks or skills that you give the agents.
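To make the idea of a quick, task-specific test concrete, here is a small sketch of three checks for a drafted email. The specific checks (no leftover placeholders, every number traceable to the source notes, a length cap), the function names, and the sample text are all assumptions chosen for illustration. They catch mechanical slips only and do not replace reading the draft yourself.

```python
import re


def numbers_in(text: str) -> set[str]:
    """Collect every integer or decimal that appears in the text."""
    return set(re.findall(r"\d+(?:\.\d+)?", text))


def check_draft(draft: str, source_notes: str, max_words: int = 200) -> dict[str, bool]:
    """Run three quick checks on a draft; True means the check passed."""
    return {
        # Bracketed gaps or TODO markers usually mean the draft is unfinished.
        "no_placeholders": not re.search(r"\[[^\]]*\]|TODO", draft),
        # Every number in the draft should also appear in the source notes.
        "numbers_match_source": numbers_in(draft) <= numbers_in(source_notes),
        "within_length": len(draft.split()) <= max_words,
    }


# Hypothetical source notes and a draft with two deliberate problems.
notes = "Revenue was 4.2 million, up 12 percent. Launch moved to week 38."
draft = "Hi Dana, revenue came in at 4.2 million, up 15 percent. [add sign-off]"

results = check_draft(draft, notes)
failed = [name for name, ok in results.items() if not ok]
print(failed)
```

Here the draft fails two of the three checks: it still contains a bracketed placeholder, and it cites 15 percent where the notes say 12.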
I encourage you to do at least three to five checks. In many cases they take under a minute to run, and they will save you a lot of grief afterwards. They do get faster with practice. In the first week it might be slow and a little bit frustrating, but as time goes by and you come to trust your agent on low-stakes tasks, you can verify just the high-stakes ones.
Verification is not just about the individual output. You also need to improve the system over time. So periodically I want you to do a retrospective with your agents, audit the system, and figure out which parts are underserving you. Maybe there are skills that are never being called, or context files that have become stale. Maybe some agents need updated instructions. Similarly to how you would periodically reevaluate your employees or your company, you will do that with your agentic systems. You can do that either indirectly, by auditing each and every agent that you build on top of the system, or directly, by auditing the OS layers themselves. The great thing about the tools is that they let you do that directly from the tool. You can literally ask them what is not being used, and so on. The reason I talk about this audit discipline in the Agent OS program is that without it, your OS has a shelf life of maybe eight weeks before everything goes stale. With it, your OS keeps compounding.
All right, top of the stack, and a great addition though not a mandatory one, is automations. Those are things that the agents can run when you're not watching, whether it's a daily summary every morning at 7am, a monitoring task that pings Slack, or anything else that you can think of. They are very powerful. And of course with OpenClaw we also have heartbeats and cron jobs, for those who are involved with that. However, this is the layer that creates a lot of risk if you're not careful.
Because an agent that is running at 3am with the wrong answer can do damage before you wake up. So, a few rules with regard to automations. First, only automate workflows that you have run manually enough times to trust. Second, start with automations that produce drafts for you to review, not outputs that go directly to other people. And lastly, always add logs. You need to know what ran and what it did as it was running.
And finally, here's where the whole thing clicks. Once you build your OS, agents become cheap. Your first agent is hard, because you're building the Agent OS and the agent itself, probably at the same time. Your chief of staff maybe took you a weekend. But the second agent that is built on top of this system, maybe a research agent or a board prep agent, takes you an afternoon, because it inherits everything that is relevant. It already knows you, it knows your context, it knows your voice, and you're only adding a job description and a few specific skills. From there, your third, your fifth, and so on each become faster and faster. This is the compounding return, and it's why I'm so bullish on the agentic operating system. It matters more than the agents or the tools themselves.
So this is my system. I mentioned Chloe, my chief of staff. She was one of the first that I built. After that I also built specialist agents for content, for technical building, for platform work, and so on. They all share state through a central hub, they all share the same agentic operating system, and my chief of staff sees the specialists and what they're doing using the shared hub. That's for a different episode, but I'm just showing that I practice what I preach. And this is exactly the sequence that our free Agent OS program walks you through. Ten build projects. Any agentic tool that you want. Bring your own. It's totally free, and the link will be shared in the show notes and on the AIDB training website.
And lastly, the tools will keep changing and they will keep converging. The next one will launch before you finish learning the current one, and that doesn't matter. The part that does matter is that your OS travels with you. Every tool swap, every new agent, and every new capability that drops next month all lands on the same foundation. The people who build that foundation now will have it compound from here on out, and everyone else will keep starting over with new tools. If your question is how do I roll this out across the entire organization, that's what we do with Enterprise Claw. But today was personal. So start with the Agent OS program, and that's that. See you in the build log.
