Transcript
A (0:00)
Today on the AI Daily Brief, how the pros are vibe coding in 2026, and before that in the headlines, the last word on AI from Davos. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. All right friends, quick announcements before we dive in. First of all, thank you to today's sponsors, Zencoder, Robots and Pencils, and Superintelligent. To get an ad-free version of the show, go to patreon.com/aidailybrief, or you can subscribe on Apple Podcasts. And to learn anything else about the show, including how to sponsor it, how to try to come get me to jabber at you in person, or any of our various initiatives like the forthcoming AIDB Intel, you can find all of that information at aidailybrief.ai. Now, one more thing before we dive in. If you live anywhere from basically Texas to Maine, you are either in the midst of, or have just gotten out of, one of the wildest winter storms we've had in some time. Where I am, not only has school been canceled for Monday already, but we are actually dealing with a complete 36-hour travel ban, with up to two feet of snow anticipated. I am not counting on the power still being on, and so for the sake of you guys not having to miss an episode, and me not being stressed out by not being able to produce one, I'm actually recording this one on Saturday before it all hits. Still, there's a pretty good chance that with the chatter this weekend, this episode would have been Monday's main anyway. But that's the story. Without any further ado, let's dive in. Welcome back to the AI Daily Brief Headlines edition, all the daily AI news you need in around five minutes. Given that we are recording this one a little bit early, our main topic is actually a bit of a catch-up on last week. The World Economic Forum, of course, happened in Davos. All throughout last week we covered a couple of the big conversations, the AGI timeline conversation from Demis Hassabis and Dario Amodei, among other things.
But overall, what was the vibe there? I will say, before I get into that, that I sometimes don't even want to cover this type of news, because I think that, more or less, for those of you who are just trying to understand what AI is going to mean for you, how it's going to impact your career, your company, your job, you could ignore basically everything that happens in the types of conversations that go on at a place like Davos, ignore all the conversation around markets and infrastructure buildouts and bubbles, and you'd basically be better off taking all of the time that you would spend thinking about what people were jabbering about and instead using it to just go figure out how to build with these tools. Yet, of course, we live in the world that we live in, and like it or not, the conversations that happen in Davos are a useful reflection of what global leaders think about this moment, and so give us insight into the context in which this industry and this technology are going to operate. One side of the conversation was the voices coming from the tech industry. Reuters summed up that voice with the headline "Jobs, jobs, jobs: the AI mantra in Davos as fears take a back seat." Now, that is a specific reference to Nvidia's Jensen Huang, who basically made the argument that the amount of demand for chips, the infrastructure layer that needs to be built, and the energy infrastructure that needs to be built to service it all add up to a big moment of job creation. And indeed, I think it is the case that, fairly uniquely relative to other moments of creative destruction, even the transitional moment has the potential for a lot of creation as well. I think Jensen is right to identify that there is a lot more skilled labor outside of knowledge work that needs to be developed for this transition. In other places, tech leaders talked about the productivity benefits that they were seeing.
Cisco talked about projects that had been too tedious to even contemplate before that could now be done in a couple of weeks. IBM's chief commercial officer Rob Thomas said that AI was at the ROI stage. He told Reuters, "You can truly start to automate tasks and business processes." TechCrunch said that even though we anticipated AI being a big topic of conversation, the extent to which it shaped the event, with even the physical surroundings being dominated by tech companies and pavilions, was notable. And yet, of course, if the technology folks were excited, concerns about AI-related job displacement were on the agenda as well. Christy Hoffman, the general secretary of the 20-million-member-strong UNI Global Union, said AI is being sold as a productivity tool, which often means doing more with fewer workers. International Monetary Fund Managing Director Kristalina Georgieva called AI a tsunami hitting the labor market, with the potential to transform or eliminate 60% of jobs in advanced economies and 40% globally. Now, I remember a study from a couple of years ago from one of the big global institutions, the IMF or the World Bank or one of them, that basically had those numbers, so I assume that's what she's talking about. Providing some bright spot, she thought that as high-skilled workers see their wages rise because of AI, they would likely consume more in ways that benefit the local service economy. She said, "One in 10 jobs is already enhanced by AI, and the people in these jobs are paid better. When they're paid better, they spend more money in the local economy; they spend more money in restaurants here or there. Demand for low-skilled jobs goes up, and actually total employment seems to slightly increase because of it."
Now, for those who might be skeptical of this, or to whom it feels relatively Pollyannish, there have been studies showing that, for example, in San Francisco, for each new local tech job, 4.4 jobs in positions like retail clerk, cook, teacher, and dentist are also created. At the same time, the IMF still has some big concerns. The two that stood out are stagnating middle-class wages, especially for jobs that are not enhanced by AI, and increasing barriers to youth employment as AI takes over entry-level tasks. Now, behind the scenes in Davos, there was also a lot of jockeying for position. The Information wrote a piece all about how some Davos meetings were part of what seems to be a larger strategy for OpenAI to get more aggressive about its enterprise recruitment. Now, this effort was not strictly restricted to Davos. In fact, last week in San Francisco, Sam Altman hosted an extended business dinner with Disney CEO Bob Iger and other corporate execs. The Information writes that the gathering was intended to preview a new OpenAI offering aimed at large companies, but they could not determine what that offering was. All of that was happening while OpenAI COO Brad Lightcap and new Chief Revenue Officer Denise Dresser were schmoozing over in Davos. Clearly the company is trying to message that they are, in fact, not behind when it comes to enterprise. In a Davos session, OpenAI CFO Sarah Friar said that by the end of the year, approximately 50% of their business will come from enterprise customers, and Sam Altman tweeted that they had added more than a billion dollars in ARR over the last month just from their API business. Very clearly trying to shift the narrative, he says, "People mostly think of us as ChatGPT, but the API team is doing amazing work." So what does this all add up to? It's kind of hard to tell.
Part of the reason that we may not have quite as strong a sense of what the general sentiment around AI was is just that there were, of course, other more geopolitical conversations that made even the AI conversation take a back seat. I think, if anything, Jamie Dimon's crisp realism, that no one can put their head in the sand, that AI is a force that is not likely to be stopped, but that there could be challenges around how fast it's going to cause change in society that we may have to address, might be a fairly good representation of the median. Mostly, it's kind of notable to me just how little the momentum cares. Going back to my initial point, if you're mostly interested in AI when it comes to how it's going to impact your life, let's just say you can safely switch from this headline section to what might be a much more pertinent main episode. If you're using AI to code, ask yourself: are you building software, or are you just playing prompt roulette? We know that unstructured prompting works at first, but eventually it leads to AI slop and technical debt. Enter Zenflow. Zenflow takes you from vibe coding to AI-first engineering. It's the first AI orchestration layer that brings discipline to the chaos. It transforms freeform prompting into spec-driven workflows and multi-agent verification, where agents actually cross-check each other to prevent drift. You can even command a fleet of parallel agents to implement features and fix bugs simultaneously. We've seen teams accelerate delivery 2x to 10x. Stop gambling with prompts. Start orchestrating your AI. Turn raw speed into reliable, production-grade output with Zenflow, free. Most companies don't struggle with ideas; they struggle with turning them into real AI systems that deliver value. Robots and Pencils is a company built to close that gap. They design and deliver intelligent, cloud-native systems powered by generative and agentic AI with focus, speed, and clear outcomes.
Robots and Pencils works in small, high-impact pods: engineers, strategists, designers, and applied AI specialists working together to move from idea to production without unnecessary friction. Powered by RoboWorks, their agentic acceleration platform, teams deliver meaningful results, including initial launches in as little as 45 days, depending on scope. If your organization is ready to move faster, reduce complexity, and turn AI ambition into real results, Robots and Pencils is built for that moment. Start the conversation at robotsandpencils.com/aidailybrief. That's robotsandpencils.com/aidailybrief. Robots and Pencils: impact at velocity. Today's episode is brought to you by my company, Superintelligent. In 2026, one of the key themes in enterprise AI, if not the key theme, is going to be how good the infrastructure is into which you are putting AI and agents. Superintelligent's agent readiness audits are specifically designed to help you figure out, one, where and how AI and agents can maximize business impact for you, and two, what you need to do to set up your organization to be best able to leverage those new gains. If you want to truly take advantage of how AI and agents can not only enhance productivity but actually fundamentally change outcomes in measurable ways in your business this year, go to besuper.ai. Welcome back to the AI Daily Brief. Today we are doing a little bit of a catch-up on the terms that you might have heard in passing, especially if you've been anywhere near AI Twitter/X over the past couple of weeks. There are a few things that might sound like absolute Greek to you, but which combined tell the story of how vibe coding, by which I really mean AI and agentic coding, is evolving early into this year. Entrepreneur and content creator Riley Brown recently tweeted: "Cool Claude stuff: Remotion Skill, Clawdbot (C-L-A-W-D), Agent SDK, Ralph, and Cowork."
Now, if you are thinking, "I don't know what any of those things mean," don't worry, you are not alone, and we're going to get into much of it today. The context of all of this is the big shift in perception over the last couple of weeks, which has been pretty well chronicled in episodes throughout this month. It wasn't that we got a new model or anything like that. It's that everyone went home for the holidays, had just a little bit of downtime to start playing around, started working on some personal or professional projects with Opus 4.5 or Claude Code or Codex 5.2 or some combination thereof, and realized that what we could do with agentic coding went much, much farther than they might have thought. This was reinforced a couple weeks later when Anthropic dropped Claude Cowork, which is sort of like Claude Code for the rest of us, and revealed that it had been written 100% by Claude Code in just about 10 days. Now, if you want even more of a primer, I'd suggest one of my previous episodes: "Why Everybody Is Obsessed with Claude Code," "Claude Cowork Is Claude Code for Everybody Else," or, most recently and probably most importantly, "Why Code AGI Is Functional AGI, and It's Here." So that's the setup, and we just keep getting evidence of how much things have shifted. Cursor CEO Michael Truell posted about a week and a half ago: "We built a browser with GPT-5.2 in Cursor. It ran uninterrupted for one week. It's 3 million plus lines of code across thousands of files. The rendering engine is from scratch in Rust, with HTML parsing, CSS cascade, layout, text shaping, paint, and a custom JS VM. It kind of works. It still has issues and is of course very far from WebKit and Chromium parity. But we were astonished that simple websites render quickly and largely correctly." And to be clear, this was an experiment in autonomy. While at first blush people thought it was one agent writing 3 million lines of code, it wasn't. It was actually hundreds of concurrent agents.
Cursor wrote it up in a blog post called "Scaling Long-Running Autonomous Coding," and it's very clear that Cursor is interested in pushing this frontier. They wrote: "We've been experimenting with running coding agents autonomously for weeks. Our goal is to understand how far we can push the frontier of agentic coding for projects that typically take human teams months to complete." And indeed, if you want to take a step back and just try to understand psychologically where the vanguard of AI and agentic coders is right now, it is really all about pushing the boundaries on autonomy, breaking out, in other words, of being the bottleneck, where without your consistent prompting, the AI isn't doing anything. The leading agentic coders are in the midst of trying to build systems that work all the time with extremely minimal input from them. They want nothing less than armies of agents that work while they sleep. And that army idea is operative in that same Cursor blog. They write: "Today's agents work well for focused tasks, but are slow for complex projects. The natural next step is to run multiple agents in parallel, but figuring out how to coordinate them is challenging." Initially, Cursor gave their coding agents equal status and, as they put it, let them self-coordinate through a shared file. Each agent would check what others were doing, claim a task, and update its status. Ultimately, however, this failed. The locking mechanism they implemented to prevent two agents from grabbing the same task ended up becoming a bottleneck. As they put it, 20 agents would slow down to the effective throughput of two or three, with most time spent waiting. They tried a second strategy where agents could read state freely, but writes would fail if the state had changed since they last read it; in other words, two agents couldn't make conflicting updates to the same code at the same time. However, Cursor wrote, this didn't work either.
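That second strategy is essentially optimistic concurrency control, and it can be sketched in a few lines of shell. This is purely illustrative, not Cursor's implementation, and the file names (`state`, `version`) and single-process setup are made up for the demo; a real system would need an atomic check-and-write, which this toy does not have. The idea: each reader remembers a version number, and a write is rejected if the version has moved since the read.

```shell
#!/bin/sh
# Demo files for the sketch (illustrative names).
echo 1 > version
echo "initial" > state

# Reading is always allowed; remember the version we saw.
read_state() {
  READ_VERSION=$(cat version)
  cat state
}

# Writing fails if the version changed since our last read.
write_state() {  # $1 = new content
  current=$(cat version)
  if [ "$current" != "$READ_VERSION" ]; then
    echo "write rejected: state changed since read" >&2
    return 1
  fi
  echo "$1" > state
  echo $((current + 1)) > version
}

read_state > /dev/null
write_state "agent A's update" && echo "A's write accepted"

# Simulate agent B holding a stale read from before A's write.
READ_VERSION=1
write_state "agent B's update" || echo "B's write rejected"
```

The failure mode Cursor describes follows directly: when conflicts are common, agents spend their time having writes rejected and redoing work rather than making progress.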
As they put it: "With no hierarchy, agents became risk-averse. They avoided difficult tasks and made small, safe changes instead. No agent took responsibility for hard problems or end-to-end implementation. This led to work churning for long periods of time without progress." The next approach they took was to separate roles. Instead of a flat structure, they created a pipeline where a subset of agents, called planners, would continuously explore the codebase and create tasks, and workers would pick up those tasks and focus entirely on completing them. The workers, they wrote, don't coordinate with other workers or worry about the big picture. They just grind on their assigned task until it's done, then push their changes. At the end of each cycle, a judge agent determined whether to continue; then the next iteration would start fresh. This, they said, solved most of their coordination problems and let them scale to very large projects without any single agent getting tunnel vision. Now, this is the point at which they instituted the ambitious goal of building a web browser from scratch. And as we heard at the beginning, this worked, but not without a lot of challenges. They write: "Our current system works, but we're nowhere near optimal. Planners should wake up when their tasks complete to plan the next step. Agents occasionally run for far too long. We still need periodic fresh starts to combat drift and tunnel vision. But the core question, can we scale autonomous coding by throwing more agents at a problem, has a more optimistic answer than we expected. Hundreds of agents can work together on a single codebase for weeks, making real progress on ambitious projects." Now, one of the things that struck me as interesting when I was reading this was the way that they describe their planner and worker system.
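The planner/worker/judge split can itself be sketched as a toy in shell. Again, this is illustrative only, not Cursor's code, and all the names (`queue/`, `claimed/`, `results.txt`) are invented for the demo: a planner drops small task files into a queue directory, workers claim tasks with an atomic `mv` so no two workers can grab the same one, and a judge decides whether another cycle is needed.

```shell
#!/bin/sh
# Toy planner/worker/judge pipeline (illustrative names throughout).
mkdir -p queue claimed
: > results.txt

# Planner: break the project into small, discrete task files.
i=1
for task in "parse HTML" "apply CSS cascade" "lay out text"; do
  printf '%s\n' "$task" > "queue/task_$i"
  i=$((i + 1))
done

# Workers: claim and grind. A real system runs many of these in
# parallel; one sequential loop stands in for the pool here.
for f in queue/task_*; do
  if mv "$f" claimed/ 2>/dev/null; then   # atomic claim
    task=$(cat "claimed/${f##*/}")
    echo "worker finished: $task" >> results.txt
  fi
done

# Judge: decide whether to run another cycle or stop.
remaining=$(ls queue | wc -l)
if [ "$remaining" -eq 0 ]; then
  echo "judge: queue empty, stopping"
else
  echo "judge: $remaining tasks remain, run another cycle"
fi
```

The point of the pattern is that workers never negotiate with each other: coordination lives entirely in the queue, which is why it avoids both the lock-contention and the risk-aversion failure modes described above.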
Swyx shared this section of the blog post and nailed it when he wrote that Cursor independently invented the Ralph Wiggum loop to solve the problems they were seeing with parallel agent orchestration. So this gets us to Ralph Wiggum, one of the weirder of these names, even if the concept itself isn't overly complicated. The concept was coined by developer Geoffrey Huntley, actually all the way back last July, in a blog post called "Ralph Wiggum as a Software Engineer." As he put it in that introductory post, in its purest form, Ralph is a bash loop. So you might ask, what the heck is a bash loop? First of all, bash is short for Bourne-again shell, which is a command-line interpreter. That basically means it's the program sitting between a person and the operating system when they're working through a terminal. It reads the commands you type, it understands scripts, it runs programs, and it handles things like variables and loops. A bash loop, then, is a way to tell the bash shell: do this thing over and over until I say stop, or until a condition is met. It's a way to automate repetitive command-line tasks instead of copy-pasting commands. Simplifying it even more, it's a written instruction that tells the computer to repeat the same task over and over automatically. So let's use some analogies that aren't about coding. Imagine you leave a sticky note for an assistant that says: for each folder on my desk, open it, check what's inside, then move on to the next one. You didn't list every folder. You didn't do the work yourself. You just described the pattern once. That's an example of this type of loop. Another analogy would be a checklist with a rule. Instead of a bullet list that says rename file A, rename file B, rename file C, you say: rename every file in this folder the same way. The key idea is that this type of loop tells the computer what to repeat and when to stop.
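To make that concrete, here are the two analogies written as actual shell loops. The file names are just examples:

```shell
#!/bin/sh
# The "checklist with a rule" analogy: rename every .txt file in the
# current directory the same way, instead of listing each by hand.
for f in *.txt; do
  [ -e "$f" ] || continue        # nothing to do if no .txt files exist
  mv "$f" "renamed_$f"
done

# And "do this over and over until a condition is met": repeat a
# task, stopping when the counter reaches 3.
count=0
while [ "$count" -lt 3 ]; do
  echo "iteration $count"
  count=$((count + 1))
done
```

The first loop describes the pattern once and applies it to whatever files happen to exist; the second keeps going until its stop condition is true, which is exactly the shape Ralph exploits.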
So the idea of Ralph as applied to AI coding was described by developer Ryan Carson in a post on X. He writes: "Everyone is raving about Ralph. What is it? Ralph is an autonomous AI coding loop that ships features while you sleep. Each iteration is a fresh context window. Memory persists via git history and text files." Now, he gets into exactly what this loop looks like from a technical perspective, but the Startup Ideas podcast with Greg Isenberg had Ryan on to explain it even more simply, and here's how they summed it up. Step one: write a detailed PRD. That's a product requirements document, which is a document that defines the purpose, features, functionality, and behavior of any new project or feature. It's going to define why the product is being built, what success looks like, detailed requirements of what it should do, things like that. Step two: after you write that detailed PRD, you convert it to extremely small, discrete ("atomic," to use their words) user stories. Step three: for each of those atomic units, you add clear acceptance criteria. Step four: loop your AI agent through each story. Step five: it logs learnings so it doesn't repeat mistakes. Step six: the person who initiated the Ralph loop wakes up, tests it, and fixes the edge cases. Basically, the idea is to break down a complex project into very discrete, smaller units that the coding agent can take on one by one, testing and looping until it's finished with one and moving on to the next. Now, people are still experimenting with this and figuring out the limits of the methodology, but the excitement on the other side is captured once again by Ryan in a post called "How to Grow Your Startup While You Sleep." And that really is the thing that people are so excited about: the idea of shifting to a paradigm where we've got agents just working for us in the background, meaningfully advancing the goals that we have.
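The steps above can be sketched as a minimal Ralph-style loop. Everything here is illustrative: `stories.txt`, `progress.txt`, and `AGENT_CMD` are made-up names, and a harmless `echo` stub stands in for a real coding agent CLI, which you would substitute with your actual agent command:

```shell
#!/bin/sh
# Minimal Ralph-style loop (illustrative sketch, not a real product).
# AGENT_CMD defaults to an echo stub; point it at your agent's CLI.
AGENT_CMD="${AGENT_CMD:-echo agent-run:}"

# Demo input: one atomic user story per line, with its acceptance
# criteria baked into the text.
[ -f stories.txt ] || printf '%s\n' \
  "add /health endpoint; passes: GET returns 200" \
  "log requests; passes: each request appends one log line" \
  > stories.txt

while read -r story; do
  # Each iteration is a fresh agent invocation, i.e. a fresh context
  # window; memory persists only through files and git history.
  $AGENT_CMD "implement, test, and commit: $story" >> ralph.log 2>&1
  # Log progress/learnings so later iterations don't repeat work.
  echo "done: $story" >> progress.txt
done < stories.txt
```

Step six, the human waking up to test and fix edge cases, is the part that deliberately stays outside the loop.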
And yet, over the last week, and especially the weekend, the discussion has shifted from Ralph to something called Clawdbot, with a corresponding interest, believe it or not, in Mac Minis. Viral memes abound, like this one from Flavio mom: "How did we get so rich? Your father bought a Mac Mini to run Clawdbot in 2026." So what the hell is Clawdbot? If you want to follow along at home, you can find this at clawd.bot, which describes Clawdbot as "the AI that actually does things. Clears your inbox, sends emails, manages your calendar, checks you in for flights. All from WhatsApp, Telegram, or any chat app you already use." It's basically a system that allows people to turn Claude Code into an actual personal assistant. A post on Starryhope.com reads: "At its core, Clawdbot is an open source AI agent that runs on your own hardware. Unlike ChatGPT or Claude's web interfaces, which process everything on remote servers, Clawdbot operates locally, with a gateway that connects AI models to the apps and services you already use. It can talk to you through WhatsApp, Telegram, Discord, Slack, Signal, and even iMessage. But the real magic is what it can do once it's running. Given the right permissions, Clawdbot can browse the web, execute terminal commands, write and run scripts, manage your email, check your calendar, and interact with any software on your machine." Perhaps the most compelling feature is that Clawdbot is self-improving. Tell it you want a new capability, and it can often write its own skill or plugin to make it happen. One user wanted access to university course assignments. He asked Clawdbot to build a skill for it. Clawdbot did, and then started using it on its own. Now, some are a little skeptical. Former Nvidia engineer Bojan Tunguz said he's as excited as the next guy about the possibilities of Clawdbot running on a cluster of small local minicomputers, but 99% of all use cases he's seen so far concern corporate BS jobs and tasks.
Summarizing email, posting on Slack, adding meetings to a calendar that shouldn't exist at all. This is not what has people excited, though. Nat Eliason responded, saying, "Yeah, those uses are a waste of its potential, in my opinion." Nat would know, because he went viral when he posted a picture of a Mac Mini about a week ago and said, "Hired my first employee today." He followed up, writing: "Yeah, this was 1000% worth it. Separate Claude (that's the C-L-A-U-D-E version) plus Clawd (the C-L-A-W-D one) managing Claude Code and Codex sessions I can kick off anywhere. Autonomously running tests on my app and capturing errors through a Sentry webhook, then resolving them and opening PRs." Basically, Nat has this set up to be working round the clock on a new agent that he's building to automate agency-level content creation. On Saturday morning, Nat posted: "Nothing like waking up to a report from Clawdbot about everything that went wrong in my app yesterday and what it already did to fix it." A couple hours later, Nat was still going. He wrote: "Built a customer success and support workflow for Clawdbot now too. Analyzes transcripts from the day, emails customers with bad experiences, apologizing and asking for any other feedback. Adds their feedback to the daily report for our next morning brainstorm." Basically, he's got a digital employee that lives in a Mac Mini, uses Claude Code, Opus 4.5, and Codex 5.2, and which he communicates with via Telegram. This is the type of capability that has people so excited right now. There were so many people, in fact, talking about putting Clawdbot on Mac Minis that the project actually tweeted a PSA: "You do not need to buy a Mac Mini to run Clawdbot. That dusty laptop in your closet works. Your gaming PC you feel guilty about works. A $5-a-month VPS works. A Raspberry Pi held together with hope probably works." Entrepreneur and investor Dave Morin wrote, at this point I don't even know what to call Clawdbot after a few weeks in with it.
This is the first time I felt like I am living in the future since the launch of ChatGPT. Now, if all of this has your head spinning and it just seems technically inaccessible, you're not alone. Jasmine Sun actually wrote a post called "Claude Code Psychosis" that talks about some of the ways that Claude Code is still inaccessible for people. It's a nice counterweight, because you can sometimes feel insane for being intimidated by something like the command line. I think the accessibility of these programs is going to change really, really quickly, though. Not only do you have Anthropic themselves releasing Claude Cowork, which, while not there yet, is meant to be a new type of interface for non-coding Claude Code tasks; there are also other tools like Conductor that are replacing the terminal interface with a GUI. Nat Eliason accidentally caused some controversy on Dan Shipper and Every's Vibe Code Camp when he said the CLI is the Stone Age from two months ago; GUIs are back. He followed it up and said, "I did not realize how controversial this would be. If you're still using Claude and Codex in the terminal, you're missing out. You should absolutely be in Conductor." Other people agree. Notion's Brian Lovin said that on an average day he's spending 5% of his time in Figma, 15% in Cursor and Claude Code, 20% in Ghostty, and 60% in Conductor. Lenny Rachitsky asked his followers what the most underhyped AI tools were, and Conductor came in second behind only Wispr Flow, which is the one that I've mentioned here a bunch of times. Speaking of Vibe Code Camp, if you want to take everything I've talked about here and really start to go deep, like I said, Dan Shipper and the team at Every recently did an eight-hour livestream with tons of really great vibe coders talking about all the different things that they do. I'll include a link to the livestream, as well as a summary app that someone built with all the takeaways from all the different people. Summing up really quickly:
If you want to know, in a very short statement, how things are shifting this year and how the most successful vibe coders are trying to evolve: it's all about extending and expanding the autonomy of the agents that are doing the coding. It's about removing themselves as a bottleneck and seeing how much can happen in the background while they're doing other work, or even while they're sleeping. Anyways, hopefully now some of these terms don't seem quite so crazy and inaccessible. I'm sure we'll continue to come back to them. For now, that is going to do it for today's AI Daily Brief. Appreciate you listening or watching as always, and until next time, peace.
