Transcript
A (0:00)
Today we are casually talking through 50 AI predictions for 2026. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. Alright friends, quick announcements before we dive in. First of all, thank you to today's sponsors, KPMG, Blitzy, Superintelligent, and Robots and Pencils. To get an ad-free version of the show, go to patreon.com/aidailybrief or you can subscribe on Apple Podcasts. To learn about sponsoring the show, visit aidailybrief.ai or send us a note at sponsors@aidailybrief.ai. And lastly, if you would like to learn more about our recently released AI ROI Benchmarking Survey or our forthcoming AIDB Intelligence Service, which includes original research, information, and benchmarks, check it out at aidbintel.com.
All right friends, the time has come to shift from looking backward to looking forward, and I'm thrilled to spend the next two days looking at AI predictions for 2026. Now, originally I had intended this to be a single episode, but when I got to an hour and 47 minutes of raw recording, it was quite clear that two episodes were on the docket. For the visuals, I dumped my outline into both Genspark and Manus to help produce this, and rather than picking one or the other, I decided I'm just going to go back and forth between them: (a) so you can get a feel for how these various tools perform, but (b) to keep it a little bit more visually interesting, as this is a particularly talky type of episode.
I've organized the predictions into about seven categories: Models and Capabilities; Vibe Coding; Enterprises plus Vibe Coding; Enterprise Trends not including Vibe Coding; Competition; Market; and Politics. Now, number three, Enterprises and Vibe Coding, probably could have just been in one or the other, but they were distinct enough that I decided to keep them independent.
Let's kick off with models and capabilities. Broadly speaking, I think that we are going to stay roughly on the METR line.
Now this is obviously a Genspark made-up chart, and the METR line I'm talking about is this one. This is the chart that measures the length of a task, in human hours, that different models can complete at 50% and 80% success rates. This line has been fairly consistent for some time now. For a while we saw capabilities doubling every seven months, and more recently it's sped up to closer to four and a half months. You can see here the difference between the seven-month line and the four-month line on both the 50% and the 80% reliability thresholds. Now, it is at least theoretically possible that we see recursively self-improving AI. But I think it's far more likely that the new Nvidia architecture, which is coming online in the form of Blackwell chips and then eventually Rubin chips, keeps us on something like this trajectory, even as we max out capabilities and move them beyond human capacity in a lot of different areas.
Next up, I think we are going to get a lot more models a lot more frequently. GPT-5, more than anything, showed that there is just a ton of risk in building up big expectations around a single model release. Now yes, of course, Gemini 3 was kind of the opposite, but the hit to OpenAI, and more broadly the entire AI field, that GPT-5 wrought probably could have been avoided by a different approach to release schedules. Of course, to be fair to OpenAI, they had released models in between. We had o3 and o4-mini, but they obviously had built a lot of expectations around their big 5.0 model. Subsequent to that we have gotten 5.1, then 5.1 Codex, then 5.2, then 5.2 Codex, all in very short order from one another. Anthropic, of course, was kind of already on this tip, not only releasing more sub-variations but also splitting the releases of their Haiku, Sonnet, and Opus versions in a way that took some pressure off of any one release. Now, for all of us users, this is gonna be a little bit of a double-edged sword.
On the one hand, we are pretty constantly gonna have new toys to play with, but on the other, there is going to be a never-ending slate of new things to test and try, to figure out if they actually improve upon the existing models for your particular use cases.
What's more, I think, especially when it comes to writing-type tasks or just generally being smart, research, etc., model upgrades are going to be increasingly vibe-based. This is of course due to the fact that all of the premier models are really good. Right now, when I'm deciding between Gemini 3, Opus 4.5, and GPT-5.2 for some writing or research use case, it's largely going to be stylistic for me, use case by use case. Now, what this may lead to for most users is just picking the one whose vibes they generally like best and sticking with it, knowing that even if one of the other models gets ahead for a moment, there's probably a new release coming right around the corner that will get your preferred model back up to the state of the art.
That said, because there's so much saturation and similarity around a lot of those base writing and thinking types of tasks, I think there's going to be a lot more emphasis on multimodal competition. Already you're seeing that. Nano Banana Pro, which Manus used, of course, to create these images, as you can kind of tell, felt every bit as significant to Google's second half of the year as did Gemini 3. And obviously OpenAI did not wait very long to respond, even moving up the release of their Images 1.5 model. It is very clear that OpenAI is not ceding this, even if Google does look like the juggernaut in this particular area. It's worth noting that Grok also isn't ceding this, continuing to push both images and video. The only major lab that has very clearly taken themselves out of this particular race, or actually never entered it, is Anthropic.
Now, in addition to multimodal, I also predict that there will be a lot more emphasis on productization and the interface around models. Again, if you think that the models are pretty commensurate with one another and all kind of at the state of the art, then the choices you're going to make as a user of those models are going to shift to other areas, such as, for example, the user experience: how navigable they are, and how much they help you do what you need to do with them. I think the fact that OpenAI put a distinct user experience, even if a very limited one, around the images release is testament to that fact. And of course I'm talking about this even in the context of the foundation model labs, given that there are already so many of what used to pejoratively be called wrapper companies that have become extremely successful by focusing on specific interfaces for specific industries and use cases.
One particular interface that I think we're likely to see is what I'm calling a NotebookLM for agent building, by which I really mean a really simple studio type of interface for building agents. I fundamentally don't believe that the drag-and-drop automation-type builders that you see with products like Zapier and Lindy, as powerful as they are and as useful for power users as they are, are going to be an interface that takes building agents to the mainstream. Now, NotebookLM might not exactly be the right analogy. I just mean we're gonna have distinct experiences, I think, for building agents, some of which come from the major labs themselves. Google is a good bet to deliver this first, given that Google AI Studio is kind of already inching towards this in a number of ways.
Still in the models and capabilities section, I believe that the focus on coding that we saw throughout 2025 not only won't decrease, it will radically ratchet up. It is both a massive use case and a capability set that unlocks lots of other use cases.
And you better believe it is going to be very much on the minds of every single lab with every single model they release.
My next prediction is that we're going to learn in 2026 just how valuable it is to have last-mile end-user data that can help refine your models. swyx has framed one of the competitive battles, which I'll talk about in the competition section, as the agent labs versus the model labs. The agent labs, of course, are things like Cognition and Cursor, whereas the model labs are OpenAI, Anthropic, etc. At the very end of 2025 we started to see the agent labs moving into the model space, taking advantage of the fact that they have a set of data that the model labs don't necessarily have, because of how much of that end usage they have. Will that actually allow them to jump out ahead and become the next-generation model labs? I don't know that that'll be decided in 2026, but we're certainly going to have a lot more information about it.
Another prediction is around where I think the labs are going to focus: memory. It feels to me like just obviously the biggest opportunity in some ways. Already, the very nascent type of memory that we have in these LLMs at the end of 2025, as opposed to, for example, the end of 2024, has made a major difference. Likewise, already that limited memory is maybe the biggest barrier preventing people from model switching. I'm about as voracious a model switcher as they come, with the top level of every subscription across all of the major models. And yet, despite the fact that I try most use cases across most models, there are certain things where the memory that one of the models has about a particular area of business, or previous conversations I've had, just means it's too much of a pain to transfer from one to the other. Now, this is not a particularly difficult prediction. It's something, for example, that Sam Altman is already talking lots about.
But I do think it's going to be an increasingly important focus, especially if and as the other models start to catch up with ChatGPT and they're looking for better ways to lock users in.
One that you might have heard me talk about a little bit in my review of the a16z Big Ideas is my thoughts on world models. I think that this is going to continue to be an area that people are really excited about. I think we're going to see some new entrants to the market. Yann LeCun, for example, left Meta and is purportedly raising a half billion dollars at a big valuation to go pursue this opportunity. But I think that in 2026 specifically we are going to continue to get really cool demos and maybe some really early sandboxes, but I don't think that we're going to have a generalist usability type of moment yet. Right now, world models feel a little bit to me like the VR of the AI world, where it's not hard to understand how powerful they could be in theory, but because they represent some totally new capability set for experiences and are not just a one-to-one replacement for things we used to do, it's just going to take a lot more time to shift that type of behavior. Now, world models are valuable for more reasons than just the end user. Obviously many people think that they are a better path to AGI than the approaches we're currently taking. So in that way they're not like VR as some new consumer category. But I still think that when it comes to their maturity, I'd be surprised if we were all using some major world model by the end of 2026. I would of course be delighted to be wrong on this one.
Lastly, in the models and capabilities section, I think that in 2026 we're going to see the lines between assistants and agents get more blurry, not more clear. What I mean by that is that I think that the way that agents will start to make their way into the real world on a wider array of use cases is still going to be through individual users delegating more to them.
I think that users shifting and using agents to manage more complex tasks, like, for example, taking this outline and turning it into a 56-slide presentation, is going to be the way that agentic AI starts to proliferate, particularly in the enterprise. Now, this is not to say that we also won't see lots of progress on fully autonomous agents, but I think in practice it's more likely that 2026 is the year of agent managers than the year of full autonomy.
Sure, there's hype about AI, but KPMG is turning AI potential into business value. They've embedded AI and agents across their entire enterprise to boost efficiency, improve quality, and create better experiences for clients and employees. KPMG has done it themselves. Now they can help you do the same. Discover how their journey can accelerate yours at www.kpmg.us/agents. That's www.kpmg.us/agents.
This episode is brought to you by Blitzy, the enterprise autonomous software development platform with infinite code context. Blitzy uses thousands of specialized AI agents that think for hours to understand enterprise-scale code bases with millions of lines of code. Enterprise engineering leaders start every development sprint with the Blitzy platform, bringing in their development requirements. The Blitzy platform provides a plan, then generates and pre-compiles code for each task. Blitzy delivers 80% plus of the development work autonomously while providing a guide for the final 20% of human development work required to complete the sprint. Public companies are achieving a 5x engineering velocity increase when incorporating Blitzy as their pre-IDE development tool, pairing it with their coding copilot of choice to bring an AI-native SDLC into their org. Visit blitzy.com and press Get a Demo to learn how Blitzy transforms your SDLC from AI-assisted to AI-native.
Today's episode is brought to you by my company, Superintelligent.
Superintelligent is an AI planning platform, and right now, as we head into 2026, the big theme that we're seeing among the enterprises that we work with is a real determination to make 2026 a year of scaled AI deployments, not just more pilots and experiments. However, many of our partners are stuck on some AI plateau. It might be issues of governance, it might be issues of data readiness, it might be issues of process mapping. Whatever the case, we're launching a new type of assessment called Plateau Breaker that, as you can probably guess from the name, is about breaking through AI plateaus. We'll deploy voice agents to collect information and diagnose what the real bottlenecks are that are keeping you on that plateau. From there, we put together a blueprint and an action plan that helps you move right through that plateau into full-scale deployment and real ROI. If you're interested in learning more about Plateau Breaker, shoot us a note at contact@besuper.ai with "Plateau" in the subject line.
Small, nimble teams beat bloated consulting every time. Robots and Pencils partners with organizations on intelligent, cloud-native systems powered by AI. They uncover human needs, design AI solutions, and cut through complexity to deliver meaningful impact without the layers of bureaucracy. As an AWS-certified partner, Robots and Pencils combines the reach of a large firm with the focus of a trusted partner. With teams across the US, Canada, Europe, and Latin America, clients gain local expertise and global scale. As AI evolves, they ensure you keep pace with change, and that means faster results, measurable outcomes, and a partnership built to last. The right partner makes progress inevitable. Partner with Robots and Pencils at robotsandpencils.com/aidailybrief.
Next up, let's talk vibe coding. And by the way, I've decided, now that I've gone through a full section jumping back and forth between Manus and Genspark, that I just like the Genspark better in this case.
Manus did a great job as well, but it's got a little bit too much of that obvious Nano Banana Pro sheen for our purposes here.
So, next section: vibe coding, obviously one of the biggest themes of 2025. So how do I think it's going to change next year? First of all, I think we're going to see a big bifurcation. Right now we use the same words to describe two totally different things: vibe coding, or AI and agentic coding, within software engineering organizations, and vibe coding among non-developers. These are wildly different things, and I think that we'll stop treating them as the same thing.
Now, moving into what that's going to mean, I think that on the engineering side we came into 2025 with there still being a ton of resistance, especially among enterprise engineering departments, to AI and agentic coding. By the end of the year we've shifted all the way to the conversations being about how to best handle and organize different types of autonomy, how to manage the new challenges that AI and agentic coding create, and where and in what ways organizations think they need to ignore certain capabilities so their own capabilities don't atrophy. But all of it, I think, amounts to a big reorganization of engineering organizations to take advantage of AI-enabled coding. Now, this might seem obvious, and for those of you who are in startups or who live deep in the AI industry, this has probably just been happening continuously throughout the year. But I think you're going to see it start to jump into even traditional organizations that are really going to have to reevaluate how they're structured, how they deliver, how they deploy.
Next up, and one of the predictions that I feel most strongly about: vibe coding is going to move beyond prototypes into production mode in non-tech areas of the enterprise. That could be things like custom legal contract analyzers or onboarding apps for HR.
I think you're going to see a ton of vibe-coded experiences enter the marketing world, and of course these things may never touch the engineering organization. You might still have engineering departments that make sure these things don't introduce new security risks or are production-ready if they're public-facing. But I think we're going to see production-mode vibe coding enter all the non-tech areas of the enterprise this year.
On the consumer side, I think we are going to see a lot of bespoke personal software. Some people have called this ephemeral software. I don't think that the terminology is exactly figured out yet, but the idea here is basically people building themselves tools because it's easier to chat with Lovable or Replit or whatever they're using and get a thing that is exactly tailored to them than it is to go find and tweak some existing app experience. Or maybe that thing just doesn't exist. For example, right now I have a gift tracker that I was using to keep track of what we had got for our kids so we don't end up getting way too much, as always happens with me, which is an example of something that just doesn't exist right now. Or honestly, I didn't even really look to see if it did, because I knew exactly what I wanted and it was easier to just build it. And I also built myself a simple fitness tracker. Now, I've tried like every different fitness tracker, and it's not that they didn't have the features that I was looking for. I just wanted something very specific that made sense to my particular brain, and it was easy enough to just build for myself. Anyways, I think we're gonna start to see a lot more of this personal software start to happen this year. I've been vibe coding all year, and it's only just in the last month or so that I felt myself start to naturally ask, could I solve that with software?
I think probably a growing number of people will start to have a similar experience, and that'll lead us in some really interesting places.
One of the places I think that'll lead is we'll probably see a new class of AI app entrepreneur. Some number of these things that start off as people building for themselves, they'll probably figure out have kind of a market, and since they never needed to raise venture capital or anything like that, the economics of these things look totally different. Maybe, for example, you don't care about subscription costs and you think people would be happier paying 10 bucks one time than having to think about $2 a month in perpetuity. The other thing that makes this one interesting is, of course, ChatGPT becoming something of an app platform, although I don't think we have any idea yet exactly how that's going to play out and whether there will actually even be a way for independent and smaller developers to find their way into that flow, or if it's just going to be dominated by the major partners.
Another really hyper-specific prediction: I think it is going to be a very tough time for template-based website creation software. Once you have used English to manage your personal website, and when you want something changed you can just explain it, you are never going back to templates. Now, of course, Wix and Squarespace are both aware of this. Wix bought Base44 and is heavily investing in this area. So it's not a knock on the companies themselves, but I think the template mode of building personal websites is on its very, very last legs.
One more super specific one: I think Shopify potentially has a uniquely important role in the AI ecosystem. Shopify is already how so many people, small creators, small builders, people who don't consider themselves technical at all, interface with e-commerce and increasingly just interface with the entire spectrum of their online business. It's not just their store, it's also their website.
Shopify has been extremely attuned to the AI opportunity, and I think because they serve such a normie audience who is definitionally not necessarily tech-savvy, they have a really important role in transmitting and helping share the value that AI can bring not just to tech people, but to regular people who are just trying to run their businesses more effectively.
Speaking of businesses, let's move over to the enterprise world, starting with the section on enterprises and vibe coding. Overall, I think we are going to see a knowledge work vibe-ification, which basically means we're going to see what happened with software engineering this year go into all other areas of knowledge work next year. Simply put, we're going to start to make a shift from doing to managing. This entire presentation is a great example of that. Now, I think that this is a 5-to-10-year megatrend, and so I don't want to overstate how dramatically the shift will happen, but I think it will feel distinct even inside big, lumbering, boring old organizations.
I also think we are going to see new vibe-coding-specific roles. Basically, I think companies are going to start hiring people who have an overlap of some particular functional experience and are also good vibe coders. Think of them as internal forward-deployed vibers. Now, Lenny Rachitsky recently called this out, saying that he had seen some of this happening. So maybe I'm cheating by making this a prediction, but I definitely think that this is going to be a thing that more and more enterprises hire for in 2026. And the forward-deployed vibers will of course help all the different departments and functions figure out how to use coding in ways that they couldn't before.
Now, I talked about personal software, but will companies build their own version of personal software, basically replacement software for their big enterprise software deals? Klarna very famously a couple years ago scrapped Workday and Salesforce and shifted to their own.
And I've always been quite skeptical that that's something that companies are going to do en masse. So here's the nuance. I actually do think that in 2026 we are going to see companies build replacement software, but I don't think it's going to be massive companies ripping out Salesforce. I think this is going to impact small and medium-sized companies: the companies who, if you checked out the AI ROI benchmarking study, are operating a little bit more nimbly and already seeing more value from AI because they can take full advantage of it more quickly. I think you're going to see those types of companies, for whom those big, lumbering enterprise sales contracts were never necessarily a great fit, increasingly not only not work with the Salesforces of the world, but also have a pretty high bar for even the long-tail software providers, like, in the case of CRM, a HubSpot or something like that. I think more and more you are going to see people who don't have use for 70 or 80% of the features just build the 20% that they want, especially if it's internal-facing and it can be a little clunky and broken. Now, this won't be ubiquitous, and of course the SaaS providers are doing a lot to integrate AI features to make their products better. But I do think we are going to increasingly see companies build replacement software, particularly in areas like CRM.
Now, moving to enterprises more broadly: to the shock of no one, I think that there is going to be a huge ROI and benchmarking focus. Call 2026 the year of the dashboard. Now, it's not that I think that companies will stop doing AI if they can't get precise measures of ROI, but I do think that they're going to start trying to measure things in a much more distinct and discrete way. In fact, I think it's kind of going to be the wild west of measurement this year until we actually get some benchmarks under our belt. People are going to explore all sorts of different types of impact metrics and different ways of determining value.
But I would expect it to be way more quantitative than qualitative heading into 2027 than it is heading into 2026.
I also think that there is going to be a ton of focus on data and context engineering. I think investing in your AI and agent infrastructure is going to be sexy in the enterprise in 2026. Companies are going to realize that to really get full value, especially out of agents, they're just going to have to take the time and make the investment to have their data available to work for those agents. Now, they've known this for a while, but I think it'll really come to the fore and be something that people talk about and focus on, even to the exclusion of some random test agents, in the year to come.
Now, the next one is kind of an echo of what we talked about before with NotebookLM for agents, but I think that for enterprises to shift more of their behavior into the agent realm, in other words, out of the realm of assisted AI and automated workflows, it's going to take some serious interface improvements. Again, enterprises are not going to use Zapier-style builders, but I think that as we do get those new interfaces, a lot of opportunity will unlock.
In fact, I kind of think that we're going to start to see a bit of a squeeze on that workflow automation this year. One of the things that's happening right now is that a lot of enterprises, and this makes sense, are trying to use AI to map how their humans currently do things, to allow agents, or realistically automated workflows, to copy that human process. In many cases, there could be a ton of value there. However, I think that it is highly likely that the real destiny will be total process reinvention based on new agentic capability, not just an agent copying what a human did. Agents are not humans. They work in different ways. To get the full value out of them, in most cases we'll probably need to figure out, or allow them to figure out, the best way of accomplishing a goal without imposing an existing process on them. So I think you're going to start to see a squeeze on automation from both sides: just the assisted AI on the one hand, which is going to continue to be a huge part of personal productivity gains that can then translate up into the organization, and actual new agentic processes on the other side that start to redefine how a workflow can work.
Finally, in the enterprise, I think we are going to start to see the full impact of AI compounding. We're now at a point where the organizations that are leading are going to start to get farther and farther ahead, not just on their AI usage. I think that their AI usage will actually start to open up not just efficiency gains in what they do now, but new opportunities such as new product and revenue lines. As they do that, the distance between them and the AI laggards is going to do nothing but grow.
For now, that is going to do it for today's episode. Appreciate you listening or watching, as always. Until next time, peace.
