Transcript
A (0:00)
Today we are looking at 51 charts that tell the story of artificial intelligence heading into next year. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. Alright friends, quick announcements before we dive in. First of all, thank you to today's sponsors: KPMG, Superintelligent, Robots and Pencils, and Blitzy. To get an ad-free version of the show, go to patreon.com/aidailybrief, or you can subscribe on Apple Podcasts. If you are interested in learning about sponsoring the show, send us a note at sponsors@aidailybrief.ai, and if you want to learn more about our recently released AI ROI Benchmarking survey, you can find out all about that at aidbintel.com. Now, we are in the midst of end-of-year episodes, which are a combination, of course, of both looking back and looking forward. And this episode is all about the charts that sit right at that intersection. These are charts that tell us where AI is today and give us some idea of what we should be planning on heading into 2026. Given that there are 51 of these things, I am going to rip through them. So buckle up and let's talk about the 51 charts that explain AI in 2026. Quick note on the production of this: the charts were all sourced entirely by me. Part of my process for preparing this show is spending a ton of time on X slash Twitter and using those bookmarks heavily, and I have a folder where I actually keep these types of charts. So step one was just going back and looking at the charts that I thought were most reflective of the current moment and had something to say about the year we're heading into. The second part of the process was outlining a somewhat rough organization of which charts I wanted to include. From there I turned it over to Claude, ChatGPT and Gemini to see how they would organize it. I liked Opus 4.5 best, so we went with that with a few tweaks, and then I handed that and the charts off to Genspark and Manus to put it all together.
And while Genspark looked much better, it made some really weird leaps in terms of how it was describing things and had some errors. So ultimately we went with Manus, which was then exported to Google Drive for a final edit by me. Apologies to those of you who don't care about that; I just think a lot of you are also interested in the operator and production side of AI, so I like telling you how these things get put together. All right, as you can see, we've divided this into seven categories: Capabilities, Infrastructure, Markets, Economics, Vibe Coding, Jobs, and Politics. We kick off with capabilities. The first chart comes from OpenRouter and shows reasoning vs. non-reasoning token trends over time. You've probably seen this one a couple times now. Basically, at the beginning of 2025, reasoning models were not yet really a thing. OpenAI had announced o1-preview back in September, and the full o1 had finally become available at the very end of December, but we were just starting to get our hands on these things. That would change dramatically over the course of the last year, and by November of 2025, reasoning tokens represented meaningfully over 50% of the total. This has brought with it new capabilities, new use cases and new ways of thinking about how we scale. Our next chart is the one that, for much of this year, it felt like held up the entire world: the chart from METR that measures the time horizon of software engineering tasks that different LLMs can complete at 50 and 80% success rates. The task duration here is not how long the model works for independently; it's how long, in human-equivalent time, the tasks the model can complete would take. Coming into this year, METR had shown a doubling of capability roughly every seven months, but that had started to inch down to closer to four months, and this year confirmed that four-month doubling time. In these charts you can see the seven-month doubling line in green and the four-month doubling line in red.
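As a quick aside on what those trend lines actually imply, here is a minimal sketch in Python of how different doubling times compound. The numbers are purely illustrative — the 60-minute starting horizon is my assumption for the example, not METR's figure:

```python
def horizon_after(months: float, start_minutes: float, doubling_months: float) -> float:
    """Task-horizon length after `months` of exponential growth with a fixed doubling time."""
    return start_minutes * 2 ** (months / doubling_months)

# Illustrative starting point: a model that can handle 60-minute tasks today
start = 60.0
for label, d in [("seven-month doubling", 7.0), ("four-month doubling", 4.0)]:
    after = horizon_after(24, start, d)
    print(f"{label}: {start:.0f} min today -> {after:.0f} min after two years")
```

Under these assumptions, a seven-month doubling multiplies the task horizon by roughly 10x over two years, while a four-month doubling multiplies it by 64x — which is why the gap between those two trend lines matters so much.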
And you can see how at 50% it hews really closely to the four-month line, and at 80% it's mostly on the four-month line, with a few recent ones in between the four- and the seven-month lines. Now, whether it's seven months or four months, the point is capabilities have not plateaued. They continue to increase dramatically and quickly. We are also seeing major efficiency gains. This chart shows the performance efficiency of Gemini 3 Flash, which performs better than Gemini 2.5 Pro, which was state of the art just a few months ago, for around a third of the cost. Especially as we move into a world where production workloads are getting bigger and bigger and we are consuming more tokens, the fact that it's not just capabilities but also efficiency and costs that are improving is a big deal. Another measure of the efficiency gains came with GPT-5.2's performance on the ARC-AGI-1 exam. The ARC-AGI benchmark folks noted that between a tweaked o3 model last year and GPT-5.2 this year, there was a 390% efficiency gain in a single year. Now, what this all adds up to in terms of when we get AGI is kind of anyone's guess. As you can see from this chart, people are all over the place in terms of when they think we're actually going to get AGI. By the way, there's no common definition of AGI, and there are even plenty of folks out there who think that the term is getting more and more meaningless. One interesting note is that, if anything, I think people's timelines actually got moved back slightly heading into 2026 from where they were heading into 2025, despite all these capability gains. Andrej Karpathy in particular, in a big interview he did, might have single-handedly set back the timeline a couple of years. Now, as we look at these new releases, they are not just incremental; in many cases they are solving key challenges. The charts we have here are for a long-context test that basically measures how an LLM's performance degrades the more context you give it.
With GPT-5.1, which is only about a month old at this point, you can see that performance on this test went from around 85 or 90% at 8K tokens to a little under 50% at 256K tokens, whereas with GPT-5.2 Thinking, it was at 100% to start and stayed very close to that all the way to the end. This makes the context window actually usable in a way that it just wasn't before, which is extremely powerful and opens up new use cases. Still, AI capabilities, as much as they are evolving, are not evolving evenly. There are a bunch of different versions of this chart that have different sizes of spikes depending on how good you think AI is, but the idea here is that AI progress is not uniform. It is instead jagged, where a model can be superhuman at certain tasks and unbelievably incompetent at basic things that a kid could do. This jaggedness is a key facet of AI and is part of the challenge in implementing it well. Indeed, when it comes to what slows down AI, there are a set of three different bottlenecks. This chart comes from a recent essay by Professor Ethan Mollick, who organized it into capability bottlenecks, process bottlenecks, and verification bottlenecks. Capability bottlenecks are the ones we think about most: the weaknesses in AI and that jagged performance. Process bottlenecks are the ones that we really started to reckon with this year: the things that make it hard to overlay AI onto existing systems, particularly in the enterprise, and have it do what it's able to do. A third layer, which we don't talk about very much, is a new category that is native and endemic to AI: verification bottlenecks, basically where humans become crucial to reviewing edge cases and ensuring final accuracy, which often requires a whole new set of processes that humans need to be organized around.
I feel like, in some ways, the first profession to really reckon with these verification bottlenecks is software engineering, which has seen such a shift over the course of this year in how AI and agentic coding support what they do, but which has created all these new challenges and shifted a lot of the work to that verification step. Regardless of the challenges, one thing that this year showed is a massive explosion of diversity in the model set. The major labs are putting out more different types of models that have different strengths and weaknesses and are optimized for different types of use cases. But Chinese labs have also exploded and become a major player when it comes to the choices that builders have access to. Next up, we move to infrastructure. Obviously, the big theme of this year, which will continue to dominate heading into next year, is the hyperscalers making historically large capital investments into AI infrastructure in the form of data centers. This represents one of the largest coordinated technology investments in history, and something that the market has really had to reconcile and wrap its head around. The level of investment is why people are asking questions about whether the output of AI, and the revenue that comes from it, can possibly justify it. And yet all of the big labs feel exactly the same way, which is, as Mark Zuckerberg has articulated many times this year, that it is a much greater risk to underinvest than to overinvest. Another expression that tells the story of the moment in such a simple chart is the capital going into office construction versus data center construction. This was actually from a couple months ago, and I'm sure that it has fully flipped at this point. But starting in 2023, you see a shift where less capital is going into offices and more capital is going into data centers.
And sometime in the middle to late part of 2025, those lines actually crossed, and we're now seeing more money spent on data center construction than on office construction. Now, one of the things that some people ask is how much all of this new infrastructure really matters. This chart shows that slower growth in compute could lead to substantial delays, of possibly years, in certain capability milestones. Now, it's a whole separate question as to whether there are actually benefits to those delays. For example, if you ask Bernie Sanders, there absolutely are. But the point that this chart is trying to make is that there will be consequences for the speed of AI development if these labs don't have access to the compute that they're looking for. And this chart shows that as much as the labs have to service their existing customers, they are still heavily investing in the future. Now, this is 2024 data, and I'd be interested to see an update for 2025, but in 2024 you can see OpenAI's R&D compute was $5 billion, as opposed to its inference compute, which was at about $2 billion. I'm not sure that they were able to maintain this ratio this year, given the release of their image model, which was their most viral moment of the year, plus the release of Sora, plus just continued growth in their base usage. You have to think that servicing their existing customers started to compete with R&D a little bit more this year in ways that could be challenging heading into the future. Certainly there is scuttlebutt going around that some folks inside OpenAI aren't particularly happy about how that ratio looks right now. From there, we move into markets. The first chart to note is sort of the most obvious: that chatbot adoption is absolutely massive and faster than anything else we've ever seen before. I hardly need to spend a lot of time on this, but we now have two chatbots, in ChatGPT and Gemini, that are absolutely careening towards a billion active users.
That's something it took the previous fastest-growing technologies five-plus years at the very minimum to achieve. Still, if there was one chart that defined AI for markets this year, it's all of the various permutations of this circularity chart. This shows how much revenue and dealmaking flows between the major players like Microsoft, OpenAI and Oracle. Now, to some, this chart is exhibit A in why AI is a house of cards. But of course, what this chart is missing is a visualization of the very significant and real revenue that is also coming in to help fuel all this. Of course, the revenue isn't even close to the total scale of dealmaking right now, but it is growing faster than anything we've ever seen, and we really have barely started to scratch the surface on monetization. Still, this chart is as good a Rorschach test as we have for how you feel about the markets heading into the next year. Paired with this, we have a recent chart of OpenAI's estimated balance sheet that shows just how much external capital they're going to need to get to the point where they're actually profitable. Now, so far it doesn't seem like they're going to have any problem accessing that capital. The most recent rumors are that they are raising tens of billions of dollars, if not up to $100 billion, at an $830 billion valuation, suggesting that capital markets are very comfortable with what they're seeing from OpenAI and very willing to fund this party to keep it going. Now, for those who are AI bears, one of the things that they are concerned about is the reduction in inference costs, where, as models get good enough, AI products can actually run on much cheaper, simpler GPUs and computers. They worry that if that happens, all of this massive investment in complex architectures doesn't really make sense anymore. One investor I saw called this chart the most important and misunderstood chart in AI.
Now, it should be noted that not everyone agrees with this, and there are lots of counterarguments, but if we're trying to understand where the market's head is at heading into next year, this is a key consideration. Now we move into some model competition. Anthropic, by any measure, had a very good year. Their market share in coding was massive, and that dragged their market share across the enterprise up as well. According to Menlo Ventures estimates, they now claim 40% of the enterprise market, ahead of OpenAI. Google also saw a lot of growth in enterprise this year. And as fast as OpenAI's revenue has been growing, Anthropic's has been growing even faster. They went from $1 billion annualized at the beginning of this year and will end the year at somewhere around $8 or $9 billion, it seems. OpenAI started around $4 billion and will end the year at $13 or $14 billion, which is absolutely incredible, but still growing more slowly than Anthropic, at least for the moment. Still, if you are just taking a step back and don't have a particular horse in this race, the thing to note is just the incredible pace of revenue growth for both of these companies, which has to be bullish for their ability to actually make good on all these big deals that they're signing over the course of the next five years. Another key story of 2025 that sets up the battle for 2026 is, of course, the massive resurgence of Google. You can see that it's around the launch of GPT-5, which also happened to coincide with the launch of the first version of Nano Banana, that Gemini really starts to take off. The release of Gemini 3 has purportedly even increased this competition, and Gemini is absolutely surging heading into next year. Another way that you can see this expressed is in the betting markets, where the likelihood that Alphabet is the largest company by the end of next June has increased significantly.
Nvidia held a commanding percentage of that at the beginning of the year, and Alphabet is now creeping up on them. Now, the other way to see the sentiment shift between OpenAI and Alphabet is in the baskets of correlated stocks. Bloomberg and Morgan Stanley put together a basket of stocks that are exposed to Alphabet and a basket of stocks that are exposed to OpenAI and showed how, around November, they started diverging, with the Alphabet-exposed stocks continuing to rise and the OpenAI-exposed stocks taking a bit of a hit. Now, this doesn't mean anything fundamental; it just means a shift in what markets believe. But it's a pretty clear and dramatic signal of where things are. Still, if you are one who is just looking at the performance of these different models, I think one of the most powerful charts is this one, which shows that no one stays on top for long. OpenAI introduces the world's most powerful model, followed by Anthropic, who introduces the world's most powerful model, followed by Gemini, who introduces the world's most powerful model, followed by Grok, who introduces the world's most powerful model, on and on, forever. Just from a performance and capability standpoint, this is absolutely the chart that best shows what we are going to experience throughout 2026, I strongly believe. However, if we're talking about the competition between the labs, we do have to give China its due. This chart shows the massive increase in China's share of open source tokens. At the beginning of the year it was all Meta and Mistral, with almost nothing coming from China. By the end of the year it was something like 80% Chinese models. They are a factor heading into next year and will be a bigger factor throughout 2026. Hello friends. If you've been enjoying what we've been discussing on the show, you'll want to check out another podcast that I have had the privilege to host, which is called You Can with AI, from KPMG.
Season one was designed to be a set of real stories from real leaders making AI work in their organizations. And now season two is coming, and we're back with even bigger conversations. This show is entirely focused on what it's like to actually drive AI change inside your enterprise, and it has case studies, expert panels, and a lot more practical goodness that I hope will be extremely valuable for you as the listener. Search "You Can with AI" on Apple, Spotify or YouTube and subscribe today. Today's episode is brought to you by my company, Superintelligent. Superintelligent is an AI planning platform, and right now, as we head into 2026, the big theme that we're seeing among the enterprises that we work with is a real determination to make 2026 a year of scaled AI deployments, not just more pilots and experiments. However, many of our partners are stuck on some AI plateau. It might be issues of governance, it might be issues of data readiness, it might be issues of process mapping. Whatever the case, we're launching a new type of assessment called Plateau Breaker that, as you probably guessed from the name, is about breaking through AI plateaus. We'll deploy voice agents to collect information and diagnose what the real bottlenecks are that are keeping you on that plateau. From there, we put together a blueprint and an action plan that helps you move right through that plateau into full-scale deployment and real ROI. If you're interested in learning more about Plateau Breaker, shoot us a note at contact at Super AI with Plateau in the subject line. AI changes fast. You need a partner built for the long game. Robots and Pencils works side by side with organizations to turn AI ambition into real human impact. As an AWS Certified Partner, they modernize infrastructure, design cloud-native systems and apply AI to create business value. And their partnerships don't end at launch: as AI changes, Robots and Pencils stays by your side so you keep pace.
The difference is close partnership that builds value and compounds over time. Plus, with delivery centers across the US, Canada, Europe and Latin America, clients get local expertise and global scale. For AI that delivers progress, not promises, visit robotsandpencils.com/aidailybrief. This episode is brought to you by Blitzy, the enterprise autonomous software development platform with infinite code context. Blitzy uses thousands of specialized AI agents that think for hours to understand enterprise-scale code bases with millions of lines of code. Enterprise engineering leaders start every development sprint with the Blitzy platform, bringing in their development requirements. The Blitzy platform provides a plan, then generates and pre-compiles code for each task. Blitzy delivers 80% plus of the development work autonomously while providing a guide for the final 20% of human development work required to complete the sprint. Public companies are achieving a 5x engineering velocity increase when incorporating Blitzy as their pre-IDE development tool, pairing it with their coding copilot of choice. To bring an AI-native SDLC into their org, visit blitzy.com and press "get a demo" to learn how Blitzy transforms your SDLC from AI-assisted to AI-native. Next up, we move over to economics, by which we mean the impact of AI. We've talked already a little bit about how the cost of using AI has fallen precipitously. What's interesting, of course, is that Jevons paradox has fully locked in, and the total amount that enterprises are spending on AI has gone nothing but up. In fact, price decreases are fueling usage growth, as price decreases unlock new types of use cases that were uneconomical before. This has led to enterprise AI being the fastest-scaling software category in history. According to Menlo's estimates, it now captures 6% of the $300 billion global SaaS market.
Now, it should be noted that I don't think that the total addressable market for AI is the $300 billion global SaaS market. I think it is multiples larger than that. But still, even in historically slow-moving and lumbering enterprise adoption, things are moving very, very quickly. What's more, despite some rumors to the contrary, companies are actually seeing measurable ROI from AI even now. In a Wharton study of something like 800 executives, around 75% reported positive ROI from their AI investments. In our AI ROI benchmarking study, we found 82% saw current positive ROI. Of the remainder, by the way, only 5.5% were currently at negative ROI, and even those anticipated becoming ROI-positive by next year. In fact, 96% of overall respondents anticipated positive ROI within the next 12 months. And across the entire sample, 37%, almost 4 in 10, were already seeing high ROI from their use cases, reporting either significant or transformational impact. In another chart from our ROI study, we also found a correlation between how diverse an organization's use of AI was and how much benefit they got. We organized impact into eight different benefit categories, and we found that when an organization had use cases with just one benefit type, their ROI was lower than organizations that had four different benefit types, who in turn were lower than organizations that had all eight different benefit types, in a pretty significant way. Three was a measure of modest ROI; four was a measure of significant ROI. Organizations that had one benefit type were just over the edge of modest at 3.13, and organizations that had eight benefit types were starting to creep up on significant at 3.65. What about the idea of 2025 as the year of agents? Well, it turns out that in practice, agents remain nascent. In that same Menlo study, they found about 10 times as much money being spent on assistants and copilots as was being spent on agents.
So far in our ROI study, we found something similar. We divided use cases into three categories, assisted, automated, and agentic, and found 57% were in the assisted category, 30% were in the automated category, where the AI was managing a discrete workflow, and 14% were in that agentic category of autonomous work execution. One more in the economics section that is under-discussed but, I think, is going to be extremely important next year: I anticipate that we're going to start seeing a much deeper integration of ads into the AI landscape. There are a variety of reasons for that, but they are not just about the business model needs of the labs, although that's a part of it. LLMs also appear to be a really good platform for ads. Check out these recent numbers from Similarweb. They looked at the average minutes spent on site after referral, the average page views on site after referral, and the average conversion rate of referrals, comparing ChatGPT as the source versus Google, and in each case ChatGPT absolutely thwumped Google. The average minutes spent on site were three times higher from ChatGPT referrals, average page views were 25% higher, and the conversion rate jumped from 5% to 7%. Basically, people who are finding sites through LLMs appear to be more high-intent than your average Google browser, which again makes them a great place for sponsored links and ads. Next up, let's look at a category that was extremely important to 2025: vibe coding. Our first chart is just that vibe coding grew really fast. We saw multiple companies surge into the nine figures of revenue, some, like Cursor, creep up on $1 billion in ARR, and some, like Claude Code, blow past that. The combination of meaningful token cost and high consumption, plus implications for other use cases in LLMs, made coding-related performance the industry's number one priority as we head into next year.
However, engineering organizations especially are trying to figure out how to redesign themselves around AI coding. A chart that I've seen from swyx (Sean Wang) and others is this semi-async valley of death chart. It looks at agent autonomy and measures the experience or observed productivity at various autonomy levels. On one end of the spectrum, when coding agents are extremely responsive in very fast order, they can be extremely valuable for deep work focused on the hardest problems. On the other end of the spectrum, when they have a lot of autonomy, they can be great for simpler tasks that are handled in the background. The challenge is in the middle range where, as the chart puts it, it's not enough to delegate and it's not fun to wait. Now, different organizations are handling this differently, and I think swyx even has some questions around whether this is exactly the right way to think about things, but for our purposes, this chart represents not just the semi-async valley of death, but the broader set of questions that engineering organizations are going through heading into next year as they redesign themselves around AI coding. The reason that this matters outside of software engineering is that I believe they are the first department that will fully reorganize themselves around AI capabilities, and in so doing, set a template that other departments and functions can start to follow. Another interesting chart shows the impact of vibe coding. After a long period of being flat, in 2024 and 2025 we started to see the number of apps and games released to the App Store going back up. The jump in 2025 was particularly acute, up 25% in a year. Some are attributing this to the rise of vibe coding, and I think that that's right. Now, for our last two sections, we move into the society-level issues and the charts that will shape some of the big debates that we're about to have. This chart has been absolutely everywhere.
The idea of a K-shaped economy, where stocks and asset owners are doing great and everyone else is doing not so great, has become fairly standard belief at this point, and there are many who want to attribute it to the launch of ChatGPT. Now, there are a ton of other factors, like the rate-hiking cycle and the return to the mean after post-Covid overhiring, but when it comes to politics and society-level conversations, narratives can often matter more than nuance. And there are some parts of the economic challenge for people, whether attributable to AI or not, that are undeniable. For example, we have the highest youth unemployment rate we've had since about 2015, if you don't take into account the COVID spike. What's more, to the extent that we are seeing patterns that are maybe actually attributable to AI, it does look like early-career folks are being hit the hardest. This is a chart of headcount over time, organized by career stage, and you can see that towards the end of 2022 there's a divergence between the mid- and senior-career folks, with early-career really falling off. To the extent that AI is taking on all of the junior tasks, there is going to be a really interesting challenge for us around how people bridge from their early career to their mid career. Now, some folks are starting to think about where the job disruption is likely to come. There were about a million studies this year that were less studies and more predictions of which types of jobs are going to be most subject to disruption. One really valuable chart came out of Stanford, which divided tasks and roles based on where workers desired automation and where AI was actually capable of automation. Roles and tasks where workers desired automation and where AI was capable are what they called the green light zone. Tasks where automation desire was high but automation capability was low they called the R&D opportunity zone.
Tasks where capability was high but desire was low are what they called the red light zone. And unfortunately, some others looked there and found that a lot of Y Combinator startups, for example, were working in that red light zone, which I think more than anything reflects the fact that we need to be having more conversations around where we actually want automation. Now, as narratives about AI labor disruption take hold, some studies are also pointing out that, counterfactually, that's not necessarily the only thing that's showing up. A recent study, for example, showed that in terms of both wage growth and overall job growth, occupations with high AI exposure, at least right now, are growing much more significantly than those with low exposure. All of which sets us up for the politics conversation. Merriam-Webster's word of the year was "slop," and I had tweeted that I think it tells you all you need to know about the difference between perspectives inside the AI industry and outside it that it was "slop," and not something like "vibe coding," that was the word of the year. And yet, as much as it seems like AI is going to become an issue, at least for right now most folks don't rate it super highly as something they care about. Only 7% of people polled had AI in their top five most important issues. That said, they definitely don't want companies to have a free hand. Recent polling around the White House executive order to ban state-level regulation found pretty strong opposition: 55% opposed to just 18% supporting it, with 27% not sure. And while broadly the issue may not be clear, data center politics are starting to emerge as a local issue. It's still very nascent, but in a couple of the elections we saw this year, it was a meaningful part of the discourse. I would expect to see a lot more of that heading into the midterms next year. So there you have it, my friends: 51 charts that explain AI heading into 2026.
Hopefully this was an interesting lens to look at things through. I will have a link to this presentation, if you want to download it, at aidailybrief.ai. For now, that is going to do it for today's episode. Appreciate you watching or listening, and until next time, peace.
