Transcript
A (0:00)
Today on the AI Daily Brief, why everyone in AI is talking about context graphs, and before that in the headlines, AI wearables appear to be coming back. But will anyone actually care this time? The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.

Alright, friends, quick announcements before we dive in. First of all, thank you to today's sponsors, KPMG, Zencoder, and Superintelligent. To get an ad-free version of the show, go to patreon.com/aidailybrief. If you are interested in sponsoring the show, send us a note at sponsors@aidailybrief.ai. And I wanted to give a quick update on the AIDB New Year, your New Year's AI resolution. I dropped an episode on New Year's Eve that was basically a 10-week, self-guided, project-based way to upgrade your AI skills across a bunch of different dimensions. Turns out lots and lots of you want to upgrade your AI skills across all these different dimensions. We're now up to 1,900 people participating, and over the weekend we added a new team feature where you can actually sign up as a group. It's super lightweight: basically it's just an individual link that's shareable with your friends, family, or colleagues, and which will show the gallery of projects specifically from your team members in its own section. Now, I will also point out that, for fun, each week I'm also going to shout out the most active teams on the podcast. So if you want your team to be recognized for being super AI-forward, well, get on here, recruit your colleagues, and go to town on these projects. All in all, it's incredibly fun to see so many of you responding to this and diving into 2026 with such gusto. Again, you can find all of that information at aidbnewyear.com. With that out of the way, let's get into today's episode.

Welcome back to the AI Daily Brief Headlines edition, all the daily AI news you need in around five minutes. Today we're kicking off with a theme that was frankly such a flop in 2025 that I didn't even really think to put it in any of my predictions for 2026. I'm talking, of course, about AI wearables. In 2025 we got the Humane AI Pin, which Marques Brownlee called the worst product he's ever reviewed. We also got the Rabbit R1, which, while technically still around, has certainly failed to make any sort of dent. Then there were things like the Limitless Pendant, which is gone now that Limitless was acquired by Meta. And of course Friend, which has been mostly notable for the anti-AI vandalism all over its subway ads in New York City. Still, this week's CES in Vegas sees the launch of multiple new AI wearables that all seek in some way to improve on the original formula. One of those comes from Plaud, which is releasing its updated NotePin S. The hardware is essentially the same as the original NotePin, which was released in 2024: it records audio and provides users with AI transcription. The only major change is the addition of a button to replace haptic controls. Instead of squeezing the device to begin a recording, users can now simply press a button, reducing the risk of failing to record. Users can also use the button to flag important moments in the recording. And while that may seem like a small change, frankly, a lot of the problems with the previous generation were not always that people were unwilling to try them, but that they didn't work all that well when they did, writes the Verge:
AI recorders like this live or die by ease of use, so removing a little friction gives Plaud better odds of survival. In addition to the new device, Plaud is also launching a desktop app for meeting recordings, allowing users to have one ecosystem for all their note-taking needs. The new device is slotted in at $179, a $20 price bump from the original NotePin, which will be phased out of production. Home automation startup SwitchBot is also stepping into the AI note-taking space with its Mind Clip device. SwitchBot is pitching the device as a second brain, allowing users to convert their voice notes into to-do lists and daily summaries. The device weighs just 18 grams, so it's intended to be extremely unobtrusive. Few details have been revealed so far, with no pricing for the device or the accompanying cloud service.

Now, one theme that's interesting to me is the degree to which this next generation of wearables is refining its focus and basically just trying to compete in the AI note-taking space, which is an increasingly standard use case. I wouldn't necessarily say I'm super bullish on this, as I'm someone who is constantly transcribing voice notes to myself while walking and on the go and has no problem doing it via phone. But who knows, I might just be too much of a boomer for these new form factors. Overall, I wouldn't be surprised to see some rushed launches of new devices this year trying to get in ahead of the anticipated OpenAI devices, which people expect to come in 2027.

Moving over to a very different topic, China is seeing early success using AI as a cancer diagnostic tool. The New York Times reported on a pilot being run in a Chinese hospital where AI is being used to screen routine CT scans for tumors. They highlighted one case where a 57-year-old was diagnosed with pancreatic cancer before he had any symptoms. Now, pancreatic cancer is one of the most dangerous forms, with a five-year survival rate of just 10%, and that's largely because early diagnosis is so difficult. Symptoms typically don't appear until the cancer is at an advanced stage and can no longer be easily treated. Normal diagnostic scans require the use of radioactive dyes, so widespread screening isn't safe. In the highlighted case, AI was used to detect tumors without the use of dyes, unlocking a powerful new screening tool. Now, this particular hospital has been running its clinical trial since November 2024 and has so far analyzed 180,000 scans. Around two dozen cases of pancreatic cancer were detected, with around 14 detected in the early stages. All of these patients were in the hospital with unrelated complaints and weren't initially referred to a pancreatic cancer specialist. Dr. Xu Kaile, who is overseeing the trial, commented, I think you can 100% say AI saved their lives.

Now, moving on to our next story. One of the fascinating things about AI is the extent to which it blows apart the we wanted flying cars, we got 140 characters dialectic of technology building. What I mean by that is that AI is being used for everything from the worst use cases to the best use cases all at once. And unfortunately, on the opposite end of the spectrum from the Chinese cancer diagnosis AI are the issues that X is having with overly sexualized photos of women and minors. These issues are now getting louder as global governments try to crack down. Over the past few days, France and Malaysia have joined India in condemning xAI for allowing Grok to generate obscene content.
On Friday, India's IT Ministry ordered X to fix Grok and restrict the generation of, quote, nudity, sexualization, sexually explicit or otherwise unlawful content. The order came after a formal complaint from an Indian politician who said that Grok was being used to alter images of women without their consent, placing them in bikinis. Grok's mass undressing spree seems to have begun sometime last week and includes examples of minors being depicted partially undressed. In one example, users were able to use Grok to remove clothing from images of 14-year-old Stranger Things actress Nell Fisher. Elon Musk so far has not addressed the press on the issue, but did post on Saturday, anyone using Grok to make illegal content will suffer the same consequences as if they uploaded illegal content. Notably, the issue doesn't appear to be merely about users getting more brazen with Grok, but rather about a change in moderation policy in the back half of last year. Grok previously would reject requests to undress images of real people, but those safeguards appear to have been rolled back. Whether this was a deliberate choice or a technical error, we don't yet know. And while xAI executives haven't spoken to the scandal, employee Parsa Tajik did acknowledge the issue and said the team is looking into further tightening our guardrails.

Lastly today, a little internal drama for the peanut gallery back there. Yann LeCun doesn't think much of Meta's AI strategy and is quite comfortable telling us so. For over a decade, the Turing Award winner was the leader of Meta's AI division. However, last summer's AI shakeup left him on the outs, and he finally announced his departure in November. Speaking with the Financial Times, LeCun said he doesn't believe much will come of Meta's expensive new AI team. Deadpan, he commented, the future will say whether that was a good idea or not. LeCun referred to the new AI CEO Alexandr Wang as young and inexperienced, adding, he learns fast, he knows what he doesn't know. But there's no experience with research or how you practice research, how you do it, or what would be attractive or repulsive to a researcher. Evidently, Yann fell on the repulsive side, as he complained, Alex isn't telling me what to do either. You don't tell a researcher what to do. You certainly don't tell a researcher like me what to do. Now, outside of the psychodrama around the personnel, the interview served as another chance for Yann to reinforce his argument that LLMs ain't it and that world models are key to the next generation of AI. Indeed, isolating a singular reason for his resignation, LeCun said that the entire Meta superintelligence team was, quote, completely LLM-pilled. He said, I'm sure there's lots of people at Meta, including perhaps Alex, who would like me to not tell the world that LLMs are basically a dead end when it comes to superintelligence. But I'm not going to change my mind because some dude thinks I'm wrong. I'm not wrong. My integrity as a scientist cannot allow me to do this. Now, the interview had numerous other bombshells, including an admission that Llama 4 had, in his words, fudged the benchmarks a little bit. And although ostensibly the interview was about LeCun's new startup, his airing of grievances dominated the coverage. We did get a new name for the company, which is Advanced Machine Intelligence Labs. And we also learned that Alexandre Lebrun, the founder of French healthcare AI startup Nabla, will serve as co-founder and CEO.
LeCun also revealed that the startup was targeting a $3 billion valuation in its initial fundraising. And while a lot of the discourse ultimately was just about the potshots, I liked Dr. Karim Kaur's take, who said, strange that people in tech don't see it as good that someone of Yann LeCun's stature is taking the anti-LLM side. Any serious scientific endeavor needs genuine oppositional friction. Everyone being on the same page is how fields stagnate. This is good news for AI, not bad. I think that's a great take. And that is where we will leave these headlines. Next up, the main episode.

Sure, there's hype about AI, but KPMG is turning AI potential into business value. They've embedded AI and agents across their entire enterprise to boost efficiency, improve quality, and create better experiences for clients and employees. KPMG has done it themselves. Now they can help you do the same. Discover how their journey can accelerate yours at www.kpmg.us/agents. That's www.kpmg.us/agents.

If you're using AI to code, ask yourself: are you building software, or are you just playing prompt roulette? We know that unstructured prompting works at first, but eventually it leads to AI slop and technical debt. Enter Zenflow. Zenflow takes you from vibe coding to AI-first engineering. It's the first AI orchestration layer that brings discipline to the chaos. It transforms freeform prompting into spec-driven workflows and multi-agent verification, where agents actually cross-check each other to prevent drift. You can even command a fleet of parallel agents to implement features and fix bugs simultaneously. We've seen teams accelerate delivery 2x to 10x. Stop gambling with prompts. Start orchestrating your AI. Turn raw speed into reliable, production-grade output. Try Zenflow free.

Today's episode is brought to you by my company, Superintelligent. In 2026, one of the key themes in enterprise AI, if not the key theme, is going to be how good the infrastructure is into which you are putting AI and agents. Superintelligent's agent readiness audits are specifically designed to help you figure out, one, where and how AI and agents can maximize business impact for you, and two, what you need to do to set up your organization to be best able to leverage those new gains. If you want to truly take advantage of how AI and agents can not only enhance productivity but actually fundamentally change outcomes in measurable ways in your business this year, go to besuper.ai.

Welcome back to the AI Daily Brief. You can feel basically everyone in the business world right now shaking off the slumber and restfulness of the holidays and starting to get back to work. And I anticipate that it will be almost no time before we start getting big announcements in AI land again. However, as people come back, there have been a couple of conversations recently which I think are particularly pertinent to the shape of the AI industry and what people are focused on building and implementing in 2026. And one of those is this idea of context graphs. The concept is about what it takes to get agents to do more and more important work, which, it turns out, some argue is not just about giving them access to better organized data, but about giving them access to a type of data which right now might not exist. So today we're going to talk about what context graphs are and why everyone's talking about them. But we actually have to go back to an essay from earlier in December by investor Jamin Ball, who wrote Long Live Systems of Record.
And while his starting point was a debate in the startup and VC world around how agents change systems of record and what it means for the startup landscape, underlying this is actually a much more important conversation about how AI is going to intersect with human knowledge work. The animating idea that serves as our jumping-off point comes when Ball writes, if an enterprise workflow needs to know something at a specific step, where is the one place that answer is considered canonical? Because as workflows get more automated and more agent-driven, the fragility point often has nothing to do with the model and everything to do with whether the agent pulled the right value from the right system at the right time.

Now, Ball goes on to dramatize this. He writes, anyone who has spent time inside a large company knows how messy this gets in practice. Take something as simple as, what is our ARR? Ask the sales org and you will get one number. Ask finance and you get another, with a different set of exclusions and adjustments. Ask accounting and now you're talking revenue recognition, not bookings. Ask legal and they will correctly remind you that half the ARR in a fast-growing business is backed by contracts that look nothing like the neat recurring subscriptions you want it to be. Now, he continues, imagine you tell an agent, go calculate ARR by segment and send a deck to the board. Which ARR should it use? Which table is canonical? If sales and finance disagree, who wins? If the billing system and the warehouse have drifted by a few percent, which one does the agent treat as truth? The more we automate, the more important it becomes that someone has done the unglamorous work of deciding what the correct answer is and where it lives.

Now, Ball points out that even before agents, rationalizing and reconciling all this data was something that has been a priority for big companies over the last decade. It's where we got all of these data lakehouse and data warehouse companies. However, in practice, how much these things actually changed the way companies operate is up for debate. Ball writes, the problem is that most of this lived downstream of the operational world. The sales team still lived in Salesforce. The finance team still closed the books in NetSuite. The support team still worked tickets in Zendesk. The warehouse or lakehouse was the retrospective mirror, not the transactional front door. Now agents come along and, in his words, change the equation. Agents are inherently cross-system, meaning they don't live solely within one of those functions, and they are action-oriented: they are not just trying to gather information, they are trying to make use of that information to do things. That combination, he writes, means agents are only as good as their understanding of which system owns which truth and what the contract is between those truths. Under the hood, something still has to say, this is the canonical customer record, or this is the legally binding contract term, or this number is the one we report to Wall Street. That something might be a traditional system of record, it might be a warehouse-backed semantic layer, or it might be a new class of data control plane product. But it is absolutely not going away.
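To make that canonical-source idea concrete, here's a minimal sketch in Python of what one of those semantic contracts could look like: a single registry recording which system wins for which metric, so an agent never has to guess. This is purely an illustration with made-up names, not anything from Ball's essay.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MetricContract:
    name: str               # human-readable metric name, e.g. "ARR"
    canonical_system: str   # the system whose value wins on conflict
    definition: str         # the agreed business definition
    exclusions: tuple       # adjustments everyone has signed off on

# One registry, one answer per metric (all values hypothetical):
CONTRACTS = {
    "arr": MetricContract(
        name="ARR",
        canonical_system="finance_warehouse",
        definition="Annualized recurring revenue on a recognized basis",
        exclusions=("one-time services", "usage overages"),
    ),
}

def resolve_metric(metric: str) -> MetricContract:
    """Return the canonical contract for a metric, or fail loudly.

    An agent told to 'calculate ARR by segment' would call this first
    instead of silently picking whichever table it happened to find.
    """
    contract = CONTRACTS.get(metric.lower())
    if contract is None:
        raise LookupError(f"no canonical definition registered for {metric!r}")
    return contract

print(resolve_metric("ARR").canonical_system)  # -> finance_warehouse
```

The point is less the code than the discipline: someone still has to do the unglamorous work of filling in that registry and deciding whose number wins.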
Now, this is where we jump from the systems of record essay into the context graph idea. This came from a second essay, written by investors Jaya Gupta and Ashu Garg from Foundation Capital, which they called AI's Trillion Dollar Opportunity: Context Graphs. Their key observation is that there is actually an entire category of information missing. They write, Ball's framing assumes the data agents need already lives somewhere and agents just need better access to it, plus better governance, semantic contracts, and explicit rules about which definition wins for which purpose. That's half the picture. The other half is the missing layer that actually runs enterprises: the decision traces, the exceptions, overrides, precedents, and cross-system context that currently lives in Slack threads, deal desk conversations, escalation calls, and people's heads.

The distinction, the authors say, is between rules and decision traces. Rules, they say, tell an agent what should happen in general, whereas decision traces capture what happened in this specific case. I think for our purposes on this episode, an even simpler way to understand it is the what-versus-why gap. And the simple idea here is that while systems of record are good at state, that is, this particular deal closed at a 20% discount, they are bad at decision lineage: why a 20% discount was allowed this time. As the authors point out, those decision traces, that is, the why, live in Slack and DMs and in meetings and in human heads, limiting how much autonomy can then scale. As the authors put it, agents don't just need rules, they need access to the decision traces that show how rules were applied in the past, where exceptions were granted, how conflicts were resolved, who approved what, and which precedents actually govern reality.

Now, the good news, they say, is that agents have a really great ability to collect exactly this sort of information. They write, system-of-agent startups sit in the execution path. They see the full context at decision time: what inputs were gathered across systems, what policy was evaluated, what exception route was invoked, who approved, and what state was written. If you persist those traces, you get something that doesn't exist in most enterprises today: a queryable record of how decisions were made. And this is what they call the context graph, the sum total of those decision traces, or as they put it, a living record of decision traces stitched across entities and time, so that precedent becomes searchable. Over time, the context graph becomes the real source of truth for autonomy, because it explains not just what happened, but why it was allowed to happen. Again, the what versus the why.

So, just to make this crisp, here are a couple of other examples that they give. One category is exception logic that lives in people's heads. For example, we always give healthcare companies an extra 10% because their procurement cycles are brutal. That's not in the CRM, they point out. It's tribal knowledge passed down through onboarding and side conversations. Another category is precedent from past decisions. We structured a similar deal for Company X last quarter, we should be consistent. Again, this is the common knowledge of the organization that lives in conversations, not queryable databases. Another obvious but important one is cross-system synthesis, where a person, a human, looks across deal data in Salesforce, open escalations in Zendesk, and a Slack thread where someone flagged churn risk, synthesizes all of that in their head, and decides to escalate, leaving a record that only says escalated to tier three. The final category of examples they give is approval chains that happen outside of structured systems. And this happens all the time. Your boss happens to be walking by and you ask them if you can add an additional 5% to the discount. They give a thumbs up and keep walking. The record is only going to show the final price, not who approved the deviation or why. The context graph is the sum total of all those decision traces, if and as they get captured.
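To put a shape on that, here's a minimal sketch, again in Python with entirely hypothetical field names, of what a single captured decision trace might look like as a data structure, using that hallway approval as the example. It's an illustration of the idea, not Foundation Capital's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class DecisionTrace:
    decision: str                  # the final state the CRM would store
    rule_evaluated: str            # which policy applied
    exception_invoked: str | None  # which exception route, if any
    inputs: dict                   # evidence gathered across systems
    approver: str | None           # who signed off, even informally
    rationale: str                 # the "why" that never hits the CRM
    precedents: list = field(default_factory=list)  # IDs of prior traces
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

# The hallway approval from the example above, made first-class data:
trace = DecisionTrace(
    decision="final price reflects an extra 5% discount",
    rule_evaluated="discount_cap_10_percent",
    exception_invoked="manager_override",
    inputs={"crm_deal": "D-1042", "requested_by": "account_exec"},
    approver="boss@example.com",
    rationale="verbal thumbs-up in the hallway; competitive pressure",
)
print(trace.rationale)
```

Notice that the CRM-visible part is a single field; everything else is the lineage that today evaporates the moment your boss keeps walking.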
So what would it look like to actually have this context graph available and playing out in real life? Here's the example they give. A renewal agent proposes a 20% discount. Policy caps renewals at 10% unless a service impact exception is approved. The agent pulls three SEV1 incidents from PagerDuty, an open cancel-unless-fixed escalation in Zendesk, and the prior renewal thread where a VP approved a similar exception last quarter. It routes the exception to finance. Finance approves. The CRM ends up with just one fact: a 20% discount. The context graph, however, has all of that information about why. It contains all of those decision traces, and once you have the decision records, they write, the why becomes first-class data. Over time, these records naturally form a context graph: the entities the business already cares about (accounts, renewals, tickets, incidents, policies, approvers, and agent runs) connected by decision events, the moments that matter, and why links. Companies can now audit and debug autonomy and turn exceptions into precedent instead of relearning the same edge cases in Slack every quarter. The feedback loop, they conclude, is what makes this compound: captured decision traces become searchable precedent, and every automated decision adds another trace to the graph.
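As a minimal sketch of that payoff, here's how searchable precedent could look once traces persist; the renewal agent looks up prior approved service-impact exceptions before routing its own. Traces are plain dicts here and all contents are made up, not a real API.

```python
# Persisted decision traces (hypothetical contents):
past_traces = [
    {"id": "T-0991", "exception": "service_impact", "approver": "vp_sales",
     "decision": "18% renewal discount after three SEV1 incidents"},
    {"id": "T-0870", "exception": None, "approver": None,
     "decision": "standard 10% renewal discount"},
]

def find_precedents(traces: list, exception_route: str) -> list:
    """Return prior approved decisions that used the same exception route."""
    return [t for t in traces
            if t["exception"] == exception_route and t["approver"]]

# The agent cites the VP's earlier approval when routing to finance,
# and its own run gets persisted as the next trace in the graph:
for t in find_precedents(past_traces, "service_impact"):
    print(f"precedent {t['id']}: {t['decision']} (approved by {t['approver']})")
```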
Now, the big question quite obviously becomes, how do you start to map this? Is this something that can only be forward-looking, or is there any way to go backwards? Is this something that is only going to be relevant as agents come online, which can naturally create this context graph and map the why? Or are we talking about adding the mapping of decision traces to the human processes that exist right now? For example, are you asking leaders to talk to a voice agent after making decisions that does that capturing and categorizing?

Now, this essay was hugely resonant, and tons and tons of people took the ideas and ran with them. And one of the areas where people spent the most time is on this question of how best to design these systems. One of the most interesting follow-ups came from the Cogent Enterprise Substack, which basically argued not to pre-constrain the AI in the design of these context graphs. And this harkens back a little bit to the idea I've shared before of why I think automating existing human workflows is sort of a dead end, or at least is mostly about short-term value. My argument is that ultimately agents are going to find ways to get to the same output differently than the way that humans would do it. And so trying to constrain agents to doing the exact same things that humans did doesn't really make sense. The Cogent Enterprise Substack is arguing something similar about how we think about the design of context graph mapping. They write, the most counterintuitive development: we shouldn't predefine these context graphs. Traditional knowledge graphs fail because they require predefining structure upfront. Context graphs invert this completely. Modern agents act as informed walkers through your decision landscape. As an agent solves a problem, traversing through APIs, querying documentation, reviewing past tickets, it discovers the organizational ontology on the fly. It learns which entities actually matter and how they genuinely relate through use, not through a manual schema someone designed in a workshop. Each trajectory leaves a trace: which systems were touched together, which data points co-occurred in decision chains, how conflicts were resolved. Accumulate thousands of these walks and something remarkable emerges. The organizational schema reveals itself from actual usage patterns rather than predetermined assumptions. These become world models, not just retrieval systems.

Trying to give this a specific example: you can imagine that a lot of organizations have one type of policy that in practice they break almost every time. In other words, the exception isn't actually the exception, it's just the rule, but for whatever reason it's not been codified into policy. Going back to that discounting-for-healthcare-companies example from the original piece, if that happens every time, that's actually not an exception, that's just the policy-in-practice when it comes to that type of organization. And what Cogent Enterprise is arguing is that if you allow the agents to figure out this organizational schema from actual real-life experience, they're going to be able to surface those policy-in-practice areas rather than be constrained by the pre-assumptions of policy that people would have programmed into them.
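As a toy illustration of that informed-walker idea, here's one way a schema could be distilled from agent trajectories simply by counting which systems co-occur in decisions. The systems and trajectories are entirely hypothetical; this is a sketch of the emergence mechanism, not the Cogent Enterprise implementation.

```python
from collections import Counter
from itertools import combinations

# Each walk records which systems one agent run touched together:
trajectories = [
    ["salesforce", "zendesk", "slack"],      # churn-risk escalation
    ["salesforce", "pagerduty", "zendesk"],  # renewal exception
    ["salesforce", "zendesk", "netsuite"],   # billing dispute
]

edge_counts = Counter()
for walk in trajectories:
    for a, b in combinations(sorted(set(walk)), 2):
        edge_counts[(a, b)] += 1

# The strongest edges are the organizational schema revealed by
# actual usage, not a diagram someone drew in a workshop:
for (a, b), n in edge_counts.most_common(3):
    print(f"{a} <-> {b}: co-occurred in {n} decisions")
```

Accumulate thousands of walks instead of three, and the graph that falls out reflects how the organization actually decides, including the exceptions that are really the rule.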
Now, of course, the other question that comes up is where humans fit in all of this. Box's Aaron Levie also wrote an essay about similar themes, called The Era of Context. One of the big questions he explored was, if everyone has the same access to talent, that is, all of these agentic superintelligences, how do companies then differentiate? What makes the difference between a good company and a great company? For him, it comes back to context and how we design for it. And here Aaron puts crisply why context engineering was one of my key predictions for enterprise AI in 2026. He writes, designing our systems to get agents access to that data and ensuring that all of our agents can interoperate on that data is going to be incredibly important. Further, companies will have to drive a substantial amount of change management to make this all work. We imagined that AI systems would adapt to how we work, but it turns out, due to their extreme power and inherent limitations, we will instead adapt to how they work. This means we will have to optimize our organizations and workflows to best enable context for agents to be successful. The core tenet of this change is that the user is now responsible for directing and guiding agents on how to do their work, ensuring it gets the right context along the way. In essence, he writes, the individual contributor of today becomes the manager of agents in the future. Their new responsibilities will be providing the oversight and escalation paths, a meaningful amount of coordination throughout the work that the agents are doing, and shepherding work between the various agents, just like managers of teams in the pre-AI era.

One might say that the decision traces that make up the context graph are the most uniquely human part of how work gets done. They are the decisions that break the rules, or even if they don't technically break the rules, break out of the patterns by which previous decisions were made. So much of being a good company is about being nimble and responding to reality as it presents itself, not as you imagined it. And it seems to me that as we figure out and negotiate the relationship between agent workers and human workers, it's likely that a lot of the human roles are going to be in these areas of judgment.

Anyways, guys, that is a little primer on context graphs. I think it's a concept that you're going to hear a lot more about this year as part of the larger conversation around context engineering, and hopefully you now feel more prepared. For now, that is going to do it for today's AI Daily Brief. Appreciate you listening or watching as always, and until next time, peace.
