
Today on the AI Daily Brief, a new approach to AI called interaction models. Before that in the headlines, OpenAI's big consulting arm is official. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. All right friends, quick announcements before we dive in. First of all, thank you to today's sponsors, KPMG, Granola, Scrunch, and Section. To get an ad-free version of the show, go to patreon.com/aidailybrief, or you can subscribe on Apple Podcasts. To learn more about sponsoring the show, send us a note at sponsors@aidailybrief.ai. And one more thing before we dive into today's episode. Thanks to listeners like you, the AI Daily Brief just continues to grow and grow and grow. It is almost always a top-five technology podcast, and it is almost always now in the top 200 of podcasts overall. This has happened almost totally organically because of listeners sharing with their friends and colleagues, but I think it's time to pour some gas on that particular fire. I am now hiring for what I'm calling a Growth Engineer, and the goal is to build stuff to make it easier for important parts of the AI Daily Brief to get to the audiences that they could be helping. This is a role for someone who is creative, dynamic, highly self-directed, and lives inside Claude Code or Codex. The application is all about sharing stuff that you've built and your perspective on the show. I'll be hiring one person for a paid three-month trial with a guaranteed base and the ability to up to double that base during that time, with the goal of it turning full-time and ongoing after that. You can find the role at jobs.aidailybrief.ai. That's jobs.aidailybrief.ai. We kick off today with a story that is a follow-up to reports from last week. OpenAI is indeed launching a consulting company, and now it is official. The business will operate as a separate company called the OpenAI Deployment Company, or DeployCo for short.
This is basically a forward-deployed engineer shop that will pair developers with some of OpenAI's most important clients to help them on, theoretically, real deep AI and agentic transformation. Now, like the Anthropic venture that was announced last week, DeployCo is structured as a joint venture, in this case with 19 partners across consulting, private equity, and finance. The initial investment was $4 billion at a pre-money valuation of $10 billion, with TPG as the lead investor and Advent International, Bain Capital, and Brookfield as co-lead founding partners. Now, one of the things of note here is that word on the street is that a big part of the motivation for the firms who are investing in this is to get first access to this set of engineers for their portfolio companies. In other words, this was effectively a buy-in cost to skipping the line when it comes to getting help on AI transformation. Of the partners announced, it looks like Goldman Sachs is the only one to back both DeployCo and the Anthropic effort, which is as yet unnamed. The guts of DeployCo are going to be built around an acquisition, specifically engineering firm Tomorrow, which will give DeployCo right out of the gate about 150 staff who have experience in deploying AI solutions. At this point, there is basically no way to grow fast enough to meet this demand other than acquisition, so I would expect a lot more M&A soon. Even though this was a big part of the discourse last week, there was still a surprising amount of conversation about it. Most of it was just a reaffirmation of what has finally started to become conventional wisdom, which is that it doesn't matter how powerful the models are, they're going to crash headlong into institutional inertia. And for enterprises to close the capability overhang and actually get the full value from these models, it is going to involve meaningful support structures being built around them, both inside and outside.
Now, one funny strand in the conversation that I've seen is a lot of folks running smaller versions of these agencies trying to sort of puff out their chest and make it clear that they still have a market, because no matter how well resourced the efforts from OpenAI and Anthropic are, there's a massive long tail of people that need support too. And to them I would just like to say, guys, don't worry. No one thinks that because OpenAI and Anthropic are in the game, somehow they are going to be able to consume the sheer tonnage of transformation support that is going to be needed over the coming decades to get these companies onboarded onto AI. Next up, we have an interesting market sort of substory. You might have seen posts like this one from the Kobeissi Letter suggesting something about Anthropic or OpenAI's market-implied pre-IPO valuation. Now, what that type of statement means is not a reference to a valuation in an actual fundraising round, but to weird gray secondary markets, usually filtered through some blockchain or another, where market activity on those markets is being interpreted as actual price signal. In many cases, what you have is basically crypto tokens that claim to be backed one-to-one by stock held in SPVs that themselves own actual Anthropic stock. Anthropic, however, seems not to be a fan of this. On Monday, they updated a page in their support docs discussing unauthorized stock sales and investment scams. The article already stated that unapproved stock transfers were void and wouldn't be recognized in official records. Regarding SPVs, Anthropic noted, we do not permit SPVs to acquire Anthropic stock, and any transfer of shares to an SPV is void under our transfer restrictions. Any third party claiming to sell Anthropic shares to the general public is likely either engaged in fraud or offering an investment that may have no value due to our transfer restrictions.
The Monday addition was a list of firms known to be offering access to the stock, with Anthropic specifically saying that any interest in Anthropic stock offered by these firms is void and will not be recognized on our books or records. In a rare show of solidarity, OpenAI sent a similar message to the market in the form of a blog post. They restated their position that unauthorized transfers are legally void without approval, making some vehicles claiming to have exposure to the stock worth zero. Lawyer Gabriel Shapiro thinks people trading these gray markets could have a huge problem. He wrote, there is an active secondary market purportedly in Anthropic stock or derivatives, including on fairly reputable, or at least well-known, platforms like Forge. Anthropic is calling them out specifically by name and essentially saying 100% of these are illegal. Now, Shapiro notes that the legal status is far from clear, but attempting to void transactions could trigger an avalanche of lawsuits against Anthropic and the marketplaces purporting to sell the stock, giving an indication of why they want this activity to be nipped in the bud. Anthropic's notice triggered a quote unquote massive crash, cutting the price of Anthropic on these markets in half yesterday. Now, what's important about this story is that this is not just crypto markets doing crypto market things. What underlies this is what has been a growing dynamic in private markets over the last, honestly, decade and a half. Ever since the global financial crisis and the beginning of ZIRP-era policies, private companies have been waiting longer than ever to IPO because of basically just unlimited private capital. The dynamic has gotten to the point where some startups are even targeting the secondary market as their exit rather than aiming for an acquisition or an IPO. All of this comes down to demand from both long-tail accredited and retail investors who are structurally blocked out of investing in early-stage companies.
If companies never go public, or go public at enormously high valuations, it significantly cuts down the ability of retail investors to participate in the upside of company creation. Now, the issue is that a lot of the way that secondary markets happen completely negates the disclosure rules of markets in general. Primary investments in a startup are pretty straightforward, even though they're gated by accredited investor rules. You can easily verify whether you're included on the cap table, and the SEC has your back on that. With secondaries, there's no such protection. Investors are fundamentally buying shares in a holding company with no real ability to verify the claims being made, and that's in the best-case scenario. Casey Craig discussed some of the vehicles being traded in crypto markets that claim to represent Anthropic shares and wrote, brother, you are four layers of financial abstraction and broker crime away from touching an actual Anthropic share certificate. Your position is a tokenized receipt for possible future economic exposure to a Cayman SPV that owns shares in another Delaware SPV that maybe owns rights to future equity pending transfer approval. You are approximately Anthropic-adjacent at best. Now, to be fair to these market participants, most people at this stage who are trading these assets know that they're trading IOUs. The bigger issue comes when naive investors think they actually own Anthropic stock, and that turns out not to be the case. Ultimately, it seems like this is actually a bigger issue than just this one instance, with a potential reckoning on the way. Brian Norgaard writes, if Anthropic starts invalidating layered SPVs and other quote unquote creative financing structures, private markets are in for a reckoning. The SpaceX IPO will expose just how much synthetic ownership and outright fraud has accumulated in privates.
Natasha Mascarenhas writes, it's hard to overstate the amount of fake SPVs circulating in the market right now. They have always been a controversial financing tool, with Anduril, OpenAI, Anthropic, etc. fighting them for years. If companies crack down against them as promised, people are in for a rude awakening after lockups expire. Still, Kingsley Advani thinks that Anthropic has limited options here, posting a chart with the dozens of registered SPVs holding Anthropic shares and saying Anthropic is unlikely to void half their investors. Moving over to politics now, administration officials have walked back calls for an FDA-like approach to AI safety. Last week, National Economic Council Chairman Kevin Hassett said that the White House was considering an executive order to respond to Mythos-level models. He said the new policy would put AI models through a process where they're proven safe, just like an FDA drug. This led to a massive industry reaction, with many fearing a burdensome regulatory structure that would slow innovation to a crawl. Over the weekend, former AI czar David Sacks said that he'd spoken with Hassett and the FDA comparison wasn't particularly apt. Commented Sacks, I don't think any senior official supports it. During an interview with CNBC on Monday, Hassett confirmed that an FDA-type organization is not in the cards, commenting, at the White House, nobody has an idea that we should do something like bring in a giant new bureaucracy to approve AIs. He said that the current approach is simply administration officials working directly with the AI labs to, quote, in his words, make sure that the models, before they're released to the public, aren't going to cause an extreme amount of harm. Hassett noted that this all-of-government, all-of-private-sector approach is working well, and it's uncertain that an executive order will even be necessary.
Commenting on how much discussion his off-the-cuff comparison had generated, he added, I probably shouldn't have called it the FDA. Lastly today, President Trump is assembling a tech envoy for his trip to China later this week. The White House said that Elon Musk, Apple CEO Tim Cook, and Meta president Dina Powell McCormick will join the President's delegation. The group also includes numerous finance, semiconductor, aerospace, and agriculture executives. U.S. officials have said they intend to finalize trade negotiations with China during the meeting, including establishing a bilateral board of trade. A senior official said the executives are from companies with significant Chinese exposure and represent sectors to be included on the trade agenda. Notably absent is Jensen Huang. Last week, the Nvidia CEO said he would join the envoy if invited, but it appears an invite was not extended by the White House. Now, there's a lot of different ways to look at this. Huang has traveled extensively with the President over the past year, joining trips to the Middle East and the UK. There are also apparently executives from Micron and Qualcomm attending, so it's clear that semiconductors will at least in some way be discussed. However, Huang's absence could mean that the White House is sending a signal that Nvidia's AI chips are off the table as part of the trade talks. While the White House signaled in December that older H200 GPUs would be approved for export to China, those plans have stalled, and so far zero export licenses have been approved by the Commerce Department. We'll see later this week how much the absence of Jensen is an actual strategic move. For now, though, that is going to do it for today's headlines. Next up, the main episode. One of the most important AI questions right now isn't who's using AI, it's who's using it well.
KPMG and the University of Texas at Austin just analyzed 1.4 million real workplace AI interactions and found something surprising: the highest-impact users aren't better prompt engineers. They treat AI like a reasoning partner. They frame problems, guide thinking, iterate, and push for better answers. And the good news? These behaviors are teachable at scale. If you're trying to move from AI access to real capability, KPMG's research on sophisticated AI collaboration is worth your time. Learn more at kpmg.com/us/sophisticated. That's kpmg.com/us/sophisticated. Today's episode is brought to you by Granola. Granola is the AI notepad for people in back-to-back meetings. You've probably heard people raving about Granola. It's just one of those products that people love to talk about. I myself have been using Granola for well over a year now, and honestly, it's one of the tools that changed the way I work. Granola takes meeting notes for you without any intrusive bots joining your calls. During or after the call, you can chat with your notes, ask Granola to pull out action items, help you negotiate, write a follow-up email, or even coach you using recipes, which are pre-made prompts. Once you try it on a first meeting, it's hard to go without. Head to granola.ai/aidaily and use code AIDAILY. New users get 100% off for the first three months. Again, that's granola.ai/aidaily. When was the last time you actually visited a website to research something? If you're like me, AI pretty much does that work for you now. That of course raises a new question for brands. If AI is doing the discovering, researching, and deciding, who or what is your website really for? That shift in user behavior, the rise of AI bots becoming your most important new visitors, is what my sponsor Scrunch is taking head on. Scrunch is the AI customer experience platform that helps marketing teams understand how AI agents experience their site.
Where they show up in AI answers, where they don't, and what's preventing them from being retrieved, trusted, or recommended. And it's not just visibility. Scrunch shows you the content gaps, citation gaps, and technical blockers that matter, and helps you fix them so your brand is found and chosen in AI answers. Now, for our listeners, Scrunch is providing a free website audit that uncovers how AI sees your site, where there's gaps, and how you're showing up in AI versus the competition. Run your site through it at scrunch.com/aidaily. Here's a harsh truth: your company is probably spending thousands or millions of dollars on AI tools that are being massively underutilized. Half of companies have AI tools, but only 12% use them for business value. Most employees are still using AI to summarize meeting notes. If you're the one responsible for AI adoption at your company, you need Section. Section is a platform that helps you manage AI transformation across your entire organization. It coaches employees on real use cases, tracks who's using AI for business impact, and shows you exactly where AI is and isn't creating value. The result? You go from rolling out tools to driving measurable AI value. Your employees move from meeting summaries to solving actual business problems, and you can prove the ROI. Stop guessing if your AI investment is working. Check out Section at sectionai.com. That's S-E-C-T-I-O-N-A-I dot com. Welcome back to the AI Daily Brief. Today's episode is a bit surprising to me. We're discussing a new model, which isn't out of the norm for this show, but it is not a new model from OpenAI or Anthropic, or even from one of the Chinese labs. Instead, it comes from Thinking Machines, and it's all about a new mode of interaction. Now, just by way of quick background, if there was going to be a lab outside of the biggies that could drop something that would catch our attention, Thinking Machines Lab is a pretty good bet for that.
Former OpenAI CTO Mira Murati left OpenAI to form the lab and pulled away a super team of researchers directly from the labs, while also doing some very aggressive fundraising. In any other era in the past, the low billions of dollars that they had raised in fundraising would be notable. It's just that, obviously, compared to the tens or even hundreds of billions in resources that the biggies are playing with, a billion or two seems frankly pretty quaint. Thinking Machines released their first product, Tinker, last October, which was a platform for reinforcement learning as a service, essentially allowing companies to fine-tune open source models. It wasn't received poorly, exactly, but it certainly didn't capture a ton of attention or discourse in the industry. Late last year we got rumors of more aggressive fundraising and talk of TML releasing their own model, but things went pretty quiet to begin this year, save for a wave of reports that staff and founders were leaving the company. The highlight was that two of TML's co-founders, Barret Zoph and Luke Metz, left in January to return to work at OpenAI. And of course, we are now in the firm realism period of AI. Just last week, for example, we had Elon agree to allow Anthropic to use Colossus 1, which I basically argued was him getting comfortable accepting reality and transitioning into a different role vis-a-vis the industry because of how much was consolidating around the top players. But with all that as background, let's come back to what Thinking Machines actually shared. Yesterday, Mira tweeted, today we're sharing our work on interaction models, a new class of model trained from scratch to handle real-time interaction natively instead of gluing it onto a turn-based one. The current AI experience often feels like a conversation that only begins after we stop talking. We have to batch our thoughts, we can't point at things, we phrase questions like emails.
The interface doesn't leave room for us, so we adapt to the models. We started Thinking Machines to advance human-AI collaboration, and this is our first bet on what this looks like. Most labs treat autonomy as the goal and interactivity as scaffolding around a turn-based core. We think the way we work with AI matters as much as how smart it is. Interactivity has to be in the model, and it has to scale with intelligence rather than trail behind it. Digging deeper into the problem, the companion blog post, which is closer to a research paper than your average announcement post, argues that today's AI systems create what they call a collaboration bottleneck. Users need to stay involved, clarify, interrupt, point, show, and correct. But current interfaces are mostly built around discrete turns. Describing this turn-based model, they write, today's models experience reality in a single thread. Until the user finishes typing or speaking, the model waits, with no perception of what the user is doing or how the user is doing it. Until the model finishes generating, its perception freezes, receiving no new information until it finishes or is interrupted. This creates a narrow channel for human-AI collaboration that limits how much of a person's knowledge, intent, and judgment can reach the model, and how much of the model's work can be understood. Picture trying to resolve a crucial disagreement over email rather than in person. And indeed, this analogy, that current AI systems are too much like email, is one that runs throughout their messaging. Their proposed solution is what they call an interaction model, trained from scratch around continuous, time-aware exchange. Instead of that turn-based system where inputs and outputs are, in their words, flattened into one ordered token sequence, their model processes streams in 200-millisecond microturns.
Instead of a flattened, ordered sequence, with human input leading to model output leading to human input leading to model output, their time-aligned version has continuous parallel input and output streams that are split into these microturns. They write, an interaction model is in constant two-way exchange with the user, perceiving and responding at the same time. Now, architecturally, they actually describe a two-part system: a real-time interaction model that stays present with the user, and a background model that handles longer reasoning, browsing, tools, and agentic work. What this allows is that the interaction model can keep talking and listening while the background model works, and then weave the results of the background model into the conversation when appropriate. So what are some of the examples and capabilities they show off for this? I would definitely suggest that you go check out either the blog post or the announcement thread on Twitter, where they include all these examples. These are not polished launch-video types of assets. Instead, they're TML researchers who are actually just giving examples of the capability set. In the first video, for example, they show how the model can recognize when someone new comes on the screen and make mention of that, and how the model can do simultaneous translation, actually starting to translate what someone is saying from one language to another while they're still speaking, which is sort of similar to how you see translators at events speaking just a couple of seconds after the start of the phrase that they're translating. Another capability they show they call dialogue management, where the interaction model can track when the speaker is thinking, yielding, self-correcting, or inviting a response. Basically, they say there's no specific built-in dialogue management system, so that it can adapt to whatever the context is.
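To make that two-part architecture a little more concrete, here's a tiny, purely illustrative sketch in Python. Everything in it is invented for illustration; TML has not published an API, and their real model operates over audio and video streams, not strings. The sketch only shows the shape of the idea: a loop that perceives and responds on every short tick, while slower "background model" work runs concurrently and gets woven back into the conversation when it's ready.

```python
import asyncio

TICK_S = 0.005  # stand-in for TML's 200 ms microturn, shrunk so the demo runs fast

async def background_model(query, results):
    # Stands in for the slower background model (search, tools, long reasoning).
    await asyncio.sleep(0.01)  # simulated tool latency
    results.append(f"facts about {query}")

async def microturn_loop(user_stream, ticks=8):
    """Toy interaction loop: on every tick it both perceives new input and
    responds, instead of waiting for the user to finish a whole turn."""
    results, transcript, bg = [], [], None
    for _ in range(ticks):
        heard = user_stream.pop(0) if user_stream else None  # perceive this tick
        if heard and heard.startswith("lookup:"):
            # hand slow work to the background model without going silent
            bg = asyncio.create_task(
                background_model(heard.split(":", 1)[1], results))
        if results:
            # weave the background model's answer into the conversation
            transcript.append(f"by the way, {results.pop()}")
        elif heard:
            transcript.append(f"mhm ({heard})")  # keep responding in real time
        await asyncio.sleep(TICK_S)
    if bg:
        await bg
    return transcript
```

Running `asyncio.run(microturn_loop(["hi", "lookup:box office", "and", "also"]))` keeps acknowledging the user on every tick and then, a few ticks later, surfaces the background result mid-conversation, which is the contrast with a turn-based loop that would go silent while the lookup runs.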
In one video they demonstrate visual interjection by showing a researcher ask the model to identify when she starts slouching and remind her to change her posture. In another example of that simultaneous speech, they take it out of the realm of language translation and into the realm of professional softening. The researcher is basically saying what he would like to say to a colleague who's always late, with the model changing it to a more socially acceptable version in real time. There are actually a number of other examples as well, but the ones that I think are important, that show the interplay between the interaction model that the user is interacting with and the background model that's doing things, are the examples where they show the model effectively multitasking, interacting with the user while running search in the background, making the model just seem much smarter and more capable. "I just watched the new Devil Wears Prada 2 movie, and I heard that it has a pretty massive opening box office." "Yeah, it's crushing the box office in its opening weekend with around $233.6 million globally." "Did you enjoy how they brought back Andy?" "Yeah. And also I really like the fact that Lady Gaga is featured in the movie." "Definitely. Her cameo at a runway event performing Shape of a Woman adds a ton of tension with her implied history with Miranda." And the important thing to note here is that obviously The Devil Wears Prada 2, which is in theaters right now, is not in the training set of this model. It was searching in the background as the researcher was interacting with it. And this, I think, is what Mira means when she says we think the way we work with AI matters as much as how smart it is. Another way you could put this is situational smartness. We have to create the right setting for AI to be smart, rather than dropping it into the rest of the world that we operate in when we're not talking to AI.
So let's talk about some of my and the community's observations about the model. The first one is that the TML team was extremely on message, in a way that frankly the other labs should be jealous of. Almost all of the posts from TML researchers and team members are telling some version of the story, which is about increasing the capability of human-AI collaboration in a way that improves humans' lives. Soumith Chintala, for example, writes, Thinky's secret plan: 1. Increase human-to-AI bandwidth. 2. Raise the ceiling of human-plus-AI intelligence. 3. Help humans continue as main characters in the new world. Now, the other part of the message that the TML folks were very on brand with was the idea that this is a category-change type of model. In fact, kind of putting a fine point on that is that while they share a bunch of benchmarks relative to other voice models, they ended up having to create two internal benchmarks to measure new proactive audio capabilities. The first benchmark they introduced was called TimeSpeak, which they say tests whether the model can initiate speech at user-specified times while producing the correct content. The example they gave was, I want to practice my breathing. Remind me to breathe in and out every four seconds until I ask you to stop. The second benchmark they introduced was called QSpeak, testing whether the model speaks at the appropriate moment with the expected, semantically correct response. For example, every time I code-switch and use another language, give me the correct word in the original language. Point being that when you have to invent new benchmarks to capture the capability set, that suggests at least that the capability change is pretty significant. And like I said, the newness of this was all over the TML messaging. Rowan Zellers called it the first general video-plus-speech model that's visually proactive.
TML co-founder John Schulman, after reinforcing that Thinky was founded to, quote, advance capabilities for human-AI collaboration, which he argues are underemphasized relative to intelligence and autonomy because they're harder to evaluate, says that this new model they're introducing, which by the way is technically called TML Interaction Small, will be the beginning of what they see as a different paradigm in the future. John writes, we think every AI system will have something like an interaction model as the outer user-facing layer, continually keeping the user informed and learning what they actually want. Claire Birch from their team expanded on the philosophy behind this. She wrote, AI is changing how people use computers. Computers are the central tool of modern work, but computer literacy is sharply stratified. Early AI use appears to be as well. AI progress so far has inherited the worldview and workflows of software engineering. This makes sense for code-native researchers. The way into the machine is through code, and text is the next level up. I think we have been here before. Before the GUI, text was the primary interface to the computer. You interacted line by line through the CLI, typing precise instructions so as not to mess it up. The GUI was one of the greatest democratizing forces of personal computing. Alongside dramatic drops in cost, it made the computer a tool usable by many. If AI is the next interaction layer, what's the GUI moment here? It is not better prompts. Chat is still surprisingly CLI-like. Even with tool use, chat rewards verbal fluency, abstraction, and procedural skill. Think carefully crafted, context-laden prompts with just the right pleasantries and abuses. But in human collaboration, we don't just throw paragraphs of polished turn-based text at each other. Even bad meetings are spoken, gestural, interruptible, context-heavy, and full of revision and repair.
The next interface will need greater affordances: richer, persistent shared context, lower mode-switching costs, and native multimodal interaction. Rather than jarring handoffs between text, image, and audio, it should stay grounded and disambiguate intent well. It should let people communicate by speaking, showing, pointing, interrupting, and revising in context, narrowing the translation cost between human intent and machine action. In other words, it will let people stay fluent in the task rather than forcing us to become fluent in the tool. The GUI moment is when the user no longer has to think like the computer, or like the AI, or like the prompt engineer, in order to access the machine's capabilities. The idea, as Claire writes, of interaction models as a step towards that reality is, I think, the right way to look at this. Last year, around the time that Google introduced Nano Banana, I said that we really needed a conception of something like an unlock score or an unlock index, a way to measure models not by the traditional benchmarks but by how many and what type of new use cases they unlocked. The reason that felt like it mattered around Nano Banana is that the power of that model, which came out in late August of 2025, was not that it produced such prettier pictures than the other image models available; it was how steerable it was for editing. It turned out that while that was a quote unquote small change, it was a small change that unlocked a lot of different types of uses. That would be the same later on, when later versions of Nano Banana also unlocked infographics through both their reasoning over a prompt and their ability to interact with text in a much more fine-grained way. Using the unlock index mindset, TML basically says that there are a set of things that current commercial real-time APIs simply cannot do. They group these together as visual proactivity.
They say that these APIs respond to spoken turns, but they cannot proactively choose to speak when the visual world changes. For instance, they write, if asked, please count how many pushups I do, such a system might respond, sure thing, and then remain silent, waiting for an audio-only cue that never comes. When it comes to things like time awareness, verbal cue triggers, visual-based counting, and visual cue triggers, they write, no existing model can meaningfully perform any of these tasks. Swix from Latent Space writes, I believe the kids call this cooking. Thinking Machines just brutally mogged GDM and OAI. Basically everyone's definition of real time just got a massive fricking upgrade. Chris on Twitter writes, Thinking Machines cooked hard here. What they're building feels like a shift from AVM to a true interaction model. The slouching demo is my favorite example. A normal model needs you to ask, am I slouching? This notices it live, understands the context, and reacts naturally without breaking the flow. That's way closer to a Her-level AI companion than the current prompt-in, response-out AI voice models. Now, Professor Ethan Mollick, immediately recognizing how many interesting new use cases this would open up, lamented that the demos were not really about those uses. He writes, all of the demos except maybe one are the model being fun and/or annoying by correcting or reminding in real time. There are obvious uses for this sort of model in meetings, education, training, etc. Why not demo valuable use cases? Developer Nick Dobos, however, pointed out, it's a tech demo, not a consumer demo. Purposeful audience choice, because their business is focused on raising VC funds to build more AI and sell to companies, not to sell to consumers. Still, I think Ethan's point that there are a lot of things that this opens up is well taken. And as impressive as it is for this innovation to be coming from TML, the question is how long it stays with TML.
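To illustrate what visual proactivity means at its simplest, here's a hypothetical sketch in Python. None of this is TML's actual implementation: `detect_down` stands in for some pose model, and the frames are just placeholder values. The structural point is that the speech events are driven entirely by changes in the visual stream, with no audio turn ever prompting them, which is exactly the thing a turn-based real-time API has no slot for.

```python
def count_pushups(frames, detect_down):
    """Toy 'visual proactivity': the model decides to speak on a visual cue alone.

    frames: an iterable of video frames (any objects, for illustration)
    detect_down: stand-in for a pose model; returns True when the user is
                 in the 'down' position of a pushup in that frame
    """
    count, was_down, spoken = 0, False, []
    for frame in frames:
        down = detect_down(frame)
        if was_down and not down:  # rising edge = one completed pushup
            count += 1
            spoken.append(f"That's {count}!")  # speech initiated by the model
        was_down = down
    return spoken
```

With frames encoded as plain booleans, `count_pushups([0, 1, 1, 0, 1, 0], bool)` counts two completed pushups and "speaks" twice, even though after the initial request the user never says anything, the cue a turn-based system would be waiting for.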
Recursive on X writes: "I doubt this stays unique for long. The frontier labs now iterate on each other's successful abstractions extremely fast. With this, we are moving from turn-based chat to models designed for persistent real-time interaction." And I think they might be right. In fact, once again, just yesterday the OpenAI developers account showed off some new capabilities of their recently released GPT Realtime 2 model, basically showing how that real-time audio model could work as a background agent, updating a Kanban board of to-dos as a team gives updates in a standup-type meeting. This, if nothing else, is testament to the idea that, one, the background agent paradigm, where things are just happening passively as other types of work proceed, is likely to be an important part of the future, and two, that these real-time audio and visual models open up new possibilities in that realm. Nick Dobos again writes: "Why stop at tickets? What if software engineering is 100% meetings, and your AI note taker orchestrates all your coding agents in the background for you? 10 people chatting and playing with an app while an AI hums away updating it in real time." Given how fast things change and how iterative updates are, it's a surprising day when you see something that actually feels like the beginnings of an entirely new category of opportunity. But I think that's what this interaction model announcement feels like. You can go read about it at thinkingmachines.ai, and I'm excited to see what they do with this next, as well as what people do with it. For now, that is going to do it for today's AI Daily Brief. Appreciate you listening or watching, as always. And until next time, peace.
Host: Nathaniel Whittemore (NLW)
Date: May 12, 2026
In this episode of The AI Daily Brief, host Nathaniel Whittemore delves into the promising future of AI "interaction models"–a new class of AI designed for truly real-time, collaborative engagement with humans. With a primary focus on Thinking Machines Lab’s latest announcement, NLW explores how their breakthroughs could unlock more dynamic, context-aware, and proactive AI systems, shifting AI-human collaboration beyond today's turn-based, prompt-dependent models. The episode also covers major AI industry headlines, including OpenAI’s new consulting company, secondary market chaos in private AI shares, evolving U.S. AI regulation discourse, and President Trump’s tech-focused China delegation.
Real-world demonstrations:
“The model…can keep talking and listening while the background model works and then together be able to weave the results… into the conversation when appropriate.” —NLW (46:00)
New benchmarks for new behaviors (time awareness, visual cue triggers, visual counting):
Unlocking new paradigms:
TML’s philosophical bent: making humans ‘main characters’
Unlock Index analogy:
Opportunities: Meetings, education, training, persistent background assistance.
Imitation by incumbents expected:
Parallel moves by OpenAI:
“Given how fast things change…it’s a surprising day when you see something that actually feels like the beginnings of an entirely new category of opportunity. But I think that’s what this interaction model announcement feels like.” —NLW (57:40)
Casey Craig on financial abstractions:
“Brother, you are four layers of financial abstraction and broker crime away from touching an actual Anthropic share certificate.” (12:35)
Mira Murati, TML:
“We think the way we work with AI matters as much as how smart it is.” (36:50)
Soumith Chintala, TML:
“Increase human to AI bandwidth…Help humans continue as main characters in the new world.” (49:40)
Claire Birch, TML:
“The GUI moment is when the user no longer has to think like the computer or like the AI…In other words, it will let people stay fluent in the task rather than forcing us to become fluent in the tool.” (52:55)
@chriswrites:
“That’s way closer to a Her-level AI companion than the current prompt in response out AI voice models.” (51:25)
NLW on paradigm shift:
“It’s a surprising day when you see something that actually feels like the beginnings of an entirely new category of opportunity. But I think that’s what this interaction model announcement feels like.” (57:40)
Nathaniel Whittemore’s episode blends news analysis with a thought-provoking look at the next leap in human-AI partnership: persistent, real-time, and proactive collaboration models. The Thinking Machines Lab’s “interaction models” appear to represent a true paradigm shift, raising the bar for flexibility, context-awareness, and the user experience in AI systems. While the future may see these features quickly adopted by bigger labs, NLW argues convincingly that this week marks a bona fide “GUI moment” for AI.
For further information or to see demo videos, visit Thinking Machines Lab.