B
Welcome to the Last Week in AI podcast, where you can hear us chat about what's going on with AI. As usual, in this episode we will summarize and discuss some of last week's most interesting AI news. You can head on over to lastweekinai.com for the links to all these stories. You can also go to the episode description for the timestamps and so on. I am one of your regular hosts, Andrey Kurenkov. I studied AI in grad school and now work on it at a startup. And once again this week Jeremy is busy. Unfortunately Jeremy has been very busy lately, so he's not been around. But I once again have a great co-host with me, Michelle Lee.
A
Hey everyone, I am Michelle Lee, your guest host for the week. I went to grad school with Andrey, also studying AI, and now I am the founder and CEO of Medra, which is a physical AI startup based in San Francisco.
B
Right. And we can kind of do a quick bit of news on that. You just announced your big launch milestone for the company, so go ahead and feel free to let people know more about Medra for a bit.
A
Medra AI. We're a physical AI company for life sciences. We're building the physical AI infrastructure that powers the scientific frontier. Our physical AI platforms can do lab work inside life science companies and generate a lot of experimental data, which in turn can help train frontier and foundation models in the sciences and also help our partners find cures to diseases faster.
B
Yeah, so working with a bunch of robot arms; I just saw them recently doing a lot of pipetting and whatnot. I don't really know the details. Somewhat in that vein, we'll have in this episode quite a few stories about robotics. Actually there's a lot going on with humanoids, a lot going on with self-driving cars, even with foundation models. So you can look forward to some discussion on that front. Before we get into it, real quick, I do want to acknowledge for regular listeners that the output has been inconsistent lately. As I've said, Jeremy has been busy with work and whatever else he's up to. So as always, I promise to try and make it more consistent, but just bear with us, please. Let's go ahead and start with the news in tools and apps. The first story is OpenAI upgrades Codex with a new version of GPT-5. So we now have GPT-5-Codex, which is just GPT-5 but better for coding, is what it sounds like. It is now available if you're using the Codex CLI or the Codex IDE tool; you can switch from using regular GPT-5 to GPT-5-Codex, and if you're using the web agent, it's now powered by GPT-5-Codex as well. So pretty significant news, given that it looks like OpenAI is trying to catch up to Anthropic and be a competitor to Claude Code, where they're a little bit behind is my impression.
A
Yeah, I think right now, definitely talking to software developers, the general consensus is that Claude Code is still the best tool out there. So it's very interesting to see OpenAI release new and better tools to make it more powerful for coding tasks.
B
Yeah, and people have been a little angry at Anthropic lately due to infra issues. So from a business strategy perspective, I think OpenAI has a real opportunity to get some converts. And if you go look on Reddit or Twitter, there's a bit of sentiment of like, oh, I've switched to Codex now, it's great, I'm trying it out. I'm not a convert yet, but I might become one. Next up we have Google injects Gemini into Chrome as AI browsers go mainstream. So pretty much what it sounds like: they now have a version of Chrome where on the top right there's a little Gemini button. You click on it and you can ask questions about the tab, talk to Gemini, potentially later moving on to more agentic tasks. Very much in line with what we've been seeing from Perplexity and from The Browser Company, like integrating chatbots into the browser. Also Anthropic recently had their Claude Chrome extension, so it seems like it was just a matter of time till this happened. Actually it took Google a bit long to do this, if anything, but it's definitely taking us towards a future where you just have a chatbot in literally every single piece of software you ever use.
A
Yeah, I wonder if it took them a while because of all the competitiveness issues that Google is facing with Chrome, because this definitely gives them a huge competitive advantage: they own Chrome, one of the most popular browsers, and now they can also integrate it directly with their AI.
B
Yeah, yeah, right. Perplexity did try to buy Chrome or make a bid for it, so I guess that tracks.
A
Yeah. Have you tried using any of these browser based AI?
B
Actually, I did use ChatGPT agent a little bit. So ChatGPT agent isn't a browser plugin or anything, but it does browse the web for you and do stuff, and I found it to be pretty powerful, doing things that you could not do otherwise. It can go and open your Google Doc and click on links and keep going for like half an hour, which is pretty impressive. So I could see these being an even easier way to automate stuff you do, via prompt instead of anything else.
A
Yeah, that's interesting. I tried Dia for a little bit and just didn't find it smooth enough, really, or didn't find that it brought enough value. But I'd be interested in checking out Gemini directly in Chrome.
B
And next we go to Anthropic. They have a new feature in Claude: it can now make you spreadsheets or PDFs, which I think is actually pretty differentiated. Like, I don't know that ChatGPT or others can make those. It seemingly can do PowerPoints too.
A
I don't know about spreadsheets, but because OpenAI has such a strong collaboration with Microsoft, I believe they were able to roll out a lot of features with Microsoft 365 pretty early.
B
Oh nice. Well, in Claude now there's an experimental feature called upgraded file creation and analysis. It sounds like they might be running a little Claude Code agent within it, so that if you upload a file it can do agentic stuff to it. So yeah, if you are working with spreadsheets or PowerPoints or PDFs, this should really make Claude more powerful for that. And just one last story in the section. We've got a new video model. This one is from Luma, and it is their Ray3 model, what they are saying is an AI reasoning video model, which is kind of interesting. They say it's using reasoning power to create AI video clips with more complex action sequences. I don't know if that means it interleaves video creation with reasoning, but it's a pretty steady kind of progression in video creation for the last year or two. Now you're able to get clips in 20 seconds and upres them if you want. As with any of these video models, you really have to go and look at the previews to see the improved clarity, prompt adherence, all those kinds of things.
A
Yeah, interesting that it calls itself its first reasoning video model because I highly doubt that all the other video models don't use reasoning at all.
B
Yeah, it's hard to know to what extent this is kind of marketing speak and to what extent this is architectural or other things like that. This is coming, I guess, with Google having released Veo 3, I don't know how long ago, but not too long ago, and it being very impressive and very powerful. So it's getting increasingly competitive. Definitely on to applications and business, and as usual, or I guess as is often the case, we begin with OpenAI having some very businessy kinds of updates. So for the past year or something like that, they've been trying to go for-profit; as we've covered over many months, they have had many legal struggles, and now there's a bit of an update on it. Apparently OpenAI secured Microsoft's blessing for the transition to the for-profit. So they now have this memorandum of understanding, a kind of unofficial agreement, so to speak, where they have terms that they are agreeing upon, where they will retain some sort of relationship, but it'll be not quite as exclusive as what OpenAI and Microsoft have had, I guess, prior to 2025. We've seen them become a little more antagonistic over time as OpenAI has tried to transition to for-profit. Let's see, is there anything else to say here?
A
No, it's just a memorandum of understanding.
B
I don't know any details here.
A
Yeah, it just sounds like some interesting updates, maybe, maybe to help with fundraising, maybe to just produce some more news.
B
Yeah, there's not really many details here. All we know is apparently this is ending months of negotiation, and this was stated in a joint statement. So presumably behind the scenes this involved a lot of back and forth, and it is kind of a significant update for OpenAI, because they are under the gun to do this for-profit transition. They announced wanting to do it early this year, and they still haven't done it. You know, they're in a tough spot, and if they don't complete this transition, they're in real trouble. And related to that, we have the next story. Microsoft is going to apparently lessen its reliance on OpenAI by buying AI from Anthropic. So they are going to integrate Anthropic into their Office 365 applications. Presumably it's going to be kind of a way to pick your models as you use AI. And at least according to this article, and presumably like reasonable speculation, this is related to whatever tensions currently exist between Microsoft and OpenAI.
A
Well, maybe this explains why the new Claude models can now work with Office 365 applications. And it looks like OpenAI is also working to reduce their dependency on Microsoft by working on AI chips and with other cloud providers. So it sounds like both parties are trying to lessen their reliance and lessen their partnership.
B
Yeah, that's right. OpenAI did just sign a massive contract with Oracle, which caused Oracle's stock to jump quite a bit. As we've covered a lot on this podcast, if you're into business drama, OpenAI is a never-ending fountain of business drama and sort of interesting developments, and this is just the latest of that. Moving on to slightly less, let's say, boring businessy news, we've got some stories on robotics. First up we have Figure AI. They are passing $1 billion in committed capital in their Series C funding round, which would make their post-money valuation $39 billion. Figure is one of the several humanoid robotics startups that are fairly new. I forget how old Figure is, but they must be from 2023-ish. Obviously pre-revenue; they're still more or less an R&D lab at this point. So pretty cool to see the venture funds still being committed to funding these very ambitious humanoid robotics bets that are seemingly making a lot of progress, from what I can see. And maybe, Michelle, you have a take on this?
A
Yeah, I mean, we've been seeing really exciting new models come out of Physical Intelligence; Dyna Robotics just launched with their new fundraise. So very exciting to see more funding going into robotics, and also very exciting to see especially more and more efforts on hardware, which has very much been a bottleneck in robotics right now. How do we actually get better hands, better humanoid robots? Six years ago, if you wanted a humanoid robot to do research, you would have to be in a select few universities around the world that actually had access to humanoid robots. And now we have several humanoid companies all trying to build better hardware. So it's very exciting, which I guess also leads to the next news.
B
Exactly. Yeah. The next news is about China's Unitree, which is already planning an IPO, apparently. They are saying that the company might be valued at up to $7 billion. Unitree, if you haven't seen lately, has been big in humanoids; they recently unveiled this kind of mini humanoid that is quite affordable and quite capable. And I believe they have also been pretty active in the quadruped robot dog space, where China has been killing it for quite a while now. They have even been profitable since 2020, actually, with revenues now exceeding like $140 million. So China is, especially on the robotics front, quite competitive with the frontier of AI. But there is a question, I think, on the software AI side, where it's still very tough.
A
Yeah, I mean, it's very cool to see China focusing more and more on humanoids and on robotics in general. I heard that there are just dozens of humanoid companies, not just Unitree, that have been founded in China, and this is great for the robotics industry. As more companies are building hardware, the cost of this general purpose hardware keeps going down. And with Unitree's new humanoid robot, which is very affordable, costing around the same price as a robotic arm, most labs in the US and at universities can now very easily afford their own humanoid, which again was just not true several years ago.
B
Right. Apparently it's what, $16,000 for this Unitree G1 robot, which is actually on the lower end for robotic arms, from what I've heard. So very cool. On to another type of robotics: robotaxis, also a very hot area this year. First up, we've got a couple of stories about Tesla. First of all, Tesla's Robotaxi is planning to test in Nevada. They now have a testing permit from Nevada's Department of Motor Vehicles. On the slightly less positive side, there was reporting from Electrek that there have already been three Robotaxi accidents, with at least one injury reported as well. And this is from the Robotaxi fleet in Austin, which is estimated to have about 12 vehicles, so still at a very small scale, with safety drivers. Apparently the NHTSA is investigating Tesla for potentially misreporting the crash data. If that is true, it would not be a good sign for them in trying to compete with Waymo, which has a stellar record, from what I know at least.
A
Yeah, it's very tricky for these companies when they try to avoid or try to hide these accidents, because that was really what got Cruise in trouble in San Francisco: after an accident they tried to hide information, and that's one of the main reasons why Cruise was no longer able to operate in San Francisco. So I hope Tesla is able to be honest and report the accidents correctly so that they can continue building trust with government officials.
B
Yeah, for sure. And Robotaxi does seem pretty capable. I'm a major Waymo user. I don't know how often you use it, Michelle, but I'd be looking forward to trying Robotaxi whenever it comes here.
A
I mean I love Waymos and it really truly feels like magic. And so very excited to see more and more self driving cars in the streets.
B
And on that note, one more story about robotaxis. Next up we have Amazon's Zoox jumps into US robotaxi race with Las Vegas launch. So they now offer a public robotaxi service on the Las Vegas Strip. Apparently they're offering free rides from select locations, with plans to expand citywide. So a pretty small test, as you might expect, I suppose, with just the initial set of testing from Zoox. They are using their very futuristic model of car where you don't have a steering wheel; it's like a tiny, kind of bus-looking thing where you have all the seats facing inward. It looks great. I would love to try it.
A
I've been seeing people, probably employees and testers in San Francisco riding them. It looks so cool because you're facing each other so you can actually have meetings while you're in the cars, which is very cool.
B
Yeah. And Zoox, by the way, for those who don't know, was acquired by Amazon back in 2020. They've been working on this since 2014. So even though Zoox hasn't deployed to the extent that Tesla or Waymo have, or demonstrated as much, given their backing and given that they've been at this for a long time, I think they still have a chance to really grow rapidly if this turns out to go well.
A
Yeah. And how fun to have it start in Las Vegas.
B
I know, yeah, I should go try it out. And just two more stories in the section, with more funding news. We've got Replit hitting a $3 billion valuation with $150 million in annualized revenue. That's after they raised $250 million in a new funding round. So Replit, one of the key winners of the vibe coding era, I suppose, that started this year, seems to be growing very rapidly in terms of their revenue and, unsurprisingly, also getting some impressive fundraising as a result.
A
Replit definitely makes it really easy for people to get started on coding and building their own projects. And they have definitely done a great job at leveraging all the new AI coding tools and integrating them with their platform.
B
Right. And as a result, and this kind of jumped out at me, apparently the revenue went from $2.8 million annualized to $150 million in less than a year. And this company has been around since 2016. So Replit has been active for a long time as a sort of dev tool for coders, but now, having made it usable for non-professionals, they're rocketing upward.
A
Well, honestly, I am surprised they were able to raise money with only $2.8 million in ARR previously and grow this big. But yeah, very exciting. We're definitely seeing a lot of AI tools now being able to go to 100, 150, 200 million in revenue in a very short amount of time. So very exciting.
B
And the last story, also on fundraising: Perplexity, primarily a search tool, though now they're trying to expand into agents and browsers and so on. They have reportedly raised $200 million at a $20 billion valuation. And this is just two months after they raised $100 million at an $18 billion valuation. One of the very fun things with this podcast is that in AI, people just fundraise constantly; every few months these companies are getting billions of dollars if they can. And that is certainly true in this case.
A
Yeah. Also, their ARR just hit $200 million, up from the $150 million reported last month. So they're also growing quite a lot in revenue as well.
B
And on to the projects and open source section; just a couple of things here. The first one is K2-Think, a parameter-efficient reasoning model. So this is a research paper plus an open source model coming from the Institute of Foundation Models at the Mohamed bin Zayed University of Artificial Intelligence in the UAE, which I don't think we've covered before, which is interesting. They took an existing model, Qwen 2.5 32B, as their base model and then put it through all the typical reasoning training. So they had some fine-tuning, some reinforcement learning, all the tricks. They also have best-of-N sampling and some stuff on the inference side packaged in here, and as a result they get a 32 billion parameter model which is seemingly performing very impressively, according to at least their math results. They are performing better than DeepSeek R1, DeepSeek V3.1, and GPT-OSS at a relatively small number of total parameters. Now, this is a little bit unfair, because they're not comparing total active parameters, they're comparing total parameters. But nonetheless, I think it's very cool to see even better open source models on the reasoning side, and pretty impressive to see a university publishing this kind of stuff.
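For listeners curious what best-of-N sampling actually means mechanically, here's a minimal sketch. The `generate` function and the numeric scorer are illustrative stand-ins, not K2-Think's actual pipeline; in the real system the candidates come from the language model and the scorer is a verifier or reward model.

```python
import random

def generate(prompt):
    # Stand-in for sampling one completion from a language model.
    # A real implementation would call the model with temperature > 0.
    return {"text": f"candidate for: {prompt}", "score": random.random()}

def best_of_n(prompt, n=8, scorer=lambda c: c["score"]):
    """Sample n candidate completions and keep the highest-scoring one.

    In practice the scorer is a verifier or reward model; here it is
    just a placeholder numeric score attached to each candidate.
    """
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=scorer)

best = best_of_n("Solve: 2x + 3 = 11", n=4)
print(best["text"])
```

The point is that quality comes from spending more inference-time compute (n samples plus scoring) rather than from more parameters.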
A
Yeah, and it's also interesting that their way of getting to better results isn't just more parameters; it's actually thinking about test-time scaling, using plan-before-you-answer prompt restructuring. We're seeing this test-time compute, rethinking prompts, really trying to think of it almost as having different agents think through different prompts and surfacing the best ideas, come up as one of the best ways to improve reasoning now. And I think even in large foundation models, test-time compute and improving the prompts is so key to getting better model performance.
B
And just one more open source story here. The next one is a benchmark, not a model. The paper that came out is called LoCoBench, a benchmark for long-context large language models in software engineering. So I assume that's a long-context software engineering benchmark. The basic point it makes is that the existing software engineering benchmarks we have, like SWE-bench and so on, typically deal with GitHub issues and are therefore pretty localized. So you might be working in a code base, but the total amount of work, the total number of files you need to look at, the total amount of code you need to look at, is relatively minor. And as a result, the benchmarks don't necessarily correlate too deeply with the performance you get when you actually try to use these models via Claude Code or via Codex or via any of these tools. So this paper introduces a whole bunch of tasks. They have eight categories of long-context tasks: architectural understanding, cross-file refactoring, feature implementation, bug investigation, et cetera, et cetera. They have like a thousand of each of these eight categories, at different difficulty levels in terms of the length in tokens, I think. So on the low side you have what you typically see in the existing benchmarks, 10K to 100K tokens, but then you scale up to 10x, 50x, 100x those kinds of context lengths for their hardest level. And as you might expect, compared to the easier, or let's say shorter, existing coding benchmarks, existing systems aren't able to solve these things. SWE-bench, I think, is now at like 90%; we're like saturating it. With these tasks, the existing models are nowhere near able to fully resolve them, and there's quite a hierarchy in terms of their capabilities as well.
A
Yeah, I think it's great that benchmarks are becoming more and more realistic. That's always so important, because when the benchmarks aren't realistic, we end up building what we can measure. And 10K tokens is not at all realistic for the type of coding tasks that people do every day. Even for simple things, 10K is not enough if you're trying to work with multiple files and refactor. With the context window, a lot of people are now doing a lot of engineering tricks to remember what's happening so they don't have to use up the whole context window. But it's great if we can start measuring how these models work with longer and longer contexts.
B
They also introduce some kind of interesting metrics. They have a total of eight software engineering excellence metrics: Architectural Coherence Score, Dependency Traversal Accuracy, Cross-File Reasoning Depth, System Thinking Score, Robustness Score, Comprehensiveness Score, Innovation Score, and Solution Elegance Score, all based on, I guess, previous research that suggested variations of these, or at least the last few that deal more with code quality. So overall it seems like a very thoughtful effort to make a very useful benchmark that tracks actual software engineering quality.
A
Yeah, hopefully this just means these models can keep improving on more realistic tasks.
B
And speaking of continuing to improve, on to the next section, research and advancements. The first story is self-improving embodied foundation models, and this is coming from Google DeepMind in collaboration with Generalist, which I don't think I'm aware of.
A
Oh yeah, Generalist is a robotics company that came out of DeepMind.
B
Oh, well, there you go. That makes a lot of sense. So in this collaboration they introduced a self-improving embodied foundation model. What that means is they begin with something like the RT-2 model that came out of DeepMind, where they take a whole bunch of video, a whole bunch of rollouts of robotics, and train a robotics foundation model, in the sense that you're able to get a robot arm, in this case, to technically do anything. So give it some text and it'll try to execute a policy to do whatever you want. The self-improving part here is that after you do the pre-training, in stage two you can do online self-improvement with on-policy rollouts of a robot. So you have, ideally, one person or maybe two people supervising actual robots, and in these little cages they have something that is able to evaluate success criteria on whatever tasks they're working on. As a result, you're basically able to generate a continuous stream of success and failure rollouts and, at least in the ideal case, create a larger data set to then train on. And yeah, they implement this with real hardware and show that you're able to get quite a significant improvement on some of these: Language Table, ALOHA single insertion, real-to-sim Language Table, all these different evaluations of robotic, let's say, arm-based tasks.
A
This is very smart, because in robotics one of the biggest problems is just that we don't have enough data. You want to do imitation learning and behavior cloning? Great. Now you have to collect lots of data, either with VR headsets or using ALOHA to teleoperate the robot. Having the self-improvement is basically almost like simplified reinforcement learning, without needing to do reinforcement learning fully, where you only get supervision from the rewards themselves. Now you can just predict the reward function, detect success, and use that to supervise and get more training data in order to scale up their models.
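The loop we just described can be sketched in a few lines. Everything here is a toy stand-in under stated assumptions: `ToyPolicy` fakes a robot policy with a single "skill" number, and `success_checker` fakes the automatic success detector; the real system runs policies on hardware and fine-tunes a neural network on the successful rollouts.

```python
import random

class ToyPolicy:
    """Toy stand-in for a robot policy; skill improves with training data."""
    def __init__(self):
        self.skill = 0.2

    def run(self, task):
        # A "rollout" here is just a quality draw; higher skill helps.
        return {"task": task, "quality": random.random() + self.skill}

    def finetune(self, successful_rollouts):
        # Behavior cloning on self-generated successes, crudely modeled
        # as a small skill bump per successful rollout seen.
        self.skill += 0.05 * len(successful_rollouts)

def success_checker(rollout):
    # Stand-in for the automatic success evaluator in the robot cage.
    return rollout["quality"] > 1.0

def self_improvement_round(policy, tasks, dataset):
    # Stage two: on-policy rollouts, automatic success labels, then
    # fine-tuning on the accumulated successful rollouts.
    for task in tasks:
        rollout = policy.run(task)
        dataset.append((rollout, success_checker(rollout)))
    successes = [r for (r, ok) in dataset if ok]
    policy.finetune(successes)

policy, dataset = ToyPolicy(), []
for _ in range(3):
    self_improvement_round(policy, ["insert peg", "sort blocks"], dataset)
print(f"skill after 3 rounds: {policy.skill:.2f}, rollouts: {len(dataset)}")
```

The key design point is that the success label, not a human demonstration, supervises the new data, so data collection scales with robot time instead of teleoperator time.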
B
Yeah, in a way it's almost similar to what people are doing with reasoning models now: you pre-train your model, you then align it, and then you do a bunch of executions and actually do reinforcement learning on the language models with these verifiable rewards. This is kind of that in the robotics domain, which I suppose makes a lot of sense.
A
Yeah. The only difference is that with reasoning models you can start out fully self-supervised. Here you have to start out with imitation learning, and then, with enough data, you can improve it with its own self-supervision.
B
And the next research is also about a foundation model, also, I guess, a physics-related foundation model, although in this case it's not robotics; it's a physics foundation model. So the paper is Towards a Physics Foundation Model. I'm going to be honest, it's mostly going to go over my head, so I'm not going to be able to go deep in, but it looks pretty impressive. So they frame this as: there are existing physics models, like physics-informed neural networks, that can do various things like estimating thermal flows, solving obstacle flow, shear flow, these kinds of things. And they try to create a foundation model in the sense that it's one model that does a whole bunch of stuff, right? And the way they do that is they have this GPhyT model that is given a set of states, and the states are these kind of spatiotemporal patches containing, basically, state, right? So they have forces, fields, et cetera, and you basically just give it a prompt which is a sequence of states. And just from this sequence of states, the model is able to do these various kinds of physics-related operations, like thermal flows and so on. And they train it on a diverse 1.8 terabyte corpus of simulation data covering a wide range of physical systems, without explicit physics-describing features. So it seems pretty impressive. Again, I'm not too caught up on the physics simulation side of research, but pretty cool.
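To make the "prompt is a sequence of states" idea concrete, here is a rough sketch of the autoregressive interface such a model exposes. The `toy_model` dynamics (simple diffusion) and the window size are illustrative assumptions, not the paper's transformer architecture; the real model ingests spatiotemporal patches of fields.

```python
import numpy as np

def rollout(model, prompt_states, n_steps, window=4):
    """Autoregressively predict future physical states from a prompt.

    `prompt_states` is a list of 2D grids (the "sequence of states"
    prompt); `model` maps a stack of recent grids to the next grid.
    Both are illustrative stand-ins for the actual system.
    """
    states = list(prompt_states)
    for _ in range(n_steps):
        context = np.stack(states[-window:])  # conditioning window
        states.append(model(context))
    return states

def toy_model(context):
    # Stand-in dynamics: simple diffusion of the most recent state,
    # averaging each cell with its four neighbors (periodic boundary).
    last = context[-1]
    return 0.25 * (np.roll(last, 1, axis=0) + np.roll(last, -1, axis=0)
                   + np.roll(last, 1, axis=1) + np.roll(last, -1, axis=1))

# A 2-frame prompt of 8x8 temperature fields, rolled out 5 steps.
prompt = [np.random.rand(8, 8) for _ in range(2)]
trajectory = rollout(toy_model, prompt, n_steps=5)
print(len(trajectory), trajectory[-1].shape)
```

The interesting part is that which physics to apply is inferred from the prompt frames themselves, much like in-context learning in language models.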
A
It's pretty cool. But it seems like they train mostly on simulation data, so I am curious if they can generalize to real data.
B
Yeah, I guess that would be a key question. But they do compare to these specialized models, and apparently it outperforms these specialized architectures on unseen tasks and also generalizes to out-of-distribution problems. So I guess the hope is that you train on enough data, you train on enough varied data, and it's going to be able to do quite well, although I'm sure you're right that it needs to go beyond simulation to really be super reliable. Next we have yet another foundation model, I guess I just decided to make that kind of a theme, and also in robotics, but this time instead of arms it's about legs, or wheels, I suppose you could say. The paper is Embodied Navigation Foundation Model. And so navigation is one of these pretty base sorts of tasks that's been looked at in research over the past decade. It's kind of what it sounds like: the robot is given a goal place to go to, and it needs to make it there, usually by relying on vision. So you can give this to quadrupeds, you can give this to humanoids, robots on wheels, and it typically needs to navigate an apartment or some other space to be able to get there. And there's been quite a bit of research for about a decade on doing reinforcement learning, deep learning, all sorts of things like that. So here the researchers have developed NavFoM, which is a cross-task and cross-embodiment navigation foundation model. They have 8 million navigation samples from these different tasks and embodiments, where embodiments again can be quadrupeds, humanoids, robots on wheels, and so on. For all of these, if you just give it an egocentric video and language instructions, the model is then going to predict the trajectory that the agent should take to get you to wherever you wanna get. So a very useful type of model if you want, I guess, general purpose robotics, for instance.
A
I have to be honest, I feel like this is just publishing for the sake of publishing a foundation model, right? Like, we have pretty good models to do self-driving; that's why earlier in the episode we talked about several self-driving car company news items. And with these kinds of diversity-based foundation models, like, hey, we can do it on a humanoid, hey, we can do it on a car, hey, we can do it on robot wheels, oftentimes it's really about diversity. Because if you look at the benchmark, the performance is like 64.4%, which still feels quite low to actually be utilized in a real-world setting. So I wonder if for navigation it's still more important to build the models, probably big foundation models, necessary for navigation, but focusing on a specific type of platform rather than trying to go across different types of platforms.
B
Yeah, they do try to incorporate autonomous driving and UAV data here, which to your point probably isn't necessary; I think navigation benchmarks typically are more indoors-oriented. I guess the key benefit of trying to do this cross-embodiment stuff is trying to have something that generalizes, right? So they do say that they are taking in different camera view information and have different temporal context. Maybe if they focused a little bit, not dealing with cars and UAVs but more so just different types of embodied agents with different heights and different kinds of perspectives, I think that could probably be quite useful. Alrighty. Well, that's it for research; lots of foundation models. Next we go on to policy and safety. First up, we have something in our home state of California. Anthropic has endorsed California's AI safety bill, SB 53. So this is, I believe, the kind of follow-up version of regulation that was being discussed earlier, which was passed but then vetoed by the Governor of California. This is a tweaked version that took out some of the, let's say, more onerous requirements. And Anthropic explicitly endorsing it is a pretty significant sign that they think this is a good way to regulate for AI safety. And SB 53 is an AI safety bill that is meant to regulate companies, basically companies like Anthropic, working on advanced AI models that might contribute to risks such as biological weapons or cyber attacks. So, as with the previous version of this bill, it passing or not passing would be a pretty big deal. I'm sure OpenAI would not be very happy if it passes, but it probably has a better chance than its predecessor.
A
Anthropic seems to always be at the forefront of really arguing for more safety, but I am surprised that they are going after regulatory efforts to improve safety too, as it does mean there will be more requirements, legal requirements, for people innovating on models.
B
Yeah, according to this article, some policy experts are saying that this is a more restrained approach compared to previous AI safety bills, so at least according to them it seems to be the right way to do it. Anthropic has this quote in their blog post: "The question isn't whether we need AI governance, it's whether we develop it thoughtfully today or reactively tomorrow. SB 53 offers a solid path toward the former." So the basic point, according to Anthropic, is that this is a good way to do this kind of regulation. Next up, moving away from AI safety to copyright, another popular topic for legal battles. This time we have Warner Bros. suing Midjourney. So Warner Bros. is filing a lawsuit against Midjourney, accusing them of copyright violations related to characters like Superman, Batman, and Bugs Bunny. The complaint alleges that Midjourney has removed safeguards that previously prevented users from creating infringing videos, which has resulted in the unauthorized creation of Batman imagery and so on. The team in charge here has also filed lawsuits against Midjourney on behalf of Disney and Universal. So it sounds like more of what Midjourney is already facing.
A
Yeah, well, it does seem like Midjourney, compared to a lot of other image generation platforms, doesn't really have as many safeguards against intellectual property violations. But it's also interesting that all these companies are now jumping in and dogpiling on Midjourney.
B
Yeah, I think it's because it's a pretty straightforward thing to do. And as with the previous lawsuits here, if you go and read the PDF, the actual complaint, it's kind of a fun one to read just because there are image attachments, so they have examples of Batman and Superman and Wonder Woman and Scooby-Doo and all these characters as images generated by Midjourney right there in the lawsuit, which is certainly fun to see. And let's just do one last story, since this is a bit of a shorter episode. The last one also deals with lawsuits and copyright, but now in the text domain. The company filing the lawsuit is Rolling Stone's publisher, Penske Media, and they are suing Google over AI Overview summaries. The lawsuit claims that Google's AI Overview panel displays summaries that discourage users from clicking through to the full articles, which impacts publishers' ad and subscription revenue. Similar to what Perplexity has been dealing with, I suppose; now Google is doing the same thing as Perplexity and giving you this kind of AI summary of a bunch of sources, and I guess it was just a matter of time until Google had to address this. There are details here that publishers like DMG Media and others have reported significant declines in click-through rates since the introduction of AI Overviews, and Pew Research found that users are less likely to click through to articles when AI summaries are present in search results. So not a trivial matter. I mean, this is kind of live or die for these kinds of publishers, right?
A
And I love how Google denies these claims, but if you actually ask Gemini whether AI Overviews result in less traffic, it contradicts Google's public stance and says yes, it does actually reduce traffic.
B
Right. And publishers are in a tough spot here because they need Google, right? They need to be indexed by Google, they need the traffic generated by Google. But on the other hand, Google is now cannibalizing that business, those clicks. So it's a tricky balance to strike. It's another interesting question about the legal and financial dynamics of an LLM-driven world; as with image generation, now with search, text, and publishing, all of this is somehow still not resolved.
A
I mean, look, this is very disruptive technology, so a lot of old business models are just going to be disrupted. And publishing has already been hurt very much by the Internet. So this is another wave of potentially less revenue and fewer clicks for these publishers, and I can see why they are trying to figure out a way to salvage the situation.
B
Well, we'll finish with that slightly sad note, although the robotics stuff hopefully made up for it. Thanks once again, Michelle, for guest hosting.
A
Yeah, it was fun. It was fun to talk about the latest AI news with you, Andrej. Thank you so much for inviting me.
B
Yeah, maybe we'll do it again. We'll see. And thank you also to the listeners, as usual, for tuning in. Apologies once again for not being very consistent; Last Week in AI is supposed to be every week, but sometimes it's not. Please do keep tuning in. Tune in when the AI news begins.
C
Break it down. Last week in AI, come and take a ride, get the lowdown on tech, and it's live. Last week in AI, come and take a ride. From the labs to the streets, AI's reaching high, new tech emergent, watching surgeons fly. From the labs to the streets, AI's reaching high, algorithms shaping up the future we see. Tune in, tune in, get the latest with ease. Last week in AI, come and take a ride, get the lowdown on tech and let it slide. From neural nets to robots, the headlines pop, data-driven dreams, they just don't stop. Every breakthrough, every code unwritten, on the edge of change, with excitement we're smitten. From machine learning marvels to coding kings, futures unfolding, see what it brings.
Date: September 23, 2025
Hosts: Andrey Kurenkov & Michelle Lee (guest co-host)
Theme: Weekly round-up of the most impactful AI news, including new tools, product launches, key business maneuvers, research breakthroughs, robotic advancements, and AI policy updates.
This episode covers a vibrant week in AI, from major updates in AI tools (like OpenAI Codex and Google Gemini), developments in both humanoid robotics and Robotaxi deployments, hot business news of fundraising and strategic deals, advances in open-source models and benchmarks, and policy and legal developments — including California’s newest AI bill and high-profile copyright lawsuits. Michelle Lee joins as a guest host while Jeremy is away.
The conversation is lively, slightly irreverent, and accessible while maintaining technical rigor. The co-hosts blend personal anecdotes (e.g., about using Waymo or robot arms in labs) with dry, sometimes wry commentary on AI industry drama and rapid progress. Technical explanations are interspersed with business and policy implications, always in plain language.
This episode is a rapid, balanced ride through the week’s AI news—capturing not just which headlines to know, but the why and the potential “what's next.” From tool launches and robotics hardware to open-source innovations, big-money deals, and emerging policy fights, #221 offers a deep-yet-digestible look at the evolving world of AI.