Transcript
A (0:00)
Today on the AI Daily Brief: are world models AI's next big thing? Before that in the headlines, how a voice marketplace shows how industries are going to end up collaborating with AI. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. Alright friends, quick notes before we dive in. First of all, thank you to today's sponsors, KPMG, Rovo, Robots and Pencils, and Blitzy. To get an ad-free version of the show, go to patreon.com/aidailybrief or subscribe on Apple Podcasts. If you are interested in sponsoring the show, send us a note at sponsors@aidailybrief.ai. In fact, you can head to aidailybrief.ai to learn anything about the show, including seeing any job opportunities, figuring out how to reach out if you want me to come speak, and of course finding a link to our AI ROI benchmarking study. It is live now. You can find out more about that at roisurvey.ai. Welcome back to the AI Daily Brief Headlines edition, all the daily AI news you need in around five minutes. Today we kick off with a story that is interesting on its own terms, but is also reflective of a broader pattern in how I think different industries are going to ultimately interact with AI. Which is of course to say that they are going to, rather than fight it, ultimately collaborate with it. We're talking, of course, about ElevenLabs launching a new voice marketplace that will allow users to access celebrity voices, including Michael Caine, Judy Garland, and John Wayne. The Iconic Voices Marketplace, that's what it's called, will allow companies to partner with ElevenLabs to license the celebrity voices for content and advertisements. So far, the list of 28 available voices is largely concentrated around historical figures and deceased entertainers. Mark Twain, Alan Turing, and Thomas Edison are available.
Although it's questionable how much people would recognize their voices, the mainstays of the list are deceased celebrities with iconic voices like Maya Angelou and Burt Reynolds. ElevenLabs suggested the service, which places them as the middleman in the licensing deals, addresses many of the critiques around AI content. They call their marketplace the, quote, consent-based, performer-first approach the industry has been calling for. One of the first living celebrities who lent his voice to the service is Michael Caine. In a statement, he said: for years I've lent my voice to stories that moved people, tales of courage, of wit, of the human spirit. Now I'm helping others find theirs. With ElevenLabs we can preserve and share voices, not just mine, but anyone's. He continued that the company is, quote, using innovation not to replace humanity but to celebrate it, and that it's, quote, not about replacing voices, it's about amplifying them. There is an aspect of legacy in his statement that maybe suggests he views this more as a legacy preservation technology as opposed to just a new revenue stream. Matthew McConaughey, while an investor in ElevenLabs, is going a little slower, allowing the company to translate his newsletter Lyrics of Livin' into a Spanish-language audio version that uses his voice. Between this new marketplace and the recent settlement between UMG and Udio, I think we might be at the beginning phases of the commercialization of IP for AI, as opposed to just outright fighting AI. I think the more examples they have of people who are actually still living and providing their own consent, the better. I think for a lot of folks, even though everything is legally on the up and up with the deceased celebrities, there's going to be a lingering feeling that they couldn't actually consent to this in life.
That may diminish people's enthusiasm. Moving over to markets, the big conversation yesterday was all about SoftBank dumping their Nvidia stock to fund their investments in OpenAI. In this week's quarterly results, SoftBank disclosed that they've sold all 32.1 million of their Nvidia shares for around $5.8 billion. The sale appears to confirm what was already pretty clear from recent reporting: that SoftBank is reaching deep into their pockets in order to fund their $30 billion commitment to OpenAI this year. The final $22.5 billion is due to be paid in December after OpenAI successfully converted to a for-profit corporation. SoftBank has issued several billion dollars in bonds and borrowed $5 billion against their Arm stock in order to fund the deal. Now, for some, this is another indication of the AI bubble bursting, although it's pretty clear that this is just SoftBank digging in the couch cushions to finance its commitments to OpenAI. It's also worth noting that SoftBank CEO Masayoshi Son is possibly the worst Nvidia trader on the planet. He owned a substantial stake in the company for many years, but sold it all in 2019, ultimately missing out on $100 billion in gains he would have had if he'd held on for just a few more years. SoftBank started buying small tranches of the stock again in 2020, but the bulk of their investment came in March of this year. Gavin Baker, the managing partner at Atreides Management, said someone should look into what happened after SoftBank sold their entire 4.9% stake in Nvidia back in 2019, although most analysts don't think Masa Son was really calling the top on the AI bubble. Nvidia stock was still down 3% on the day, and SoftBank itself took a beating, losing 10%. Staying on the theme of OpenAI and its big capital needs, the company's Project Stargate has received $3 billion in new investment from Blue Owl Capital. Blue Owl is one of the largest private capital firms in the US and has recently moved aggressively into data center development.
They announced a string of data center fundings over the past two months, including contributing $7 billion to a Meta facility in Louisiana. The firm now has over a thousand people working on designing, building, and operating data centers within a new division called Stack Infrastructure. The Stargate deal in this case relates to a data center being constructed in New Mexico in collaboration with Oracle. Blue Owl will contribute $3 billion in equity while a group of banks will put together $18 billion in debt funding. Meanwhile, AMD CEO Lisa Su said that her company could carve out a double-digit market share in data center chips over the next three to five years. Speaking at a company event on Tuesday, Su said that she anticipates average revenue growth of 35% over that period, holding the current pace. However, she expects to see the data center business grow at 60%, driven by, in her words, insatiable demand for AI chips. She said: this is what we see as our potential, given the customer traction both with the announced customers as well as customers that are currently working very closely with us now. Next year will serve as a litmus test for AMD as they attempt to take Nvidia on directly. AMD has released a handful of GPUs suitable for AI workloads over recent years, but so far none of them has captured a significant slice of the market. In recent financial reports, however, AMD has boasted of strong growth in their data center sales, but they haven't split out GPU sales from their CPUs. AMD will be pushing their own rack-scale deployments next year with the release of the MI400X chips. The servers will contain 72 chips, which is crucial to run the largest AI models. OpenAI recently committed to deploying a gigawatt of AMD's new chips, and the company also has long-term deals with Oracle and Meta. One interesting one that has some people scratching their heads: Meta AI has apparently seen a surprising surge in users over the past month.
According to Similarweb data, the Meta AI web app saw 105% growth in traffic between September and October. That made them by far the fastest growing AI web app for the month, easily outpacing Perplexity at 29% and Claude at 25% growth. Even on a full-year basis, Meta is doing surprisingly well. Now, Gemini is the breakout hit of the year with 305% traffic growth, but Meta is right there in second place with 149%. By comparison, traffic to chatgpt.com grew by 68% for the year. Now, there were two main explanations offered. Either this is huge growth off a tiny baseline, making it less impressive than it seems at first glance, or Meta's Sora competitor Vibes, launched in late September, is a sleeper hit. Traffic numbers don't seem to have been that low heading into October, and you can see a huge spike in app downloads that corroborates the success of Vibes. Still, for all of this, many are skeptical that it's a sustainable trend. Chevy wrote: who used Meta AI intentionally and for a specific purpose? I don't know, man. I think it's easy to be skeptical, but at the same time it's very possible to me that the terminally online AI X community might be in a huge bubble when it comes to understanding what normal people actually use and like. Is it possible that having a free and open version of Sora has really benefited Meta in ways that the hardcore AI community just isn't appreciating? Seems totally possible. Anyway, something to keep an eye on, but for now that is going to do it for today's headlines. Next up, the main episode. What if AI wasn't just a buzzword but a business imperative? On You Can with AI, we take you inside the boardrooms and strategy sessions of the world's most forward-thinking enterprises. Hosted by me, Nathaniel Whittemore, and powered by KPMG, this seven-part series delivers real-world insights from leaders who are scaling AI with purpose, from aligning culture and leadership to building trust, data readiness, and deploying AI agents.
Whether you're a C-suite executive, strategist, or innovator, this podcast is your front-row seat to the future of enterprise AI. So go check it out at www.kpmg.us/aipodcasts or search You Can with AI on Spotify, Apple Podcasts, or wherever you get your podcasts. Meet Rovo, your AI-powered teammate. Rovo unleashes the potential of your team with AI-powered search, chat, and agents, or build your own agent with Studio. Rovo is powered by your organization's knowledge and lives on Atlassian's trusted and secure platform, so it's always working in the context of your work. Connect Rovo to your favorite SaaS apps so no knowledge gets left behind. Rovo runs on the Teamwork Graph, Atlassian's intelligence layer that unifies data across all of your apps and delivers personalized AI insights from day one. Rovo is already built into Jira, Confluence, and Jira Service Management Standard, Premium, and Enterprise subscriptions. Know the feeling when AI turns from tool to teammate? If you Rovo, you know. Discover Rovo, your new AI teammate powered by Atlassian. Get started at rovo.com. AI changes fast. You need a partner built for the long game. Robots and Pencils works side by side with organizations to turn AI ambition into real human impact. As an AWS Certified Partner, they modernize infrastructure, design cloud-native systems, and apply AI to create business value. And their partnerships don't end at launch. As AI changes, Robots and Pencils stays by your side so you keep pace. The difference is close partnership that builds value and compounds over time. Plus, with delivery centers across the US, Canada, Europe, and Latin America, clients get local expertise and global scale. For AI that delivers progress, not promises, visit robotsandpencils.com/aidailybrief. This episode is brought to you by Blitzy, the enterprise autonomous software development platform with infinite code context.
Blitzy uses thousands of specialized AI agents that think for hours to understand enterprise-scale codebases with millions of lines of code. Enterprise engineering leaders start every development sprint with the Blitzy platform, bringing in their development requirements. The Blitzy platform provides a plan, then generates and pre-compiles code for each task. Blitzy delivers 80%-plus of the development work autonomously while providing a guide for the final 20% of human development work required to complete the sprint. Public companies are achieving a 5x engineering velocity increase when incorporating Blitzy as their pre-IDE development tool, pairing it with their coding copilot of choice. To bring an AI-native SDLC into their org, visit blitzy.com and press Get a Demo to learn how Blitzy transforms your SDLC from AI-assisted to AI-native. Welcome back to the AI Daily Brief. Today we have one of those interesting times when two pieces of seemingly unrelated news might actually be telling part of the same story. On the one hand, we have Meta's chief AI scientist Yann LeCun reportedly leaving Meta and launching his own company. And on the other hand, we have a new essay from Dr. Fei-Fei Li, all about world models and spatial intelligence. At the root of both of these stories is a question about whether the right path to the next generation of advanced AI is via large language models, or if indeed there is a different approach. Let's talk first, though, about Yann LeCun leaving Meta. This is something of the end of an era. LeCun has served as Meta's chief AI scientist since 2013. In other words, long before anyone was trying to catch up to ChatGPT, even before there was an OpenAI bombing around the Valley thinking about the future of artificial intelligence, LeCun drove the development of Meta's early Llama models through his leadership at Meta's Fundamental AI Research, or FAIR, lab.
LeCun is a highly decorated AI scientist, having won the Turing Award in 2018 for pioneering work done in the 1990s and 2000s. And yet for many, the writing was on the wall as soon as Zuckerberg began his hiring spree over the summer to establish Meta's superintelligence division and TBD Lab, and especially once he bought a big chunk of Scale AI in order to recruit 28-year-old Alexandr Wang to run the new initiative. Now, in all these moves, LeCun's FAIR lab did continue to exist, but it was rolled up within the new division led by Wang. As part of the move, Wang became the chief AI officer while former OpenAI lead scientist Shengjia Zhao became chief scientist for Superintelligence Labs. At the time, LeCun reaffirmed his role at the company publicly, writing: my role as chief scientist for FAIR has always been focused on long-term AI research and building the next AI paradigms. My role and FAIR's mission are unchanged. In other words, the party line at the time was that LeCun had always been concerned with longer-term building rather than immediate-term initiatives, and this didn't change that. And yet rumors swirled of resources and personnel being drained from FAIR, as well as a broader shift away from pure research towards AI that could be commercialized. For some commentators, this is just the latest in a string of personnel issues surrounding Meta's AI. Deedy Das of Menlo Ventures writes: Meta's AI org is in disarray. First Soumith Chintala, the creator of PyTorch, leaves; now Yann LeCun, their AI head, leaves. They have $600 billion in compute commits until 2028, but I guess it's up to Alex Wang and Nat Friedman to deliver. Computer science professor Pedro Domingos noted that the news wiped $30 billion off of Meta's market cap, approximately twice what they paid to get Alexandr Wang.
Andrew Orlowski of the Daily Telegraph posted: Zuck basically hired Jian-Yang, the hot dog, not hot dog guy from Silicon Valley, and made LeCun report to him. I'm surprised he took so long to bail, but very underreported: Zuck hasn't a clue what he's doing. That was far from the only take, though. For some, this still feels like the natural fallout of adjustments in personnel when the stakes are this large. Jordan Thibodeau, formerly of Google and Slack, responded to Deedy Das, saying: bro, come on, you've been around the block. Anytime a regime change happens, reorgs and exits happen. You gotta give the story time to bake before jumping to conclusions. Of course the old regime is leaving. They did well during the status quo, but now it's all hands on deck and Facebook is under wartime, and not many in the AI community are up for that. Others basically just thought that it was probably time to rip off the Band-Aid. LeCun has not only been away from day-to-day duties for a while, he's been vocally against LLMs as a stepping stone towards AGI. You might remember that big Wall Street Journal piece from about a year ago where he very famously said that current AIs were dumber than a cat. Brasserx writes: Yann LeCun leaving Meta is significant and probably overdue. He's a foundational figure in AI, but his research-first mindset often put Meta out of sync with the pace of the current landscape. While competitors pushed aggressively towards large-scale, product-ready models, Meta spent years debating theory with LeCun. Moving on, Meta now has room to align its AI strategy with reality rather than philosophy. Less nostalgia, more execution. Others thought, honestly, that as smart as LeCun is, he just hasn't been the asset Meta needed him to be when it came to competing in the AI race. John Hernandez writes: we all saw this coming, didn't we? First, if you're a big name in AI, anything you do will raise several billion overnight. Hard to get that much money on a salary.
Second, if you are a legend and they make you report to a kid that could be your grandson, no matter how good he is, you won't feel appreciated. But truth be told, he hasn't helped Meta much in the AI race. So it's a win-win. Jeffrey Emanuel writes: Yann LeCun is better off working in a Bell Labs or Xerox PARC setting where things are measured in decades and there's no expectation or pressure to deliver anything commercially useful in the short or even medium term. Meta is way past that now, given their AI capital spending. The point that Jeffrey's making is that Zuckerberg is betting the farm on a huge infrastructure build-out, and that's going to force them to live in the real world of what they can deliver right now. Emanuel continues: I get the sense that he doesn't care enough about winning in the marketplace or about products to make a compelling startup now, given the intense level of competition. Also, LLMs are the tech we have now and he doesn't believe in them long term. Startups need to move fast. I think John's note, however, that if you have one of those big names you can raise a ton of money very quickly, is well taken. A cynical take on this is that by starting his own new lab, LeCun is basically locking in a $2 or $3 billion hiring bonus for when that lab eventually gets bought by Google DeepMind. And that might especially be the case if his interest in world models really starts to bear fruit. Now, even within Meta's FAIR lab, LeCun and his team have taken some steps towards working on that, but the Financial Times piece suggests that he's going to go much farther with this new startup effort. They write: within FAIR, LeCun has instead focused on developing an entirely new generation of AI systems that he hopes will power machines with human-level intelligence, known as world models. These systems aim to understand the physical world by learning from videos and spatial data rather than just language.
LeCun has said it could take a decade to fully develop the architecture. LeCun's next endeavor is focused on furthering his work on world models, according to two people familiar with the matter. Which brings us to our second companion story, which is not so much a story as an essay and accompanying post from Fei-Fei Li on X. She writes: AI's next frontier is spatial intelligence, a technology that will turn seeing into reasoning, perception into action, and imagination into creation. The essay she released is called From Words to Worlds: Spatial Intelligence Is AI's Next Frontier. In it, she calls large language models wordsmiths in the dark, eloquent but inexperienced, knowledgeable but ungrounded. Instead, she says, quote, spatial intelligence will transform how we create and interact with real and virtual worlds, revolutionizing storytelling, creativity, robotics, scientific discovery, and beyond. This, she says, is AI's next frontier. So what does she mean by spatial intelligence? Now, first of all, it should be noted that whereas LeCun is rather dismissive of LLMs, Fei-Fei Li is less so. She writes: it's no longer a question of whether AI will change the world; by any reasonable definition, it already has. Yet, she points out, many of the big dreams and visions that we have for AI lie outside of our reach. For example, she says, the dream of massively accelerated research in fields like curing disease, new material discovery, and particle physics remains largely unfulfilled. And the promise of AI that truly understands and empowers human creators remains beyond reach. She writes: to learn why these capabilities remain elusive, we need to examine how spatial intelligence evolved and how it shapes our understanding of the world. Going back to the history of evolution, she suggests that long before we could communicate with language, the, quote, simple act of sensing quietly sparked an evolutionary journey toward intelligence.
She continues: the seemingly isolated ability to glean information from the external world, whether a glimmer of light or the feeling of texture, created a bridge between perception and survival that only grew stronger and more elaborate as the generations passed. Layer upon layer of neurons grew from that bridge, forming nervous systems that interpret the world and coordinate interactions between an organism and its surroundings. Thus, many scientists have conjectured that perception and action became the core loop driving the evolution of intelligence, and the foundation on which nature created our species. She goes on to point out all of the various ways in which spatial intelligence impacts our everyday lives, noting that it's not just functional but also at the root of our creativity. Ultimately, she writes: spatial intelligence is the scaffolding upon which our cognition is built. It's at work when we passively observe or actively seek to create. It drives our reasoning and planning, even on the most abstract topics. And it's essential to the way we interact, verbally or physically, with our peers or with the environment itself. Unfortunately, today's AI doesn't think like this yet. Now, you might be thinking: what, then, about multimodal LLMs? She writes that while they've made some progress, there are still real limitations. Multimodal LLMs, she writes, trained with voluminous multimedia data in addition to textual data, have introduced some basics of spatial awareness. Today's AI can analyze pictures, answer questions about them, and generate hyper-realistic images and short videos. Yet the candid truth is that AI's spatial capabilities remain far from human level, and the limits reveal themselves quickly. State-of-the-art MLLMs rarely perform better than chance on estimating distance, orientation, and size, or on mentally rotating objects by regenerating them from new angles. They can't navigate mazes, recognize shortcuts, or predict basic physics.
AI-generated videos, nascent and, yes, very cool, often lose coherence after a few seconds. And ultimately, while this doesn't make LLMs not useful for the use cases that they're useful for, Li argues that there is a whole world of use cases that are just outside of AI's capabilities. She then provides three essential capabilities that will define world models. The first is generative: world models can generate worlds with perceptual, geometrical, and physical consistency, she writes. World models must be capable of spawning endlessly varied and diverse simulated worlds that follow semantic or perceptual instructions while remaining geometrically, physically, and dynamically consistent. Next is multimodal: world models are multimodal by design. Just as animals and humans do, she writes, a world model should be able to process inputs, known as prompts in the generative AI realm, in a wide range of forms. Given partial information, whether images, videos, depth maps, text instructions, gestures, or actions, world models should predict or generate world states as complete as possible. Third and finally is interactive: world models can output the next states based on input actions. Finally, she says, if actions and/or goals are part of the prompts to a world model, its outputs must include the next state of the world, represented either implicitly or explicitly. Together, she says, the scope of this challenge exceeds anything AI has faced before. While language is a purely generative phenomenon of human cognition, worlds play by much more complex rules. Here on Earth, gravity governs motion, atomic structures determine how light produces colors and brightness, and countless physical laws constrain every interaction. Even the most fanciful creative worlds are composed of spatial objects and agents that obey the physical laws and dynamic behaviors that define them.
Reconciling all of this consistently, the semantic, the geometric, the dynamic, and the physical, demands entirely new approaches. The dimensionality of representing a world is vastly more complex than that of a one-dimensional sequential signal like language, which of course creates a whole set of new challenges. Some of the research topics at her World Labs include developing a new universal task function for training, figuring out how to extract deeper spatial information from two-dimensional image or film-based training data, and creating new model architectures and representational learning. The payoffs, however, she argues, are immense. In addition to new superpowers around creativity and creating new types of immersive gaming or visual experiences, there are the implications for robotics, which she calls embodied intelligence in action. Even beyond those things, though, she argues that the real unlock for many use cases in science, healthcare, and education will come from this sort of spatial intelligence. For example, she writes: in healthcare, spatial intelligence will reshape everything from laboratory to bedside. AI can accelerate drug discovery by modeling molecular interactions in multiple dimensions, enhance diagnostics by helping radiologists spot patterns in medical imaging, and enable ambient monitoring systems that support patients and caregivers without replacing the human connection that healing requires. I think the point of all of this is just a reminder that as locked in as we are to this current paradigm of LLMs, there are other paths to advanced AI. And while some will view Yann LeCun's departure as the inevitable byproduct of personnel changes, I think Dr. Li's essay reminds us that there are reasons someone of LeCun's stature would want to go work on something different. If he does start a new world-model-focused lab and gets billions of dollars, frankly, I think relative to a lot of the things that we're spending AI VC money on, we could do a lot worse.
For now, that's going to do it for today's AI Daily Brief. Appreciate you listening or watching, as always. And until next time, peace.
