
Hosted by Ravid Shwartz-Ziv & Allen Roush · EN

We host Tal Linzen, Associate Professor at NYU and Research Scientist at Google, for a conversation on the intersection of cognitive science and large language models.
We discussed why children can learn language from around 100 million words while LLMs need trillions, and the surprising finding that as models get better at predicting the next word, they become worse models of how humans actually process language. Tal walked us through how his lab uses eye-tracking and reading-time data to compare model behavior to human behavior, and what that reveals about prediction, working memory, and the limits of current architectures.
We also got into nature versus nurture and how inductive biases can be instilled by pre-training on synthetic languages, world models and whether transformers actually use the geometric structure they encode, the BabyLM challenge and data-efficient language learning, and what mechanistic interpretability can offer cognitive science beyond just fixing model bugs. The conversation closed on academia versus industry, the role of PhDs in the current AI moment, and how AI coding tools are changing the way Tal teaches and evaluates students at NYU.
Timeline
00:13 — Intro and what cognitive science means
02:16 — Using computational simulations to understand how humans learn language
05:26 — How children learn language vs. how LLMs are pre-trained
07:53 — Why mainstream LLMs are not good models of humans
10:07 — Comparing humans and models with eye-tracking and reading behavior
13:52 — Sensory modalities, smell, and how much you can learn from language alone
16:03 — Animal cognition and decoding animal communication
17:00 — Nature vs. nurture, inductive biases, and what transformers can and can't learn
21:21 — Instilling inductive biases through synthetic languages
27:34 — The bouba/kiki effect and cross-linguistic sound symbolism
28:33 — Latent causal structure in language and whether models discover it
31:13 — Does knowing linguistics help build better models?
35:07 — World models: what they mean, and why transformers encode geometry but don't use it
39:13 — Tokenization, and why Tal doesn't like it
41:35 — Scaling laws and the inverse-U curve of model quality vs. human fit
44:34 — Where the human–model mismatch comes from: architecture, memory, and data
47:08 — Diffusion language models and sentence planning
48:21 — Data quality, synthetic data, and curriculum effects
50:54 — Comparing models at different training stages to human development; BabyLM
54:40 — What level of the model should we actually probe? Representations vs. behavior
1:01:04 — Mechanistic interpretability, Deep Dream, and human dreaming
1:02:11 — Cognitive neuroscience, intracranial recordings, and working memory
1:10:31 — Should you still do a PhD in 2026?
1:12:31 — Will software engineers lose their jobs to AI?
1:17:43 — Teaching in the age of coding agents: what changes in the classroom
1:20:54 — What's next: human-like LLMs as user simulators, and recruiting
Music:
"Kid Kodi" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0.
"Palms Down" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0.
Changes: trimmed
About: The Information Bottleneck is hosted by Ravid Shwartz-Ziv and Allen Roush, featuring in-depth conversations with leading AI researchers about the ideas shaping the future of machine learning.

We host Chieh-Hsin (Jesse) Lai, Staff Research Scientist at Sony AI and visiting professor at National Yang Ming Chiao Tung University, Taiwan, for a conversation about diffusion models, the technology behind tools like Stable Diffusion and most of the AI image and video generators you've seen in the last few years. Jesse recently co-authored The Principles of Diffusion Models with Stefano Ermon, and the book is quickly becoming a go-to reference in the field.
We start with what a generative model actually is, and what it means to "generate" an image or a sound. Jesse explains the core idea behind diffusion in plain terms: you start with pure noise, and a neural network gradually cleans it up, step by step, until a realistic image emerges.
From there, we talk about why diffusion has come to dominate so much of generative AI. Because the model builds an image gradually, you can guide it along the way, nudging the output toward what you actually want, refining details, or combining it with other controls. We also discuss the common critique that diffusion is slow and how the field has largely addressed it through new techniques.
We zoom out to the bigger picture, too. Jesse shares his view on world models and whether diffusion is the right foundation for them. We talk about what makes a generative model genuinely good versus just good at gaming benchmarks, and why evaluating creativity and realism is so much harder than scoring a multiple-choice test.
Timeline
00:12 — Intro and welcoming Jesse
00:47 — Why Jesse wrote the book, and who it's for
03:29 — The three families of diffusion models, and why they're really one idea
05:14 — What makes a good generative model
07:39 — How do you even measure if a generated image is good
08:59 — Why diffusion beats autoregressive models for images
10:33 — Is diffusion still slow? How generation got fast
11:12 — A simple intuition for what a "score" is
14:12 — How the different flavors of diffusion connect under the hood
14:42 — Diffusion for text and proteins
17:12 — Consistency models and the push for one-step generation
22:12 — Diffusion for world models: simulating reality in real time
26:12 — Do world models need to understand language
35:12 — Is diffusion the right tool, or just a convenient one
38:12 — What benchmarks actually tell us, and what they miss
46:12 — Closing thoughts and where to find the book

We talked with Christian Szegedy, co-inventor of Inception and Batch Normalization, founding scientist at xAI, now at Math Inc, about what it takes to build a frontier lab, and why he left xAI to work on formal mathematics. Christian thinks Lean and auto-formalization are the missing piece for trustworthy AI: a machine-checkable layer underneath all reasoning, where proofs are guaranteed correct without anyone having to read them.
We got into his bet with François Chollet that AI will hit superhuman mathematician level by 2026, and what that actually unlocks beyond math itself: verified software instead of vibe-coded apps that break when you refactor, AI systems you can actually trust because their reasoning is checkable, and a path to handling protein folding, chemistry, and parts of biology with real guarantees instead of hand-waving. Christian also walked us through how Math Inc's Gauss system pulled off a proof in two weeks that human experts had estimated would take another year.
We also covered xAI's first 12-person year, why Christian no longer buys the original batch normalization story, why he's sure transformers won't be the dominant architecture in five years, what mathematicians do in a world of cheap proofs, and his take on whether humanity will handle AI well. He distrusts humanity more than he distrusts AI.
Timeline
00:12 — Intros: Christian's background (Inception, Batch Norm, xAI, Math Inc)
01:29 — Building a frontier lab from scratch: the first 12 people at xAI
04:15 — Hiring for proven track records when 200K GPUs are at stake
06:07 — Elon's "dependency graph" and balancing long-term vision with investor demos
07:28 — Gauss formalizes the strong prime number theorem in 2 weeks
12:25 — What "formalization" actually means (and why it's not what most people think)
14:39 — Why Lean gives 100% certainty and why that matters for RL
15:26 — ProofBridge and joint embeddings across mathematical subfields
18:07 — Does math formalization transfer to coding and other fields?
21:44 — Can every domain be mathematized?
23:14 — Verified software, chip design, and why vibe-coded apps are dangerous
26:35 — Scaling Mathlib by 100–1000x
28:27 — Artisan formalizers vs. invisible machine-language formalists
33:26 — Can verification generalize?
45:19 — Revisiting Batch Norm: covariate shift, loss landscape, and what really happens
48:22 — Is normalization even necessary?
50:10 — What's actually fundamental in modern AI architectures
51:41 — Why Christian thinks transformers won't last 5 years
52:38 — The 2026 superhuman AI mathematician bet
55:15 — What's missing: better verification + a much larger formalized math repository
56:13 — Lean vs. Coq vs. HOL Light - does the proof assistant actually matter?
59:26 — The role of mathematicians in 5–10 years
1:02:00 — A human element to mathematics: Newton, Leibniz, and competitive proving
1:03:25 — The telescope analogy: AI as the instrument that lets us see the math universe
1:05:19 — Job apocalypse or Jevons paradox?
1:08:41 — Advice for students
1:09:50 — Can we formally verify AI alignment?
1:11:52 — Closing thanks

We sat down with Rao Kambhampati, a Professor of CS at Arizona State University and former President of AAAI, to talk about reasoning models: what they are, when they work, and when they break.
Rao has been working on planning and decision-making since long before deep learning, which makes him one of the most grounded voices on what today's reasoning systems actually do. We start with definitions of what reasoning is, why planning is the hard subset of it, and what changed when systems like o1 and DeepSeek R1 moved the verifier from inference into post-training. From there we get into where these models generalize, where they don't, and why benchmarks can be misleading about both.
A big chunk of the conversation is on chain-of-thought: what intermediate tokens are actually doing, why they help the model more than they help the reader, and what outcome-based RL does to whatever semantic content was there to begin with. We also cover world models and why Rao thinks the video-only framing is the wrong bet, the difference between agentic safety and existential risk, and what the planning community figured out decades ago that the LLM community keeps rediscovering.
Timeline
(00:12) Intros
(01:32) Defining "reasoning" and the System 1 / System 2 framing
(04:12) Blocksworld vs Sokoban, and non-ergodicity
(06:42) Pre-o1: PlanBench and "LLMs are zero-shot X" papers
(07:42) LLM-Modulo and moving the verifier into post-training
(10:12) Is RL post-training reasoning, or case-based retrieval?
(13:12) τ-Bench and benchmarks that avoid action interactions
(14:12) OOD generalization and what we don't know about post-training data
(19:02) Does it matter how they work if they answer the questions we care about?
(21:27) Architecture lotteries and why no one tries different designs
(23:42) Intermediate tokens and the "reduce thinking effort" cottage industry
(26:12) The 30×30 maze experiment
(27:42) Sokoban, NetHack, and Mystery Blocksworld
(34:58) Stop Anthropomorphizing Intermediate Tokens — the swapped-trace experiment
(46:12) Latent reasoning, Coconut, and why R0 beat R1
(50:12) How outcome-based RL erodes CoT semantics
(52:12) Dot-dot-dot and Anthropic's CoT monitoring paper
(53:42) Safety: Hinton, Bengio, LeCun
(57:12) Existential risk vs real safety work
(59:42) World models, transition models, and video-only approaches
(1:03:12) Why linguistic abstractions matter — pick and roll
(1:05:42) What the planning community knew in 2005
(1:08:12) Multi-agent LLMs
(1:09:57) Closing thoughts: the bridge analogy

In this episode, we hosted Zhuang Liu, Assistant Professor at Princeton and former researcher at Meta, for a conversation about what actually matters in modern AI and what turns out to be a historical accident.
Zhuang is behind some of the most important papers in recent years (with more than 100k citations): ConvNeXt (showing ConvNets can match Transformers if you get the details right), Transformers Without Normalization (replacing LayerNorm with dynamic tanh), ImageBind, Eyes Wide Shut on CLIP's blind spots, the dataset bias work showing that even our biggest "diverse" datasets are still distinguishable from each other, and more.
We got into whether architecture research is even worth doing anymore, what "good data" actually means, why vision is the natural bridge across modalities but language drove the adoption wave, whether we need per-lab RL environments or better continual learning, whether LLMs have world models (and for which tasks you'd need one), why LLM outputs carry fingerprints that survive paraphrasing, and where coding agents like Claude Code fit into research workflows today and where they still fall short.
Timeline
00:13 — Intro
01:15 — ConvNeXt and whether architecture still matters
06:35 — What actually drove the jump from GPT-1 to GPT-3
08:24 — Setting the bar for architecture papers today
11:14 — Dataset bias: why "diverse" datasets still aren't
22:52 — What good data actually looks like
26:49 — ImageBind and vision as the bridge across modalities
29:09 — Why language drove the adoption wave, not vision
32:24 — Eyes Wide Shut: CLIP's blind spots
34:57 — RL environments, continual learning, and memory as the real bottleneck
43:06 — Are inductive biases just historical accidents?
44:30 — Do LLMs have world models?
48:15 — Which tasks actually need a vision world model
50:14 — Idiosyncrasy in LLMs: pre-training vs post-training fingerprints
53:39 — The future of pre-training, mid-training, and post-training
57:57 — Claude Code, Codex, and coding agents in research
59:11 — Do we still need students in the age of autonomous research?
1:04:19 — Transformers Without Normalization and the four pillars that survived
1:06:53 — MetaMorph: Does generation help understanding, or the other way around?
1:09:17 — Wrap

We talked with Sasha Rush, researcher at Cursor and professor at Cornell, about what it actually feels like to be in the heart of the AI revolution and build coding agents right now. Sasha shared how these systems are changing day-to-day work and how it feels to develop them.
A big part of the conversation was about why coding has become such a powerful setting for these tools. We discussed what makes code different from other domains, why agents seem to work especially well there, and how much of today’s progress comes not just from better models, but from better ways of using them. Sasha also gave an inside look at how Cursor thinks about training coding models, long-running agents, context limits, bug finding, and the balance between autonomy and human oversight.
We also talked about the broader shift happening in software engineering. Are developers moving to a higher level of abstraction? Is this just a phase where we “babysit” models, or the beginning of a deeper change in how software gets built? Sasha had a very thoughtful perspective here, including what he’s seeing from students, researchers, and engineers who are growing up native to these tools.
More broadly, this episode is about what it means to do serious technical work in a moment when the tools are changing incredibly fast. Sasha brought both optimism and skepticism to the discussion, and that made this a really grounded conversation about where coding agents are today, what they are already surprisingly good at, and where all of this might be going next.
Timeline
00:00 Intro and Sasha joins us
01:11 What “coding agents” actually mean
02:34 Why coding became the breakout use case
08:56 Long-running agents and autonomous workflows
15:08 How these tools are changing the work of engineers
17:15 Are people just babysitting models right now?
22:11 How Cursor builds its coding models
26:29 Rewards, training, and what makes agents work
34:53 Memory, continual learning, and agent communication
38:00 How context compaction works in practice
41:29 Why coding agents recently got much better
50:31 Refactoring, maintenance, and self-improving codebases
52:16 Bug finding, oversight, and verification
54:43 Will this pace of progress continue?
56:42 Can this spread beyond coding?
58:27 The future of Cursor and coding agents
1:03:08 Model architectures beyond standard transformers
1:05:37 World models, diffusion, and what may come next

How Denoising Secretly Powers Everything in AI
Peyman Milanfar is a Distinguished Scientist at Google, leading its Computational Imaging team. He's a member of the National Academy of Engineering, an IEEE Fellow, and one of the key people behind the Pixel camera pipeline. Before Google, he was a professor at UC Santa Cruz for 15 years and helped build the imaging pipeline for Google Glass at Google X. Over 35,000 citations.
Peyman makes a provocative case that denoising, long dismissed as a boring cleanup task, is actually one of the most fundamental operations in modern ML, on par with SGD and backprop. Knowing how to remove noise from a signal basically means you have a map of the manifold that signals live on, and that insight connects everything from classical inverse problems to diffusion models.
We go from early patch-based denoisers to his 2010 "Is Denoising Dead?" paper, and then to the question that redirected his research: if denoising is nearly solved, what else can denoisers do? That led to Regularization by Denoising (RED), which, if you unroll it, looks a lot like a diffusion process, years before diffusion models existed. We also cover how his team shipped a one-step diffusion model on the Pixel phone for 100x ProRes Zoom, the perception-distortion-authenticity tradeoff in generative imaging, and a new paper on why diffusion models don't actually need noise conditioning. The conversation wraps with a debate on why language has dominated the AI spotlight while vision lags, and Peyman's argument that visual intelligence, grounded in physics and robotics, is coming next.
Timeline
0:00 Intro and Peyman's background
1:22 Why denoising matters more than you think
Sensor diversity and Tesla's vision-only bet
15:04 BM3D and why it was secretly an MMSE estimator
17:02 "Is Denoising Dead?" then what else can denoisers do?
18:07 Plug-and-play methods and Regularization by Denoising (RED)
26:18 Denoising, manifolds, and the compression connection
28:12 Energy-based models vs. diffusion: "The Geometry of Noise"
31:40 Natural gradient descent and why flow models work
34:48 Gradient-free optimization and high-dimensional noise
45:13 Image quality and the perception-distortion tradeoff
48:39 Information theory, rate-distortion, and generative models
52:57 Denoising vs. editing
54:25 The changing role of theory
57:07 Hobbyist tools vs. shipping consumer products
59:40 Coding agents, vibe coding, and domain expertise
1:05:00 Vision and more complex-dimensional signals
1:09:31 Do models need to interact with the physical world?
1:11:28 Continual learning and novelty-driven updates
1:13:00 On-device learning and privacy
1:15:01 Why has language dominated AI? Is vision next?
1:17:14 How kids learn: vision first, language later
1:19:36 Academia vs. industry
1:22:28 10,000 citations vs. shipping to millions, why choose?

In this episode, we sit down with Peyman Milanfar, Distinguished Scientist at Google, where he leads the Computational Imaging team. Peyman is a member of the National Academy of Engineering, an IEEE Fellow, and one of the key minds behind the imaging pipeline in Google Pixel phones. Before joining Google, he was a professor of Electrical Engineering at UC Santa Cruz for 15 years, and he helped develop the imaging pipeline for Google Glass during his time at Google X. With over 35,000 citations and decades of work at the intersection of image processing and AI, Peyman makes a compelling case that denoising, long dismissed as a "digital janitor" task, is actually one of the most fundamental operations in modern machine learning, on par with SGD and backpropagation.
We trace the full arc from classical denoising algorithms to modern diffusion models. Peyman explains how early denoisers implicitly learned from image patches, how the "Is Denoising Dead?" paper in 2010 led him to ask what else denoisers could do beyond cleaning up noise, and how that question opened the door to regularization by denoising and, eventually, to the diffusion models powering image generation today.
We also dig into the practical side, including how Peyman's team shipped a one-step diffusion model on the Pixel phone for 100x ProRes Zoom, the challenges of controlling hallucinations in generative models for consumer products, and why understanding physics and the image formation process still matters in the age of large models.
The conversation wraps with a big-picture debate: why has language dominated the AI spotlight while vision lags behind? Peyman argues that visual intelligence is coming next, and that, unlike language, vision requires grounding in the physical world through robotics, world models, and continuous learning. He also reflects on his journey from professor to industry researcher and why he wouldn't trade the ability to take ideas from theory to millions of users.
Timeline
0:13 Intro
1:42 Why denoising matters
3:20 History of denoising
5:57 How denoisers work
9:39 Why phones need denoising
12:54 Tesla's vision-only bet
14:14 BM3D's dominance
16:58 "Is Denoising Dead?"
18:21 Regularization by Denoising (RED)
24:26 RED looks like diffusion
26:19 Denoising & manifolds
28:42 Energy-based vs. diffusion models
33:46 Blind denoisers
40:30 Diffusion for text
45:44 Perception-distortion tradeoff
53:05 Denoising vs. editing
57:01 ComfyUI & democratization
58:51 One-step diffusion on Pixel
59:51 Coding agents & domain expertise
1:02:45 Diffusion for music
1:06:53 World models & continuous learning
1:15:01 Why vision will overtake language
1:21:12 Professor vs. Google
1:25:08 Wrap-up

Yaroslav Bulatov helped build the AI era from the inside, as one of the earliest researchers at both OpenAI and Google Brain. Now he wants to tear it all down and start over. Modern deep learning, he argues, is up to 100x more wasteful than it needs to be - a Frankenstein of hacks designed for the wrong hardware. With a power wall approaching in two years, Yaroslav is leading an open effort to reinvent AI from scratch: no backprop, no legacy assumptions, just the benefit of hindsight and AI agents that compress decades of research into months. Along the way, we dig into why AGI is a "religious question," how a sales guy with no ML background became one of his most productive contributors, and why the Muon optimizer, one of the biggest recent breakthroughs, could only have been discovered by a non-expert.
Timeline
00:12 — Introduction and Yaroslav's background at OpenAI and Google Brain
01:16 — Why deep learning isn't such a good idea
02:03 — The three definitions of AGI: religious, financial, and vibes-based
07:52 — The SAI framework: do we need the term AGI at all?
10:58 — What matters more than AGI: efficiency and refactoring the AI stack
13:28 — Jevons paradox and the coming energy wall
14:49 — The recipe: replaying 70 years of AI with hindsight
17:23 — Memory, energy, and gradient checkpointing
18:34 — Why you can't just optimize the current stack (the recurrent laryngeal nerve analogy)
21:05 — What a redesigned AI might look like: hierarchical message passing
22:31 — Can a small team replicate decades of research?
24:23 — Why non-experts outperform domain specialists
27:42 — The GPT-2 benchmark: what success looks like
29:01 — Ian Goodfellow, Theano, and the origins of TensorFlow
30:12 — The Muon optimizer origin story and beating Google on ImageNet
36:16 — AI coding agents for software engineering and research
40:12 — 10-year outlook and the voice-first workflow
42:23 — Why start with text over multimodality
45:13 — Are AI labs like SSI on the right track?
48:52 — Getting rid of backprop — and maybe math itself
53:57 — The state of ML academia and NeurIPS culture
56:41 — The Sutra group challenge: inventing better learning algorithms

We talk with Kyunghyun Cho, who is a Professor of Health Statistics and a Professor of Computer Science and Data Science at New York University, and a former Executive Director at Genentech, about why healthcare might be the most important and most difficult domain for AI to transform. Kyunghyun shares his vision for a future where patients own their own medical records, proposes a provocative idea for running continuous society-level clinical trials by having doctors "toss a coin" between plausible diagnoses, and explains why drug discovery's stage-wise pipeline has hit a wall that only end-to-end AI thinking can break through. We also get into GLP-1 drugs and why they're more mysterious than people realize, the brutal economics of antibiotic research, how language models trained across scientific literature and clinical data could compress 50 years of drug development into five, and what Kyunghyun would do with $10 billion (spoiler: buy a hospital network in the Midwest). We wrap up with a great discussion on the rise of professor-founded "neo-labs," why academia got spoiled during the deep learning boom, and an encouraging message for PhD students who feel lost right now.
Timeline:
(00:00) Intro and welcome
(01:25) Why healthcare is uniquely hard
(04:46) Who owns your medical records? — The case for patient-controlled data and tapping your phone at the doctor's office
(06:43) Centralized vs. decentralized healthcare — comparing Israel, Korea, and the US
(13:19) Why most existing health data isn't as useful as we think — selection bias and the lack of randomization
(16:53) The "toss a coin" proposal — continuous clinical trials through automated randomization, and the surprising connection to LLM sampling
(23:07) Drug discovery's broken pipeline — why stage-wise optimization is failing, and why we need end-to-end thinking
(28:30) Why the current system is already failing society — wearables, preventive care, and the case for urgency
(31:13) Allen's personal healthcare journey and the GLP-1 conversation
(33:13) GLP-1 deep dive — 40 years from discovery to weight loss drugs, brain receptors, and embracing uncertainty
(36:28) Why antibiotic R&D is "economic suicide" and how AI can help
(42:52) Language models in the clinic and the lab — from clinical notes to back-propagating clinical outcomes, all the way to molecular design
(48:04) Do you need domain expertise, or can you throw compute at it?
(54:30) The $10 billion question — distributed GPU clouds and a patient-in-the-loop drug discovery system
(58:28) Vertical scaling vs. horizontal scaling for healthcare AI
(1:01:06) AI regulation — who's missing from the conversation and why regulation should follow deployment
(1:06:52) Professors as founders and the "neo-lab" phenomenon — how Ilya cracked the code
(1:11:18) Can neo-labs actually ship products? Why researchers should do research
(1:13:09) Academia got spoiled — the deep learning anomaly is ending, and that's okay
(1:16:07) Closing message — why it's a great time to be a PhD student and researcher