Transcript
A (0:00)
Hello AI engineers. We're back with a quick reaction pod for Claude 4 with the new reasoning research lead for Prime Intellect, Will Brown. Will Brown's talk at AIE NYC and open source work on verifiers have made him one of the most prominent voices able to publicly discuss the current state of the art in reasoning models and where current SOTA research directions lead. We discussed his latest paper on reinforcing multi-turn reasoning in LLM agents via turn-level credit assignment, and he has previewed his upcoming AI Engineer World's Fair talk on Agentic RL, linked in the show notes. We're excited to share that Will will be back at the upcoming AI Engineer World's Fair in San Francisco, which now has Expo tickets on sale. He will be headlining the new RL plus Reasoning track with Misha Laskin, Nathan Lambert, Christian Szegedy, Greg Kamradt, Kyle Corbitt and more. Join us at AI Engineer. Watch out and take care.
B (1:02)
Hey everyone. Welcome to a Lightning plus Emergency News Latent Space podcast episode. I'm Alessio, partner and CTO at Decibel, and I'm joined by my co-host swyx, founder of Smol AI.
C (1:13)
Hey.
D (1:14)
Hey. And yeah, honestly, we knew that Claude 4 was coming and we were just too busy to have a dedicated episode. So this is our makeup dedicated episode with a special guest, Will Brown, from, now I can say it, Prime Intellect.
C (1:29)
How's it going? Great to be on and so excited. We've known each other for a little bit and this is my first time on the podcast, I believe. Great to chat with you guys. Big news day, I guess. Lots of stuff out in the world. There's always a news day.
D (1:47)
I think this week is particularly heavy for some weird reason. Like Monday was Microsoft Build, Tuesday and Wednesday were Google, and then today is Claude. I wonder what tomorrow will bring.
C (1:57)
We had I/O and then we had io and then.
D (2:00)
Yeah, yeah, different I/Os, exactly. Yeah. So we actually were supposed to record this morning and we all wanted to watch the Claude keynote, so we went and watched the Claude keynote. Obviously a good model, you know, good model, big model. They're really emphasizing coding. They didn't really talk much about reasoning, to be super honest. They were just like, it runs for longer now. What are you guys' takes?
C (2:23)
Yeah, so I mean, one thing I've kind of been seeing coming for a little bit, that I think people are also all aware of now, is that the thing that's going to make the next wave of stuff be powerful is just, everyone wants better agents, everyone wants models that can go off and do stuff. And reasoning was kind of a precursor to that a little bit. I always think of OpenAI's five levels framework, where Chatbots was the RLHF era and then Reasoners was o1 and R1. But really what people were thinking of was that reasoners are a step on the path towards agents. And so I can kind of see why Claude, Anthropic, is not like, oh, we have the best reasoner. They're really showing off their SWE agent and tool use and function calling benchmarks, multi-turn stuff. Because I think that's really what people care about more for actual applications, as opposed to, it did really good on this math competition. The math competition stuff was all a signal that was supposed to make us think we were getting somewhere. But the thing we were getting towards, for a lot of people at least, is practical agents.
![[AIEWF Preview] Multi-Turn RL for Multi-Hour Agents — with Will Brown, Prime Intellect - Latent Space: The AI Engineer Podcast cover](/_next/image?url=https%3A%2F%2Fsubstackcdn.com%2Ffeed%2Fpodcast%2F1084089%2Fpost%2F186632787%2F86bb0f264bc4b333f8a90e3bf505073b.jpg&w=1920&q=75)