Can We Guide Artificial Thought? - The Last Invention is AI

Summary6 min read

Joe Rogan Experience for AI
Episode: Can We Guide Artificial Thought?
Release Date: July 21, 2025

Introduction

In this episode of the "Joe Rogan Experience for AI," host Joe Rogan Experience for AI delves into the emerging consensus among leading AI companies regarding the monitoring and understanding of artificial intelligence's reasoning processes. The discussion centers around a recently published paper by top AI researchers and explores the implications of transparency in AI decision-making.

Industry Unity and the Positioning Paper [00:00 - 10:30]

The episode begins with Joe highlighting a significant development in the AI sector: a surprising unity among industry giants such as OpenAI, Google, DeepMind, and Anthropic. These companies, along with their leading researchers, have collaboratively published a paper emphasizing the importance of monitoring AI's "thoughts" and its reasoning processes.

Joe explains that the paper focuses on "Thoughts of AI Reasoning Models", particularly the concept known as Chain of Thought (CoT). This refers to the step-by-step reasoning that AI models undertake to arrive at answers, akin to how humans solve complex problems by breaking them down into manageable steps.

"Chain of thought monitoring represents a valuable addition to safety measures for Frontier AI, offering a rare glimpse into how AI agents make decisions."
— Positioning Paper [04:45]

Chain of Thought: Understanding AI Reasoning [10:31 - 20:00]

Joe provides an overview of how Chain of Thought works across different AI models. He compares the reasoning process of AI to humans working through a complex math problem, where the AI breaks down the problem into smaller, logical steps before arriving at a solution. This approach contrasts with earlier AI models that attempted to generate answers in a single step without transparent reasoning.

He highlights that while companies like DeepMind quickly adopted and enhanced their models using CoT, OpenAI initially kept the inner workings of their reasoning processes guarded. However, other companies have since introduced features that allow users to view the AI's step-by-step thought process, enhancing transparency and trust.

"You can essentially see exactly what's going on... it's useful."
— Joe Rogan Experience for AI [12:15]

Monitoring and Safety Concerns [20:01 - 30:00]

The core of the discussion revolves around the concept of monitorability in AI models. The positioning paper advocates for maintaining the visibility of AI's reasoning processes to ensure safety and alignment with intended outcomes. Joe emphasizes that without such transparency, AI could become a "black box," making it difficult to ascertain whether it remains aligned with human values and safety standards.

However, Joe introduces a critical perspective by speculating that maintaining CoT visibility might also serve competitive interests. By allowing others to observe the AI's reasoning, companies could inadvertently enable competitors to reverse-engineer and replicate successful strategies.

"If you're one of these AI researchers and you want to reverse engineer how other models are staying best in class... looking at the chain of thought could be a way."
— Joe Rogan Experience for AI [26:50]

Competitive Dynamics in the AI Industry [30:01 - 40:00]

Joe delves into the intense competition within the AI sector, noting actions like Mark Zuckerberg hiring top researchers from OpenAI. This "bloodbath" for talent underscores the high stakes involved in developing superior AI models. Joe posits that the push for CoT monitoring might be a strategic move to standardize safety measures across leading models, ensuring that no single company gains an overwhelming advantage without transparency.

He questions whether the positioning paper is a genuine call for enhanced safety or a tactic to level the playing field among top AI companies, making it harder for new entrants to leapfrog established players with proprietary advancements.

"These are already all the top companies... are they trying to make sure that everyone plays by the same rules?"
— Joe Rogan Experience for AI [33:20]

Future Directions and Open Questions [40:01 - 50:00]

Looking ahead, Joe discusses Anthropic's ambitious goal to demystify the AI "black box" by 2027. Dario Amadei, CEO of Anthropic, aims to develop techniques that will elucidate the underlying algorithms and processes driving AI decision-making. This endeavor is seen as crucial for ensuring long-term safety and alignment of AI systems.

Joe underscores the paradox that, despite advancements, the fundamental workings of AI models remain largely opaque. The pursuit to fully understand and explain AI reasoning is portrayed as both a technological and ethical imperative.

"It's crazy to think we don't even know how these AI models work. We just train the algorithm and it gives us a good result."
— Joe Rogan Experience for AI [48:10]

Conclusion and Final Thoughts [50:01 - End]

Joe wraps up the episode by reflecting on the significance of the positioning paper and the collective stance of top AI companies on chain of thought monitoring. He reiterates the importance of transparency for safety but remains skeptical about the underlying motives, considering the fierce competition in the industry.

Jerry acknowledges that while the current measures are a step in the right direction, the future of AI safety hinges on continued collaboration and openness among researchers and companies.

"Bowen Baker... said, we're at this critical time where we have this new chain of thought thing. It seems pretty useful, but it could go away in a few years if people don't really concentrate on it."
— Joe Rogan Experience for AI [45:30]

Key Takeaways

Chain of Thought (CoT) is a pivotal concept in AI reasoning, enabling step-by-step problem-solving similar to human cognitive processes.
Leading AI companies have united to emphasize the importance of monitoring AI's reasoning to ensure safety and alignment.
Competitive dynamics in the AI industry may influence the adoption and transparency of CoT, potentially serving both safety and strategic interests.
Anthropic's initiative to decode the AI "black box" by 2027 represents a significant effort towards greater transparency and understanding of AI algorithms.
The future of AI safety relies on sustained collaboration and openness among researchers and industry leaders.

Notable Quotes

"Chain of thought monitoring represents a valuable addition to safety measures for Frontier AI, offering a rare glimpse into how AI agents make decisions."
— Positioning Paper [04:45]
"You can essentially see exactly what's going on... it's useful."
— Joe Rogan Experience for AI [12:15]
"If you're one of these AI researchers and you want to reverse engineer how other models are staying best in class... looking at the chain of thought could be a way."
— Joe Rogan Experience for AI [26:50]
"It's crazy to think we don't even know how these AI models work. We just train the algorithm and it gives us a good result."
— Joe Rogan Experience for AI [48:10]
"Bowen Baker... said, we're at this critical time where we have this new chain of thought thing. It seems pretty useful, but it could go away in a few years if people don't really concentrate on it."
— Joe Rogan Experience for AI [45:30]

Final Remarks

For those interested in exploring the latest AI models, Joe briefly mentions AI Box, a platform offering access to over 40 AI models with features like Media Storage to easily manage generated content. The service is available for $19 a month, consolidating various AI subscriptions into a single platform.

"If you enjoyed the podcast episode today, the number one way that you could say thank you is to leave a rating and review or comment wherever you get your podcast."

This episode provides a comprehensive look into the current state of AI reasoning transparency, the collaborative efforts of top AI companies, and the ongoing balance between safety and competitive advantage in the rapidly evolving AI landscape.

Loading summary

Transcript1 lines

[00:00]
Host
In a very interesting move, there has been some, you know, unexpected unity in the AI market as a bunch of industry leaders from a bunch of top AI companies have all got together and published a paper urging essentially the top AI companies to monitor AI's thoughts and how it's actually arriving at questions. So today on the podcast, I want to break down what is currently being done in that regard, what needs to be done in the future, who's leading the way and. And what, you know, what this whole paper was all about as it has a lot of people talking. Before we get into that, if you want to try any of the top models from companies that I talk about on this podcast, I'd love for you to try out AI Box. AI, my personal company has just launched our beta, and you essentially get access to the top 40 AI models so you can chat with all of them in one chat thread. And in addition to that, we have something called Media Storage, where any file that you generate, whether that's images or whether that's audio, all of that gets stored in the media storage folder. So for me, this was something I really wanted, I hated with Chat, GPT or other platforms, you generate something in a chat thread and you couldn't find it afterwards. Like, you couldn't figure out where it was. So we have one place where everything is stored. You're able to access all of it, you're able to see the prompt that generated that particular image or file, and you're able to click and go back to the exact thread that was used to generate it. So there's a ton of cool features in here. I'd love for you to try it out. You can. It's $19 a month, so that saves you these $20 subscriptions that every platform has. You can get all of them all on one platform for one price. You go check it out. AI Box AI. The link is in the description. All right, let's get into what the researchers have been saying in regards to AI's thought process. So the top. The companies that essentially are kind of involved in this is OpenAI, Google, DeepMind, and Anthropic. And not really the companies, but top researchers at all these companies came together, put out a paper, they all kind of signed it. So what they're really talking about is something called Thoughts of AI Reasoning Models. This is what they, what they, what they're kind of researching is the thoughts of AI reasoning models in this paper that they've published. And really what they're getting to is something called Chain of Thought. But it's, it's kind of how these reasoning models and do these, you know, they, they do the deep dives, the deep thinking, the all of these kind of tools that are on there. It's called different things on every single AI model. But basically the way that the AI arrives at an answer, they want it to be studied. So OpenAI was kind of the first one that came out with this. When they came out with their O3 model and it had, you know, they're like, this thing has, has reasoning. They didn't explain how the reasoning was done. And that's because they were the first ones with it. They didn't want to like give away the secret and tell everyone what they were doing. But basically most AI researchers understood what was going on. Deep Seek very quickly cloned the tool and, and had a meteoric rise in how good their AI model was once they did it. So it really showed. Oh my gosh. Like, this technique is the way to turn these AI models into best in class AI models. So when Deep Seek did, it went super viral. Everyone was talking about how good that AI model was. But basically what it does is the same way. Like when a human is working on a complex math problem and you're, you know, you're looking through it, you're reasoning, you're writing down notes. This is what the AI models are doing. They're not just trying to one shot, you know, based off of everything, you know, off the top of your head, what's the answer to this, which is how they were working before. It's like if, you know, based off of everything, you know, work through the question, come up with, or it would, it would say things like, you know, come up with like 20 steps to figure out the solution to this problem. It's like, okay, first we ought to figure out X, Y and Z. Then we got to figure out, you know, what the user means by this. Then we got to figure out if the intent that I think is actually correct. Right? So it has like all the, all these steps and it breaks it down. Now you've probably seen this on a lot of AI models because everyone has now some sort of feature like this. My favorite is I think Anthropic is doing a good job. I like how Deep Seek and Grok both show you line by line, the thought process. You can kind of drop it down and see the thought process that the AI model ran through in order to get to your result. That was something that OpenAI never launched with because I think they're trying to guard Their secret at this point, I don't think it really matters, but you, I think it's useful. You, you can essentially see exactly what's going on. And so this is what this paper is all about. Essentially we have what they're calling monitorability. So we can see how they're arriving at their questions, more or less. This is not perfect, but it's a pretty good idea of how they're arriving at their, at their answers to questions. And so the AI researchers are essentially saying we need to keep this monitor ability. So it's called a chain of thought. Um, anyways, this is directly from their paper. It says chain of thought monitoring represents a valuable addition to safety measures for Frontier AI, offering a rare glimpse into how AI agents make decisions. Yet there's no guarantee that the current degree of visibility will persist. We encourage the research community and Frontier AI developers to make the best use of chain of thought monitorability and study how it can be preserved. Okay, there is an interesting sneaky angle to this, and I don't know if this is actually true, but if you think about it, when OpenAI first came out with the, you know, this chain of thought or these reasoning models, they didn't release how it was coming up with the answers. They didn't release what, you know, they're calling this monitorability device that you get from GROK or Deep Seq, where it shows you exactly what it's thinking. And that's because they didn't want people to know, you know, to reverse engineer the prompts that they were telling, you know, how I said, like, you know, basically there's a prompt that will tell it like break this into 12 parts and figure out what you need to do, Yada, yada. Well, OpenAI and no one actually tells what the, what the previous, what the pre prompt is before it runs through the process. So you can get an idea by looking at the process of what the pre prompt is. You can reverse engineer it, but you don't know for a fact. And so what's interesting here, they're saying, you know, there's no guarantee on how long this can exist for, because right now GROK and Deep SEQ and others are like showing you, but they don't have to, like, they could just give you the result and just do what OpenAI used to do, which is just have this little loading bar that said thinking, but. And they could just say, thinking, thinking, thinking. All right, here's your result. Now that's what the AI researchers don't want. And they're saying it's for safety reasons, right? They're like, hey, we don't want anyone to, you know, we don't want to forget how it's getting to the answers. And then it's a bit more of a black box and we can't figure it out and we don't know if it's aligned or safe or yada yada. But a sneaky angle that I've been thinking of is if you're one of these AI researchers and you want to reverse engineer how other models are staying best in class. For example, Grok 4 just came out and it proved that its model was better on all the benchmarks in for reasoning than, you know, it was the best model, best in class model on all the benchmarks. Now maybe that's just because their model's bigger, but at this point a lot of people are saying it's because of some of the tools and the pre prompts and some of the ways that the model has those things worked into it. And the best way to reverse engineer what those tools and pre prompts and kind of some of these, the secret sauce of the model would be to look at its chain of thought. And so I think an interesting sneaky angle would be if these researchers are worried that, hey, if everyone turns off their chain of thought, we won't be able to copy everyone else's like, good ideas as well. And so when something makes a breakthrough, it's going to be harder to copy it. Now is this the real reason? Is it all about safety? I don't really know. I'm just saying if you have chain of thought monitoring available for different AI models, it does make it a little bit easier to kind of copy the secret sauce of other AI companies. So, you know, leave that with. Leave that for. For what? Or take that for what it's worth. The other thing I do think that's interesting wor that's worth mentioning is the people that are signing off on this OpenAI, Google, DeepMind, Anthropic, these are already all the top companies anyways. They're already at the very top of the pack, right? So you throw Gro in there in between the five of these companies or the, sorry, the four of these companies, they're basically like all the top models, the LLM models that have the highest, you know, the highest benchmark scores. And so what's interesting is like, are they trying to make sure that everyone plays by the same rules? Are they trying to make sure if anyone has a breakthrough that they can kind of copy them Are they trying to stop any new companies that might leapfrog everybody with some secret, secret sauce? And then that everyone has to scramble to figure out what they're doing. Do they want there to be regulation that forces us to keep chain of thought like, so I'm not really sure exactly where it goes. And at the end of the day, there's. They're not calling for any sort of regulation or anything. And there are big names that are talking about it, right? Or signing off on this. You have safe Superintelligence CEO Ilya Escover who used to be over OpenAI Nobel laureate Jeffrey Hinton Google DeepMind co founder Shane Leg X AI safety advisor Dan Hendricks Thinking Machines co founder John Schultzman So very, very big names in the industry are all essentially, you know, talking, talking about this and signing off on it. So this is really interesting. This is coming at a time when there is cutthroat competition in the AI industry. Mark Zuckerberg is hiring everyone's top researchers. As of this morning. I saw he hired two more top researchers from OpenAI. So it is an absolute bloodbath when it comes to what you have to pay for to keep up with these researchers and what people are willing to do, what these companies are willing to do to make sure that their AI is ahead of everybody else's. So it's very, very competitive. And I wouldn't be surprised if there was some sort of competitive angle. This is one thing that I did want to say from, from Bowen Baker, who worked on this particular paper. He said, we're at this critical time where we have this new chain of thought thing. It seems pretty useful, but it could go away in a few years if people don't really concentrate on it. Publishing a positioning paper like this, to me is a mechanism to get more research and attention to this topic, because before that happens. So the paper isn't any groundbreaking news. It's not calling for any groundbreaking action. It's just putting. It's. It's, you know, they're, they're calling a positioning paper. They're like, just so you know, we had this chain of reasoning for safety reasons. This is good. Everyone should keep doing it. So it'll be interesting to see if that holds any ground, if that actually makes any changes, if anyone is actually forced to do anything. What I will say in regards to this, that I feel like might be even more important is that Anthropic is one of the top companies that's really been working on a lot of this safety stuff. And earlier this year, their CEO, Dario Amadei, he announced that they were pretty much making some sort of commitment to open the quote, unquote, black box of AI models by 2027. So even when we do the chain of thought, we don't always know why an AI model will give the responses that it gives as it works through that chain of thought that you, that you may have given it. So even if you were like, okay, we see why it came up, this answer because of this chain of thought, but it's like, how did it get to all the individual pieces? That is a black box. Still, we don't really know how the A model does it. It's algorithms. It's guessing what comes next in a line or a letter or a token. And so Dario Amadeo has some really clever software and techniques that they're, they're working on. And his goal is to have cracked open the black box and explain exactly how the AI models, algorithms work to get to the responses it gets to. And he's hoping to do that by 2027. So very interesting. It's crazy to think we, we don't even know how these AI models are work. We just train the algorithm and, and it gives us a good result. We don't really know why, but it seems that that is going to be coming sooner than later and we're going to then be able to understand the alignment of these models, how safe they are if they're going to go off the, the rails. Like we recently had Grok Xai's Groq kind of go off the rails recently. And so we'll see, we'll be able to get a bit of a deeper look into all of that. Hey, listen, if you enjoyed the podcast episode today, the number one way that you could say thank you is to leave a rating and review or comment wherever you get your podcast. Comments on Spotify, comments on YouTube, all of it helps the podcast a ton get found by more incredible people like yourself. Thanks so much for tuning in. Make sure to check out AI box AI for all the latest AI models and I will catch you in the next episode.