
To agree to international AI red lines, we need to build the technology that makes it possible to adhere to them. In this episode, Tristan sits down with two experts in this field to discuss the kinds of verification technology we need for AI, the challenges of building it, and the world it could unlock if we do.
A
Welcome to Your Undivided Attention. This is Tristan Harris. In 1965, 20 years after the first test of a nuclear weapon, the Trinity test, a reporter asked Robert Oppenheimer whether it was too late to stop the spread of nuclear weapons. At the time, five countries had developed their own atomic bombs. His answer was short and chilling: "It's 20 years too late. It should have been done the day after Trinity." But Oppenheimer was wrong. It wasn't too late. Nuclear deproliferation and disarmament did happen. And over 60 years later, only nine countries have nuclear weapons. Even the person who created this technology, who was convinced of its inevitability, couldn't imagine how the future might unfold. So how did this nuclear non-proliferation happen? Well, it happened largely because of technology. The biggest obstacle to agreeing on nuclear red lines was that adversaries couldn't trust any promise the other made. They needed to be able to verify the number of warheads, and they needed to know if a nuclear device was for a weapon or a power plant. Now, none of that was possible until we built the technology needed to verify those things. And today we're in a similar situation with AI. In order for adversaries like the United States and China to agree on reasonable red lines on things like bioweapons, cyberhacking, or the risk of recursive self-improvement, they first need to be able to trust each other. And so we urgently need to build the verification technology that would make that trust possible. So today I'm so excited to have on the show two experts in this area to talk about the kinds of verification technology we need and to think about how we would do this for AI. Tim Fist is the Director of Emerging Technology Policy at the Institute for Progress. And Janet Egan is a Senior Fellow and Deputy Director for the Technology and National Security Program at the Center for a New American Security, or CNAS.
Tim and Janet, welcome to Your Undivided Attention.
B
Thanks for having me.
C
Thanks for having me.
A
So, just to level set for listeners who really don't know that much, just for a regular person out there, why does coordination on AI matter? What would happen if we didn't have coordination?
C
The fundamental premise is that AI and its impacts will be global regardless of who develops it. So it does matter which jurisdiction gets the transformative capabilities first, but it doesn't matter in terms of them having global impacts. Risk that eventuates in one country doesn't respect national borders and can easily move across and impact global equities. And we're no longer in the Kumbaya Globalist zeitgeist of the 1990s, where everyone was building up global institutions. We've kind of moved into a different realm where there's like diminishing engagement in international rules and lower trust between different international counterparts. So I think this means we really need to be preparing for a world where any agreements that protect collective global interests aren't just based on trust, but are based on the ability to verify that folks are following the rules.
A
Do you want to add to that, Tim, in terms of how the consequences of AI are global and not contained to one country?
B
Yeah. So I think it's interesting to put this in the context of current events. Over the last few months, we've had all three of the leading US AI labs say that having the option for a global slowdown or pause in AI development is something that they would support. So this is coming from DeepMind, Anthropic and OpenAI, which is kind of a big deal if we take them at their word for why they say they want this kind of thing. They think they're not that far off from building AI systems that can exhibit what's called recursive self-improvement, or RSI. And what that means is an AI system that's capable of autonomously designing and then building its own successor. And I think the risk that these people point to is that if this happens, it could have two big consequences. So one is on the misuse of AI. If we see this rate of capability growth far exceeding what we've seen over the past few years, it could lead to much greater risks in the near-term future of people using AI to do dangerous stuff. And the other risk that these people point to is the risk of loss of control. So humans losing understanding of the AI systems that they're building, leading to the creation of a model that we can't control and whose workings we don't understand. And so it could be misaligned with human interests. And so, yeah, what these labs are calling for, what we might want in such a situation, is time for the world to take coordinated action so that societal institutions and alignment research can keep up. And what you really need for that is some way to verify that everyone's following those same rules and actually engaging in that kind of coordinated slowdown.
A
Right. So just to back up for listeners, because Anthropic recently did publish this letter about a need for a global slowdown, but they noted that if one lab chooses to slow down and that doesn't get China to slow down, then they're just basically sacrificing the current lead that they have. And you're back to the basic fundamental arms race, where everyone is racing to build more and more powerful models for fear that if you have a more powerful one, you can use it over me. AKA China gets Mythos and can hack the US before the US gets Mythos and can hack China. Just that paranoia alone creates the kind of pressures for continuing to advance on the capability curve. But we get back to, how could these labs and countries and companies actually verify that they're doing the right thing and that they're going to uphold their agreement? Because we all know they're going to say, oh, I'm going to do the right thing, but then secretly, I'm going to build it in a black project in an underground bunker, a military base or a data center that's buried underneath the earth. And so that brings us to the conversation we're having today. How would you make this relatable to someone who doesn't understand or think about verifying AI treaties? What's a story from history we might point to?
B
So there's a couple of examples from the nuclear space of this fundamental idea of a technology enabling an agreement to happen. One is the seismic monitoring system that allows treaties like the Comprehensive Test Ban Treaty, the CTBT, to happen. We have the technology to detect underground tests: 300 monitoring stations distributed globally across up to 100 countries. Those monitoring stations allow us to detect underground tests, which then allows you to have an agreement that bans underground tests, because without that technology, you would not be able to verify whether the agreement was being complied with. And we did see that every single signatory to this treaty has not engaged in nuclear testing, which is, in my view, a very big success story. Another really interesting one is from the Intermediate-Range Nuclear Forces Treaty. So this was a treaty that was signed to try and prevent a whole category of nuclear weapons from existing, which were those with flight times of less than 10 minutes, because that is extremely dangerous. You don't have much warning. You could attack immediately. And so this was primarily targeted at launchers that were located in Europe, between Europe and the USSR. And this treaty was actually enabled by an X-ray scanning technology called CargoScan. The US and the Soviet Union developed this together and deployed it at Soviet missile factories. And what this technology did is that every single rail car coming out of the missile factory was scanned by this X-ray machine to measure the diameter of the missile, to ensure that it was not one of these intermediate-range missiles that have a flight time of less than 10 minutes. This is cited as the key thing that actually made this treaty around this kind of nuclear weapon possible.
So the existence of that X-ray technology made it such that we could put in place this agreement that we wanted to have.
A
So someone had to just invent the technology and realize that the diameter of the missile alone would essentially be enough of a signal of what was going on in that missile and whether it matched the terms of the agreement.
B
Exactly. And the interesting thing about this is that the reason they used X-rays as opposed to other kinds of scanning techniques is that it gave enough information to measure the diameter of the missile, but not other parts of the design of the missile, which were seen as sensitive secrets that the Soviet Union wanted to keep secure. And so this is also an example of a technology that was sufficiently privacy-preserving that both parties were happy sharing that information.
A
Beautiful example. You hear the phrase all the time in AI governance, you know, trust but verify. And it's almost paradoxical. If you're verifying, you're not really trusting. And this phrase, you know, came from, I think it was a Russian proverb that President Reagan eventually adopted and popularized during nuclear negotiations because he didn't want to say, you know, reduce your arsenal to this many nukes, but then I'm not going to check if you actually did it.
B
I actually learned a couple of words in Russian in order to talk about this with the General Secretary.
C
Doveryai, no proveryai.
A
That is a Russian proverb that says trust but verify. So what does it actually mean when we apply this concept to AI? What are we trusting and what are we verifying? Janet, just curious to hear your answer.
C
The thing that I think makes AI a really tricky case when we're thinking about verification technologies is that the only really certain thing about AI futures is that they're very uncertain. There's just a wide variety of futures that we might want to prepare for and a wide variety of risks that we're continuing to surface today and understand. The one thing I'll go back to and double-click on: Tim talked about the labs highlighting that there's growing demand for potentially coordinating on a slowdown. But I think we're also seeing a policy window open between the US and China, which makes work in this area really prospective. After the Trump-Xi summit, we heard Trump saying that safeguard collaboration was on the table. We also heard Bessent say that the two AI superpowers are going to start talking and set up a protocol in terms of how we move forward with best practices. So it's not just the people at the forefront of AI science who are saying, oh, look, we're starting to feel a bit worried about how some of these risks might eventuate absent international coordination. We're also seeing the signs coming from the world's two superpowers. And so I think what that means is actually starting to build up the verification technology base so that we don't have to trust, so that even in low-trust international environments, we can build the ecosystem where, when rules are set and consensus is reached, we can rely on each other to actually follow those rules.
A
You're mentioning something really important, which is we've believed that you need to have international coordination for a long time, but if I'd said that just three months ago, you would have called me crazy. I mean, look at the political forces in the world. China and the US ever talking about coordinating on AI? That's never going to happen. And it actually did happen at the Trump-Xi summit. And it seems like Mythos was the reason that this happened. Do you agree?
B
Yeah, totally. I think the US and China are both worried about models with the cyber-offensive capabilities of Mythos falling into the hands of criminal or terrorist actors. Obviously there are big consequences for the global financial system and shared infrastructure if these models are misused. And I do think that there's the conversation about misuse by non-state actors, and there's the coordinated slowdown that labs are talking about. And I'm sure we'll get into this in more detail, but these could in principle share a lot of the same underlying verification technology, in a way that today's moment around Mythos could be used really productively in the future if we want that optionality.
A
So we have Mythos, we have China and the US. They're expressing an interest to coordinate, but they don't trust each other. What would be some of the infrastructure we need for the US and China to practically coordinate, and what are the risks that we're protecting against? Because as you mentioned, there's some common infrastructure, and maybe some differences, if we're trying to prevent cyber risk and Mythos-level things versus preventing AI loss of control.
C
When we think about what parts of the AI tech stack are most governable or most observable or controllable or monitorable, compute seems to be the key target here. And that's because when you think about the other parts of an AI model, whether it's the data or the algorithms or the model itself at the end, those things are much harder to control because they're not physical. They can be copied and pasted, they can be duplicated. But compute is a physical piece of the supply chain. It's got a very narrow supply chain. The US and its allies have quite strong control over the compute supply chain currently, although China's starting to work up slowly to try and indigenize their own. So we focus a lot on compute and how that can be used for verification. And then we come to, well, what are you actually trying to verify? What are you trying to do in this space? I think this is where the actual consensus is still working its way through the pipeline. But my strong push here is that you don't need to have consensus on exactly what you want to govern. You want to have the technology ready to allow for that governance once that consensus is reached. And we can see the similarities here in the nuclear paradigm as well. When people reached consensus, they had tactics and techniques ready to start implementing to prove that different states were adhering to their commitments. So the newest chips from Intel, AMD and Nvidia all ship with something called a trusted execution environment or similar, which is basically a part of the chip that acts like a vault: when you run a model or program inside this vault, the hardware can take a fingerprint of exactly what was run and sign it. So the lab itself can say, based on this hardware, based on this cryptographic key, based on the hash we are generating, we can actually show that this is a statement that is true. Now, this is pretty nascent, but people are actually starting to make progress.
The technical reality is there, but how we actually use that for different verifiable claims is still being worked out.
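As a rough illustration of the "fingerprint and sign" idea described above, here is a minimal sketch. It is not how any vendor's trusted execution environment is actually implemented: real hardware uses asymmetric keys certified by the manufacturer, whereas this sketch assumes a shared secret (`DEVICE_KEY`, a hypothetical name) and uses an HMAC purely as a stand-in for the hardware signature. The function names `measure`, `attest` and `verify` are also hypothetical.

```python
import hashlib
import hmac

# Hypothetical stand-in for a key held inside the chip. Real TEEs use
# asymmetric keys certified by the manufacturer, not a shared secret.
DEVICE_KEY = b"hypothetical-device-key"


def measure(workload: bytes) -> str:
    """Fingerprint (SHA-256 hash) of exactly what was run."""
    return hashlib.sha256(workload).hexdigest()


def attest(workload: bytes, key: bytes = DEVICE_KEY) -> dict:
    """Produce a signed report over the workload's fingerprint."""
    measurement = measure(workload)
    signature = hmac.new(key, measurement.encode(), hashlib.sha256).hexdigest()
    return {"measurement": measurement, "signature": signature}


def verify(report: dict, expected_workload: bytes, key: bytes = DEVICE_KEY) -> bool:
    """Check the signature and that the fingerprint matches the agreed workload."""
    expected_sig = hmac.new(
        key, report["measurement"].encode(), hashlib.sha256
    ).hexdigest()
    signature_ok = hmac.compare_digest(report["signature"], expected_sig)
    return signature_ok and report["measurement"] == measure(expected_workload)
```

A verifier who knows which program was agreed on can check the report without seeing anything else about the machine. A report for a different program, or one whose signature was altered, fails the check.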
A
So let's just stop there for a second. So we're talking about compute. By compute we mean chips. And then you're telling me something that I think would be a surprise to most listeners, that the existing chips that are shipping out there actually have some kind of controls on them. It would be like if we're shipping uranium around the world, but then the uranium says, well, if I'm being used for a nuclear power plant, I'll emit this signal, and if I'm being used for a nuclear weapon, I'll emit this signal. Now this would take people by surprise because they think a chip is a chip. It just runs computation. So help people understand. Has this been the case for a long time? When did people put in this system? Because what you're basically pointing to is there could be an optimistic case with AI. There is this finite resource of chips. And so in this bottleneck, you're saying there's actually a way that bottleneck could be controllable so that it could, for example, serve an international agreement.
C
Yeah, I guess these mechanisms on the chip are already built in for security purposes. So when you want to secure boot your chip, you need a very secure component on the chip to rely on. So there's that. What isn't yet as developed is, how do you actually use those components to make verifiable claims? But there's also another aspect here of compute providers themselves using telemetry from chips and how they're being used to say, okay, this cluster was used for training versus running inference. And that again is another area of science where initial research starts to show that, yeah, just looking at the telemetry, not even touching the data that's underneath it, you can start to get indications of what a cluster's being used for.
A
Could you explain what telemetry is?
C
Signals from how the chip is being used that aren't the data inside the chip.
A
So that would be like the electrical signals coming off of a chip or what kind of signals would we be talking about?
C
Yeah, so examples might be how much energy, how active the chip is, the runtime of the chip, the usage of the chip.
A
So we're talking about basically signals that you could pick up through those mechanisms that would tell you when you say training versus inference, just to remind listeners the difference between a chip that knows that it's training GPT6 versus a chip that says I'm only running GPT5. And we might have an international agreement where like everyone's allowed to run GPT5, but you're not allowed to train GPT6 because that would create this risk of some dangerous AI that we don't want to create. And so you're saying that level of difference in the chip architecture would help us do that.
C
So initial research is showing that you can start to differentiate between maybe not training GPT6, but training and running inference. Right.
A
So training in general. We don't know what you're training, but theoretically that does point to something. So just even right here, if there is some agreement that we're going to do, as Anthropic said, a pause of some kind, you would theoretically be able to know, is anyone training any AI anywhere in the world? If all of the chips were activated to use this feature on the chip, versus is everybody just using the chips that they have to run the existing AI model? So you're saying right now the chips that are shipping in the world have that capacity?
C
Right now the chips that are shipping in the world have the capacity that if the person using the chip wants to attest a claim, a positive claim about something, they can often do that.
A
I see. So it's not on the chip.
C
It's.
A
If the person who's running it wanted to run this thing, then they could. And that could be feeding into some kind of structure.
C
Yes.
A
Tim, what are your thoughts on this?
B
Yeah, I guess to restate the principles here, we've talked about the idea that if you want to do anything serious at the frontier of AI development, you need access to a large number of chips. And so the things that you would want to verify are to do with how you are actually using those chips. Are you using them for the stuff that we've agreed is good, like alignment research, or are you using them for things that we've agreed not to do? Like, let's say we've agreed to slow down and we're not going to train the next big model. And so fundamentally you're trying to verify things about how these chips are being used. And I think it's worth talking a little bit more about what makes this possible, which is the fact that this is just such a concentrated supply chain. So if you look at these chips, and Janet mentioned some of the features that are already on them, the reason why it's possible to intervene on this supply chain this way is that something like 90% of the world's AI chips are made by one company, which is Nvidia. So they're designed by Nvidia. 90% of those are manufactured by one company, which is TSMC. So they manufacture the chips at a fab, or fabrication plant. And then around 70% of those chips, when they're sent out into the world and are actually being used, are used by big US cloud computing providers like AWS, Microsoft and Google. So we have this kind of hyper-concentrated supply chain where you only need to coordinate among a few actors to, let's say, propagate design changes to chips throughout the ecosystem, or get visibility over where the chips are, or ensure the chips are all being used according to a common set of rules. And yeah, there's a couple of technologies today, as you mentioned, that make all this possible. One is just cryptography, which is a very widely used kind of technology.
So you can say, hey, I have a piece of secret data. I don't want to reveal it directly, but it contains some information that I want to prove to you. So let's say this is a log of how you're using your chips over a given time period. And you don't want to reveal this data directly, as it might contain sensitive IP that you don't want to reveal to an adversary, but you can share a fingerprint of that data publicly, where the fingerprint only ever corresponds to your private data being this log of how you used the chips. And so in that way you can keep a secret and use cryptography to prove things about it. And cryptography is obviously a very widely used technique throughout the global economy, including in finance and Internet transactions. And so the combination of these two things is actually a really good starting point for lots of the verification applications that we'll talk about today.
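The "public fingerprint of a private log" idea can be sketched as a simple hash commitment. This is only an illustration under stated assumptions: the function names `commit` and `verify_opening` and the sample log format are hypothetical, and a plain hash commitment only lets an inspector confirm later that a revealed log matches what was committed to. Proving properties of the log without ever revealing it would need heavier tools such as zero-knowledge proofs, which this sketch does not attempt.

```python
import hashlib
import secrets


def commit(private_log: bytes) -> tuple[str, bytes]:
    """Return (public fingerprint, secret nonce) for a private usage log.

    The random nonce keeps anyone from guessing the log by hashing
    candidate logs and comparing them with the published fingerprint.
    """
    nonce = secrets.token_bytes(32)
    fingerprint = hashlib.sha256(nonce + private_log).hexdigest()
    return fingerprint, nonce


def verify_opening(fingerprint: str, revealed_log: bytes, nonce: bytes) -> bool:
    """Check that a log revealed later matches the earlier fingerprint."""
    return hashlib.sha256(nonce + revealed_log).hexdigest() == fingerprint


# Hypothetical usage: publish the fingerprint now, reveal the log only
# to an agreed inspector later.
log = b"2025-01-01 cluster-7: inference only, 1200 GPU-hours"
public_fingerprint, secret_nonce = commit(log)
```

Because the fingerprint is published up front, the operator cannot quietly swap in a different log afterwards: any edited log fails `verify_opening`.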
A
So the basic fact here is that if I'm US or China, I don't want to tell my adversary exactly what I'm doing, but I do want to give them the kind of confidence and trust that I'm not doing the thing I said I wouldn't do in the agreement. And so you're saying there's a way that I can keep some of the data of what I'm doing private, but then have a cryptographically verified way that both parties know that the other is not doing the bad thing without revealing the stuff that they are doing.
B
Yeah, that's right.
A
So it seems like there's two things here that I want to raise. So one is you just mentioned that part of the reason why any of this would be possible is because of technically a kind of a problem too, which is a massive concentration of power, that there happens to be essentially a handful of cloud providers, a handful of people who make and design the chips, really, just as you said, Nvidia and TSMC doing the vast majority. And you would need to be pulling from those providers to get the really frontier AI that we're talking about. And then the other thing I heard you mention is just we'd have to know where all the compute is in the world. Just like if there's some dark uranium somewhere in the world that we don't know about that's actually getting sold to some bad actor, then the scheme we have doesn't work. And so can you talk a little bit about what are the things that we need to know about all the compute in the world and do we have the mechanisms to know that or know enough of it that this scheme would work at all? Janet or Tim, do you want to jump on that?
C
Yeah, I think this is the hard part. Where a lot of this comes down to, and this is similar to sort of the nuclear approach as well, is like accounting. You've got to say how much is not declared or unaccounted for. And that's what also happens with nuclear stockpiles as well. I think the difficulty here is that there's already probably a lot of compute in the world that isn't clearly identifiable and isn't tracked. And I think there's a few different ways you can approach this. So the first is that there's a bunch of organizations thinking about what are retrofittable devices that you could add into data centers that are not on the chip itself, but sit next to the chip that can guarantee what a data center or datacenter cluster is doing. And so data centers at the moment are quite easy to identify from space. I think that might slowly change as the UAE shifts towards maybe thinking about building underground because of threats from the geopolitical environment in their region. But I think in general at the moment, really large clusters are pretty easy to identify and find because there haven't been strong incentives to hide that kind of behavior. So there is a possibility to retrofit data centers. And the tech is still nascent, but still emerging. And there's a lot of people working on this to say what are tamper proof processes that you can add into a data center or add next to chips that can also provide some oversight and monitoring of how compute clusters are being used.
A
Essentially we're building up the stack of the different mechanisms at each level that we would need to have some kind of verification. So one is monitoring the supply of compute. The second is knowing where all the data centers are. And you're saying that roughly most of the data centers are built above ground in places that we know. They have heat signatures, you can pick them up from space. And you're also talking about retrofitting data centers with some kind of thing that we're bolting onto the back of them that lets them do the verification. So for example, if the US and China were to sign an agreement, we'd make a map of all the data centers, and then we'd have to do some verification that each of them got this retrofitting and that we did it well. And then there was another element in what you said that I want to make sure people track, which is the tamper-proofness. So yes, I'm putting a tracker on my data center, but here's why I can't just hack that reporting device to give good results while I'm secretly doing a bad thing. So it has to be tamper-proof. Is the tamper-proof aspect well developed and done, or is that still in research?
C
Tamper-proofing is notoriously hard because essentially you're trying to model one of the most sophisticated actors in the world trying to tamper with something, and to make something adversarially robust. That takes a lot of time and experimentation. For a reality check, I think these mechanisms are still a way off, and I think we need more incredible minds and incredible engineers working on these things. And I think that hardening it to make it adversary-proof would be well over a year away, and needs more work. But we can look at the nuclear non-proliferation case study, for example, here. So they have 24/7 camera surveillance of nuclear stockpiles in countries.
A
24/7 monitorable surveillance.
C
That's right. And so the IAEA also has tamper-evident seals, and you also have cameras that are pointed at certain stockpiles 24/7 with live feeds. And I think that sort of analogy can also show that sometimes the solutions can be the very basic bread-and-butter things that are outside the cutting edge of tech, like in-person inspections and, yeah, ongoing monitoring.
B
Yeah.
A
Tim, just curious, what are we missing from this picture of the tools that we need, and how does it compare to some of the lessons we learned from nuclear?
B
Yeah. So there's many different classes of verification technology. Right. We've talked through a few of them. I guess something I want to emphasize is what you get if you take the set of technologies that exist on chips today. So we talked about encryption and confidential computing. Those are two key ones. These give you the ingredients to create a workable verification regime today, but one that is extremely brittle and easily broken by someone who wants to tamper with the chips to remove the features that you have there. Nvidia has built a lot of these features and put them in already. I think that if you try to do something within a 12-month time span, you could potentially get by by layering these fundamental verification technologies on the chips themselves with a bunch of low-tech options of the kind that Janet mentioned, one being human inspections, which is kind of the lowest-technology option. So in the nuclear space, human inspections have played a really big role. Under the New START treaty that governed nuclear weapons, human inspections have been used to do randomized short-notice inspections of missiles to check how many warheads were actually deployed. And the same thing is done by the International Atomic Energy Agency, the IAEA. They do short-notice inspections of uranium production and usage. So they do thousands of inspections a year at places like power reactors and enrichment facilities. And you can imagine something similar going on in the chip supply chain. And it turns out that through the principle of random inspection, you only need to actually do a small number of inspections to make strong claims about the overall stock and where it's located.
A
What would be the resourcing of something like this? I'm sure a lot of money is spent to do all of the International Atomic Energy Agency inspections, the random monitoring, the, you know, the cameras, all of the things.
B
Yeah. So right now there's about 20 million AI chips in the world. This is growing fairly quickly, but right now the total stock is somewhere around 20 million. We did the modeling on this recently. In order to have 90% confidence that they're all where they expect them to be, you'd need to do around 10,000 inspections per year. And so for comparison, the IAEA in the nuclear space does about 3,000 inspections per year. So just like the super manual, super dumb version of this is comparable in scale to what we already do for nuclear, but you don't need to do manual inspections for everything. You can supplement it with technologies that already exist and that we can use. And so the nice thing about chips is they're not sort of dumb rocks like uranium. It's a device that's generally connected to the Internet and is intelligent and you sort of can communicate with it. So another form of doing an inspection is verifying the location of the chip using features that already exist on the chip. And so there's a technique that Nvidia has now implemented, and a number of companies are sort of starting to offer this as a service known as location verification, where essentially you send a ping to a chip over the Internet and it responds. And you can measure that round trip time to figure out how far away is this chip from the place where I'm sending the ping from, because that has an upper boundary that's governed by the speed of light.
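The speed-of-light bound behind the ping-based location check can be written down in a few lines. This is a minimal sketch of the physics only, not of any vendor's location verification service: the function names are hypothetical, and it uses the speed of light in vacuum, so it gives a generous upper bound. Signals in fiber travel slower, and chip and network processing delays add to the round trip, so the chip is usually much closer than the bound says.

```python
# Speed of light in vacuum, in kilometres per second.
SPEED_OF_LIGHT_KM_PER_S = 299_792.458


def max_distance_km(round_trip_seconds: float) -> float:
    """Upper bound on how far away a chip can be, given a ping round trip.

    The signal has to travel there and back, so the one-way distance is
    at most (speed of light * round-trip time) / 2.
    """
    if round_trip_seconds < 0:
        raise ValueError("round-trip time cannot be negative")
    return SPEED_OF_LIGHT_KM_PER_S * round_trip_seconds / 2


def claim_is_possible(claimed_distance_km: float, round_trip_seconds: float) -> bool:
    """True if a chip at the claimed distance could have replied that fast."""
    return claimed_distance_km <= max_distance_km(round_trip_seconds)
```

For example, a reply that arrives in one millisecond means the chip is at most about 150 km from the sender, so a claim that it sits 5,000 km away cannot be true. The check can only rule locations out; a slow reply does not prove the chip is far away.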
A
Wow. And that's in a weird way, I mean, this is way better than we can do with uranium. We can't, like, send a ping to every uranium and then measure the microseconds. So there's actually certain things that are much harder about monitoring and verification for AI, but other things that might be easier because we can use digital tools differently.
B
Yeah, that's right.
A
I think there's a delicate thing in this conversation, which is there's this place where I think people say, oh my God, this is so hard, or is this ever going to happen? And I think there's a difference between something being fundamentally physically impossible versus just extraordinarily difficult, requiring an enormous oomph of effort and resources and coordination to make happen. And I think I hear you saying that it's the latter, but I want to acknowledge that, in many people's eyes, this has been a very difficult challenge for years.
C
Can I just jump in here, just to again link it back to the nuclear analogy? So this same question came up with the Comprehensive Nuclear-Test-Ban Treaty: like, is this impossible or is it just technically difficult? And of course a physicist from RAND, and others, said that you couldn't definitively differentiate the seismic signals from an earthquake compared to a nuclear explosion underground, and that, like, really slowed down progress on pushing for banning nuclear tests for quite a while.
A
So the doubt about whether seismic monitoring could actually distinguish between an underground nuclear test and the rumbling around that hit that sensor versus an actual earthquake, the belief that we couldn't do that, you're saying, stalled progress on building out all the verification mechanisms that ultimately did work.
C
Luckily, the verification mechanisms got built out anyway, and then they were able to show that actually it does work, and they've detected all six of the North Korean nuclear tests since they came into effect. But I think this is another example of how there are often times where the technical answer isn't ready to hand and the science needs to advance further before we have really clear ways forward on how to manage some of these cases that aren't directly in line of sight, like the beyond-frontier risks associated with verification. For me, I think that means that we just need to do more science and more exploration and then use the time that we might have to actually think about and dedicate a lot of compute resources and experiments to how can you actually model some of these risk factors? And are there different mechanisms that you can put in place that preserve privacy, that uphold democracy, but nevertheless lower the risk in general?
A
I think what you're saying is just all so important because it's just legitimizing the idea that if we don't think anything's possible, then we can actually contribute to the worst things happening. And the only way that it's even possible to get to a better world is if we're actually working on it. I mean, if you go back to Robert Oppenheimer in the 1960s, who believed, oh, it's impossible to basically prevent the spread of nuclear weapons, you could say, okay, well then let's just maximize money until the world ends. So let's just sell people uranium and get American uranium all over the world, maximize GDP growth, get a 10% boost in GDP, because we're just selling the whole world uranium, we're not controlling it. But then you basically accelerate directly into nuclear terrorism. I think what I see, and I'm curious if you agree, what I see happening in the tech industry is the first belief is this is inevitable, no one can stop it. Second of all, because it's inevitable, I'm not evil for kind of making it go faster or making it happen because there's nothing that could have been done to stop it. And so there's actually a really big thing here, which is the fundamental belief system about whether something is worth trying here or not. That's kind of the deeper choice point.
C
Yeah, I completely agree. I think we just can't let the perfect get in the way of the good here. And not to again go back to the nuclear case study, but we had that. We had pretty bad non-compliance, which led to improvements. So Iraq was a member of the Non-Proliferation Treaty. And then after the 1991 Gulf War, it was discovered that they had actually had a clandestine nuclear weapons program. The reason it wasn't discovered was that the IAEA was only requiring reporting and inspections on declared sites, rather than broader visits and ongoing surveillance and testing of other sites, including when they had suspicions. And that just led to a protocol being updated. And now there's much stronger proactive discovery and enforcement by the IAEA. And so for me, every time someone says, like, here are the reasons this won't work, or here's all the ways it might fail, I think we're in an environment where the technology is moving so quickly, and I think, how can we iterate quickly? How can we get some of the best minds to be prototyping some of these technologies and preparing for a wide range of futures? So that's what keeps me hopeful in this space.
B
I think it also highlights that there is a really strong role for industry to play here. I think unlike the nuclear weapons case, the AI case is one where it's largely private companies who are developing and deploying these technologies. And many of these private companies have indicated a lot of concern about the risks involved. So we talked earlier about OpenAI, Anthropic, and Google DeepMind all talking about wanting the optionality to slow down frontier AI development to figure out alignment and figure out what to do to prepare society. And that's really promising to see. We're seeing less of that from the chip industry, but I think there's no reason why they can't be on board with this kind of project. And yeah, I think it's worth emphasizing that implementing the kinds of stuff that we've talked about today is really talking about implementing it for huge data centers owned by trillion-dollar companies, deployed in sort of like a relatively small number of sites across the world. This is not a sort of global regime focused on every single person's personal device and how they're using it. This is trying to verify what is being done with a relatively small number of computer chips compared to the total number of computer chips in the world for a specific application, which is frontier AI development. Yeah.
A
So just to reiterate, you're not talking about locking down everyone's MacBook and saying you have to get approval from a global government before you can turn on the computer and launch an app or write some code. You're talking about just adding this kind of monitoring and verification infrastructure for a handful of data centers with the frontier AI systems. But what I hear you fundamentally saying is, you know, the only way we get to this regime is we have imperfect solutions. We keep building from the imperfect. We see where the holes are, we see where the failure modes are, and then we keep going. This is so important. Are we seeing the labs themselves advocating for this and spending money to pass policies that move in this direction, to lobby Congress that this is what we need to do to, you know, get Nvidia to do something? Now Nvidia has its big massive lobby that doesn't want to be regulated at all and doesn't want to be forced to do any controls on their chips because it's going to, you know, slightly diminish the profit margins. Or maybe people in China or other countries don't want to be running chips that they know can be flipped on and off or have a location attestation. Can you speak to that? We've now outlined enough of what could be a set of solutions that may be imperfect, but are a set. And then what are the incentives at play that either push back against this or push towards it?
B
Yeah, maybe it's worth talking a little about what timeline that we're talking about here, because that's sort of very relevant for what the incentives are, what's actually possible in terms of technology development, and how quickly we need to act on all this. So if you talk to a lot of people at the Frontier Labs, they are saying that they expect to reach a level of AI system that can do recursive self improvement within 18 months. If true, that's extremely scary. We only have a small amount of time to act if you really want to come up with a verification regime and sort of an agreement that encompasses these kinds of systems. And the challenge there is if you look at the timeline for needing to make changes to hardware. So let's say, okay, we could set up a verification regime, but we need to do these design changes to AI chips to make it happen. That timeline of 18 months might be prohibitive. So to give you a sense, let's say I had a design ready to go that's like, here's the verifiable chip that everyone can use and we're all going to be safe and be able to trust each other. Design changes like that need to be locked like a year in advance before the chip is first manufactured. So that means my super verifiable chip is first going to go into production in about a year's time. Then it's going to be at least another year before that chip is manufactured at the kind of volumes required before it makes up like a majority of the world's compute. So we're looking at in total, sort of two years minimum from today, assuming that I had my design ready to go before that affordance is out in the world and sort of like allowing a verification regime to happen. So if we're looking at that kind of timeline, we really need to think about what can today's technologies and the technologies of six months from now do to support a kind of basic muddling through kind of verification regime. 
On the incentive side, I think where I would start is that we don't have commitments from the Frontier Labs yet, but we have statements that they're all worried about this. And these are all companies that have massive budgets. They could invest in the kind of technologies needed to kind of test this out. An initial use case that you can imagine is, let's say OpenAI and Anthropic want to do a mutual verification regime between them, where they set up a data center, they run tests on those data centers, they train some models, and they figure out how can we make claims to this other party using the kind of technologies we talked about today. That then can create sort of a technical basis that allows governments to have more trust in relying on this technology in a more broad-based way, and sync up with efforts that are now already underway, that we mentioned earlier, between the US and China.
C
I'd agree with all that. I think one thing that is important here as well is I think broader international buy-in matters. Yes, the US and China are the two AI superpowers of the world and they're the ones whose participation is really critical. But I think the diplomatic aspect of external experts also verifying and attesting that, hey, this is pretty good from a standpoint of verification, I think is really important. And I'll give an example. So in 2020, and this has all been publicly reported, Australia-China relations were pretty low after Australia called for an inquiry into the origins of COVID-19. And so China had reached for the classic economic coercion toolkit. And so it froze out a long list of Australian exports like timber, beef, lobster, all saying that there were technical reasons for this. So live lobsters, for example, were left to rot at the ports because they were waiting on customs checks because they had trace elements of metals and minerals. That was a food safety issue. So Australia came out on the record and said, hey, we've tested this, there aren't problems here, there's nothing to see, but it just became a he-said, she-said. Like, testing it yourself doesn't really shift the dial. And the real resilience comes from when there's independent, internationally backed regimes here, where it's much harder for, like, a single state to write its own truth that serves its own interests. So from my perspective, I think it's really important to keep advancing the science within US actors and between the US and China. But I'm really excited about initiatives that bring in a broader set of stakeholders and experts that have less conflicted interests and can't be seen as, you know, the US trying to pull one over on China or China trying to pull one over on the US.
A
So this is so important and we have such short timelines. How much money and resources and what kinds of talent in the world is working on this? Like, if this is so critical, you'd assume that there's millions and millions of dollars going into it, you know, thousands of people working on it. What is the current state of play there?
B
Yeah, so I don't have rigorous estimates of the total number of people working on this. I would say from a technical perspective, I'm pretty sure that I've interacted with basically everyone who's thinking about this problem and working on the engineering side, and it's definitely less than 50. So the size of this field needs to be massively expanded. I think that luckily there are lots of people in the world with the kind of expertise that's needed for this. There's lots of fundamental research that goes into things like testing whether a chip is actually tamper-proof and what kind of attacks you can run on it to extract the private key that this cryptography that we talked about relies upon. There's many thousands of people in the world who work on these exact problems. And you could create sort of a massive workforce of people who are rigorously red teaming a verification prototype involving these chips and so figuring out what is the attack surface and how do we rapidly patch that. Obviously people who work in chip design and chip manufacturing and sort of like chip supply chains are really useful for this whole thing, if you're trying to think about how do we globally account for where the chips are going and set up proofs about where they're being manufactured and where they are and sort of like tracking those globally as well. I think that lots of this stuff is a good role for government, especially a lot of the fundamental research here. And there's already relevant research programs going on in the United States government, especially at places like DARPA. But also I think the lion's share of this work is likely to need to be done by industry. And I see this being a combination of the frontier labs who have stated sort of an interest in this already, as well as chip companies who are going to be hopefully incentivized to build the kind of technology required if there is some sort of policy requirement for that.
So I think a key role for government to play here is figuring out what incentives to create for industry to start investing in these kinds of technologies and figure out what's required.
A
And what would be those incentive changes? Like, if you were advising Congress right now and we could pass all the policies to incentivize this research in the ways that it would be needed, do you have a sense of what those incentive changes would be?
C
There's some real low-hanging fruit here. Location verification on chips is something that is easy to do. Nvidia already has solutions for it, and it's a very simple thing to switch on, which would greatly reduce export enforcement costs for the US Government, increase enforcement, and ensure that more chip exports can be approved, because there's verification that they aren't ending up smuggled into the wrong jurisdictions. And I think there's already legislation currently before Congress that thinks about this. So the Chip Security Act is the key one here. Tim, you've probably got a range of them too.
B
Yeah. I would say that this idea of making location verification a requirement for chips that are exported overseas is part of this broader class of interventions that I find really promising that we've been calling conditional export controls. So for those who aren't familiar, the US Government currently has a whole bunch of authorities to regulate the exports of AI chips and manufacturing equipment. So it decides who is able to receive those chips. You need a license to do it and the terms under which they can do it. And currently the way this has been implemented is based on the performance specifications of the chip. So you can basically say if the chip is more powerful than this amount, it cannot go to somewhere like China.
Instead, you could start thinking about this as more of an incentive. So you can kind of say, by default we won't export a chip if it's over this performance threshold, but if it has features that make that chip more governable or allow a verification regime to exist, then we'll allow them to go overseas. And the most obvious near-term version of this is location tracking. So if your chip has a location verification feature on it, there should be a policy in place that makes it possible to export that chip to more countries because you've addressed that risk of diversion. There's a bunch of other incentives you could think about, but that's like something that is fully implementable today using authorities available to the US Government.
A
Amazing. And is the main reason that we're not doing this just because of the Nvidia lobby?
B
So Nvidia is certainly powerful. I think there's public reporting about them trying to get involved in both personnel and policy decisions that the Department of Commerce, which oversees export controls here, is making. And obviously they have a strong incentive to want to export as many chips as possible. I think Congress has pretty different views and often mixed views on this. But I would say the default position in Congress at the moment is they are very worried about foreign adversary countries getting access to AI chips and are generally pretty supportive of new export control measures. And I think the advantage of framing this in terms of conditional export controls is you have a release valve. You're not just blanket banning chips from going everywhere in the world, but you're offering a sort of way to increase or streamline export of chips if they are designed in a way that sort of gives us what we want, which is the ability to essentially prevent them from being misused.
A
Right. You know, here we are, the labs are saying there's 18 months to recursive self improvement. If you could imagine the perfect timeline for how humanity would proceed from where we are to land in a safe place over the next 18 months, could you just, you know, say a few lines about what would happen in the midterms, what would happen next in the US China relationship? We would pass the Chip Security Act. We would make sure that location attestation was on Nvidia chips. We'd have academics, you know, accelerate research in all computer science departments across the country. We would have AI labs and employees, you know, lobby their employers saying, we're not going to continue working for you until you use your power as an AI lab to actually, you know, advocate for these solutions. Like give us a taste of what would happen in that story if things were to go well.
C
Yeah, everything you said sounds great, actually. No, I think key things that really stick out to me that will make the world go well is I think we need more US-China engagement. Yes, we need to be advancing the science with scientists at home. But I think until there are those diplomatic channels to rebuild trust and to actually engage on the shared risks, we're going to be starting from a standing start and we need to be starting from a running start on these issues. We've seen what happens when one country manages risks in a way that doesn't actually uphold the needs of other countries. For example, the engagement around COVID-19 in the early days really, really left a lot to be desired. And I think we need to be engaging now on best practices, on actually sharing some of the information about the risks here and engaging on what does a meaningful way forward look like, where both actors are coming to the table and discussing these things at length.
B
Yeah, I'll say, I think one just critical precondition to make this all go well is having institutions within government who actually understand these technologies, track these risks, and can update based on evidence and provide the appropriate information to, in the US, like, the White House, but sort of senior policymakers globally. Right now, in the United States, there is just one organization, the Center for AI Standards and Innovation, or CAISI, which is responsible for actually tracking the capabilities of frontier models, understanding the trajectory of that risk profile and then reporting to the rest of government on this. And that is an office with about 30 staff and a budget of about 15 million. We need to massively level up these kinds of capabilities in order to provide policymakers with the information about whether and when they should work on verification and what kind of R and D programs are required, and all the kind of information that you need about what is your risk, what is your threat model, when should you act. Right now, this doesn't really exist in the United States, and the state of these kinds of institutions globally is still fairly nascent. The UK has the most mature version of this at the moment, but I think there's massive room in the near term to just level up these kinds of capabilities. And this is kind of like a no-regrets move, right? The government should just be tracking and understanding this technology in order to figure out whether we should be making some of these decisions and technical investments that we've talked about today.
A
One other sort of thing I'd add to your list of interventions that I'd love to see is, imagine at the UNGA conference coming up, where all the world leaders are gathering in September, you had basically an obligatory tabletop exercise where all the world leaders walk through what happens as countries escalate towards these crazier and crazier AI capabilities. It takes about two and a half to three hours. Daniel Kokotajlo, a former podcast guest, runs these. And policymakers come to various conclusions that some kind of agreement or coordination with other actors in the world will need to happen to end up in a safer outcome.
C
Yeah, I think that really flags the stakes there. You just said racing, Tristan, and I think that's exactly right. The race dynamics are real, both between labs and then between countries. And it means that a lot of the negative externalities aren't incorporated into everyday risk management because there's such a need for speed. And so actually having the means to slow down if the labs are pushing for that, or reach agreements about what isn't helpful, I think is really important.
A
This has been really inspiring. There's so many examples of how we can verify things that I think most people, if they just use their own basic intuition, they'd say, this is impossible. There's no way we could do it. It looked that way to everybody building nuclear weapons, and had we given up, we would have ended up in a different world or may not even be here today. And I think what you both are working on is so critical and so important. And Janet, we're just meeting and I just want to say, you know, when I met Tim I remember just saying, wow, I had no idea that people were working on this. And it was so inspiring that people had worked and thought so hard about this in the few years that we have to make this happen. And I know that it must feel lonely and there's so few people working on it. The amount of funding going in is so much less than it should be. But just thank you both for doing God's work in wanting to see that there's something possible here. Because if we don't live from that leap of faith that something might be possible, we'll never find the path. And you all are living role models and examples of that possibility. So thank you so much for coming on Your Undivided Attention.
B
Thanks, that's too kind. And yeah, thanks for being willing to talk with us about what is unfortunately still a very niche topic. So it was really cool to chat this through with you.
C
Thank you. It was really great to be here.
A
So just imagine thinking back, you know, it was all looking so bleak in this moment when humanity was racing to these dangerous AI capabilities and it seemed like nothing could stop it. And then we woke up and realized that this wasn't inevitable. In the next meeting between the US and China, they did tabletop exercises so that policymakers on both sides could game out the AI race clearly. And that motivated work for international agreements, just in the same way that Mythos raised the stakes, moving from inaction on verification towards protective measures. Safety-conscious employees at AI companies, who are now flush with cash from IPOs, invested their personal money and resources into this. We started a Grand Verification Challenge, a prize competition where AI safety institutes across the world and technical universities role-played a treaty and even tried to break that treaty without getting caught. New independent verification organizations tested all major models before release and cryptographically signed the model weights to attest that they had passed specific safety evaluations. And if a model was deployed without that fingerprint, the deployment was rejected. VC funds started accelerating, investing into hardware verification companies and projects like Lucid Computing and Flexible Hardware-Enabled Governance. We started treating untracked dark compute like we did enriched uranium during the nuclear arms race. And ultimately it got us to a point where we could agree on common sense red lines or even slow down AI deployment. There's a common misconception that verification has to be perfect, but nothing in security is 100% perfect. It just has to be expensive enough to hack into so that very few people could afford to do it. Now, none of this is perfect, but we need to start somewhere and we can make it better over time.
Your Undivided Attention is produced by the Center for Humane Technology. We're a nonprofit working to catalyze a humane future. Our senior producer is Julia Scott. Josh Lash is our researcher and producer. Mixing on this episode by Jeff Sudakin, and original music by Ryan and Hays Holladay. And a special thanks to the whole Center for Humane Technology team for making this show possible. You can find transcripts from our interviews, bonus content on our Substack, and much more at humanetech.com. And if you like this episode, we'd be truly grateful if you could rate us on Apple Podcasts or Spotify. It really does make a difference in helping others join this movement for a more humane future. And if you made it all the way here, let me give one more thank you to you for giving us your undivided attention.
Hosts: Tristan Harris, Aza Raskin (Center for Humane Technology)
Guests: Tim Fist (Director of Emerging Technology Policy, Institute for Progress), Janet Egan (Senior Fellow, Center for a New American Security, CNAS)
Release Date: June 18, 2026
This episode explores the urgent need for international treaties and verification mechanisms to ensure global coordination on AI safety—drawing historical parallels with nuclear non-proliferation—and lays out how building technical and diplomatic infrastructure now could enable effective, enforceable AI governance in a climate of low international trust.
Timestamps: 05:54–08:23
Historical Analogues:
‘Trust but Verify’:
A phrase from a Russian proverb adopted during the nuclear era, now central for AI: “What does it actually mean when we apply this concept to AI? What are we trusting and what are we verifying?” — Tristan Harris (08:58)
Timestamps: 09:15–11:36
Unpredictable Futures:
“The only really certain thing about AI futures is that it’s very uncertain.” — Janet Egan (09:15)
Current Moment Is Unique:
US and China engaging diplomatically on AI (e.g. recent Trump-Xi Summit) around model ‘Mythos’ with cyber-offensive capabilities—a rare opportunity for practical coordination.
Timestamps: 19:49–24:06
Accounting Is Key, but Hard:
Verification means keeping a global “ledger” of where high-power chips are—like uranium tracking for nuclear non-proliferation.
Retrofitting Data Centers:
Emerging efforts to bolt on tamper-proof, accountable modules to existing centers for “white box” observability.
Physical Surveillance:
Borrowing from nuclear practice: 24/7 camera monitoring, tamper seals, and random inspections are possible, especially since large data centers are easily located (for now).
Tamper-proofing Remains Challenging:
Still an R&D challenge to make inspection modules robust to sophisticated (even state-level) tampering.
Grounded in hard lessons from nuclear history, this conversation reframes the challenge of AI treaties as daunting but not impossible. There is a narrow but real window to apply lessons from the past, mobilize engineering and political will, and develop robust verification technology—building imperfect systems and iterating as risks and technical capabilities evolve.
The question is not whether perfect solutions exist, but whether we will choose to invest and act.
“There’s a common misconception that verification has to be perfect, but nothing in security is 100% perfect. It just has to be expensive enough to hack into so that very few people could afford to do it... Now, none of this is perfect, but we need to start somewhere and we can make it better over time.” — Tristan Harris (48:50)