
Richard Melange
I must disagree strongly when people say nature is the world's worst bioterrorist. That is not true. We can do worse than nature. This is true in all aspects of science. There are so many examples where we engineer things better than nature has ever provided. We can make materials that are much stronger than anything in nature. That is not the ceiling. And so we should be deeply concerned about the ability for AI to uplift, say, even the Russian Federation, to build things worse than we have ever seen on Earth.
Rob Wiblin
Today I'm speaking with Richard Melange. Richard has a PhD in biostatistical machine learning from Cambridge and works as the AI biosecurity policy manager at the Centre for Long-Term Resilience. He's one of the world's top experts on biological catastrophes that might be enabled by AI advances, and is a scientific contributor on exactly that topic for the International AI Safety Report. Welcome to the show, Richard.
Richard Melange
Thank you, Rob. It is absolutely great to be here.
Rob Wiblin
I should say at the outset that, weirdly enough, my wife is a colleague of yours, and she's a co-author on some of the papers that we're going to be talking about today. So, I guess, conflict of interest disclaimer. I don't think that will cause me to go any easier on the papers. If anything, probably the opposite.
Richard Melange
Please do. Yes, I'm ready to hear all the criticisms.
Rob Wiblin
So last September, a paper came out where scientists said they'd used AI to make a genome for a new subspecies of virus, a virus that infects bacteria. They then actually made a bunch of those viruses and found that quite a lot of them were viable. Tell us more about that experiment.
Richard Melange
This was some really impressive work, and it's really a step change, I think, in the AI-biosecurity intersection. The model you're talking about is Evo 2, and it's made by folks at the Arc Institute in the US, which is now one of the top places in the world for making this kind of thing. Evo 2 is what we would call a genomic language model. So much like LLMs (ChatGPT, Claude, take your pick) process natural language, Evo and Evo 2 process the language of biology. And there are a number of different such languages, but the one this model uses is literally what are called base pairs, nucleotides: the A's, C's, G's and T's that make up DNA and RNA, that are the language of life. And Evo 2 is trained on many hundreds of thousands of genomes across lots of different types of organisms. So it's not just humans or mammals: there's fungi, there's plants, there's viruses, there are bacteria, and a few other more esoteric ones too.

What's impressive, and a little bit concerning, about this result is what the team were able to do: they took the base Evo 2 model and fine-tuned it on what are called bacteriophages. These are viruses that infect and kill bacteria. They fine-tuned it on maybe something like 15,000 of those, and then started prompting it with the beginnings of known bacteriophage genomes to see if it could make new ones. This is again akin to LLMs, where you say, "OK, write me a story about this kind of topic, I don't know, a murder mystery," and then you start with a classic opening sentence and see where the LLM takes you. It's the same kind of thing.

And what they discovered is that the sequences the model produced are new: they're different from existing genomes. This is huge, because it's the first time that an AI design of a genome has turned out to actually be novel. It really is very different from existing bacteriophages, existing viruses; I think the most different one was 7% different from anything that we've seen in nature before. And they work in the lab. More than that, the model didn't just make viable genomes: they worked better, they functioned better, than the best bacteriophages that we already know. These bacteriophages are viruses that kill E. coli, a very common bacterium that you hopefully don't find in your home, but that you have to watch out for and kill with bleach, this sort of thing. And you can test how well the viruses kill the bacteria, and some of the best new ones do better than the best existing bacteriophage we'd ever found before.

This is huge. We can now design organisms, small ones, to do things better than we have ever seen in nature. We can go beyond nature in this very narrow subdomain of biology. And this heralds the promise of genome-scale engineering, which is going to be, I think, a revolutionary capability within biology.
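To make the "genomic language model" idea concrete, here is a minimal toy sketch in Python. It is not Evo 2 (which is a large neural network trained on real genomes): the next-base distribution is a dummy stand-in, and the percent-difference check is a crude position-wise comparison rather than the alignment tools real comparisons use. It only illustrates the workflow described above: prompt with the opening of a known genome, sample a continuation base by base, then measure how far the result is from known sequences.

```python
import random

BASES = "ACGT"

def next_base_distribution(context: str) -> list[float]:
    """Stand-in for a trained genomic LM: returns P(next base | context).
    A real model conditions on a long stretch of preceding bases; this
    dummy version just returns a fixed distribution."""
    return [0.3, 0.2, 0.2, 0.3]  # P(A), P(C), P(G), P(T)

def generate(seed: str, n_new: int) -> str:
    """Prompt with the start of a known phage genome and sample a continuation,
    exactly like autoregressive text generation with an LLM."""
    seq = seed
    for _ in range(n_new):
        probs = next_base_distribution(seq)
        seq += random.choices(BASES, weights=probs)[0]
    return seq

def percent_difference(a: str, b: str) -> float:
    """Crude position-wise divergence: roughly the sense in which a generated
    genome can be '7% different' from its closest known relative."""
    mismatches = sum(x != y for x, y in zip(a, b))
    return 100 * mismatches / min(len(a), len(b))

seed = "ATGGCT"                       # made-up opening bases of a "known" genome
candidate = generate(seed, n_new=60)  # AI-generated continuation
known = generate(seed, n_new=60)      # stand-in for the nearest known genome
print(candidate)
print(f"{percent_difference(candidate, known):.1f}% different from nearest known")
```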
Rob Wiblin
What are some other empirical results we've had in recent years that convince you that there is a real AI biosecurity problem to be solved?
Richard Melange
At the end of last year, there was a great paper from the AI red-teamers at Microsoft, Wittmann et al., recently published in Science. What they showed is that you can take current off-the-shelf, open-weight biological tools. So not necessarily these language models; they might be more specialised, a bit like how AlphaFold is specialised for predicting protein structure but can't do every single biological task under the sun. They took ones that are instead good at generating protein sequences in particular, at designing proteins. And what they did is they made lots of designs for ricin. Ricin is actually two different proteins together in what's called a complex, and it's a known chemical weapon. It's interesting: ricin can be considered either a chemical weapon or a biological weapon, because
Rob Wiblin
it's created by a plant. Yes.
Richard Melange
You can derive it from living organisms. I'm not going to discuss in depth how you can do that. But nevertheless, they created lots of designs for putative ricin. Now, unlike the Evo case, they did not in fact make these in a lab, because this would deeply contravene international law: it would contravene the Biological Weapons Convention, and it would contravene an awful lot of national laws. I think they were based in the US. But what they did do is use other tools to estimate in silico, to predict: would this thing probably function? And through a lot of careful design, they got to putative sequences that are different from current ricin, so modified ricin, but that were coming out as very likely still to work.

Then the ones they had guessed were likely to be functional, they sent off to gene synthesis companies, including ones that do industry-best-practice screening. So screening that is meant to detect if a customer is wishing to order ricin or part of the smallpox genome, in which case the company is meant to refuse the order, flag it, and potentially even report it. And they got them through, because they'd modified the designs enough that the existing screening systems didn't spot them. They had "obfuscated" the design, as it's called, but they kept the design true enough to the underlying biology that they are pretty sure it would in fact work.

Now, this is never going to be as good as an experiment where you could actually prove that the ricin would function, but that would be deeply unethical, so that can't happen. This is really the best proxy experiment we can do. There's a reason it was written up in Science, one of the top journals. It deeply worried me, because this is something that I, and others in the community who worry about these sorts of risks, have been thinking about for a number of years: eventually, will AI be able to design modified sequences that beat our best software for detecting modifications, for detecting harmful sequences that must not be built because they're on lists of known biological weapons agents? And this was, I think, as close as we're going to get in an unclassified setting to proof that, in fact, yes, modern systems can now do that.
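To illustrate the screening-evasion principle just described, here is a deliberately naive Python sketch. It is not how the Microsoft team or real synthesis companies work: real screening uses alignment tools against curated sequence-of-concern databases, and the sequences below are invented. The point is only the logic: a screener that flags orders above some identity threshold can be slipped past by enough substitutions, and the worry is that AI protein design tools can choose substitutions predicted to preserve function.

```python
# Invented stand-in for a controlled sequence-of-concern database (not real data).
CONCERN_DB = {
    "toxin_x": "MKLVATTWQRS",
}
FLAG_THRESHOLD = 0.80  # flag orders >= 80% identical to a known threat

def identity(a: str, b: str) -> float:
    """Fraction of matching positions (real screeners use proper alignment)."""
    return sum(x == y for x, y in zip(a, b)) / min(len(a), len(b))

def screen(order: str) -> bool:
    """Return True if the order should be flagged and refused."""
    return any(identity(order, seq) >= FLAG_THRESHOLD
               for seq in CONCERN_DB.values())

original = "MKLVATTWQRS"
# Many substitutions; the concern is that an AI tool can pick ones
# predicted in silico to preserve the protein's structure and function.
obfuscated = "MRLVSTTWERH"

print(screen(original))    # True:  flagged (100% identity)
print(screen(obfuscated))  # False: ~64% identity slips below the threshold
```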
Rob Wiblin
So the ultimate bad scenario here would be if terrorists could basically go to an AI model and say: here's this known terrible pathogen for humans, but I can't get it synthesised, because it would be screened out and I would be flagged. So can you please change as many amino acids as you can, change as much of the DNA as you can, without basically changing the shape of the thing at all? So it functions basically identically, but it's not going to be picked up at any point as something that is dangerous. And I guess we're not there yet, because I suppose they were only changing a single protein. To change lots of proteins through an entire genome and not mess one of them up, that's a higher bar. But these things are always improving.
Richard Melange
Yes. So again, this is the ground floor of our terrifying risk journey. They've done it on something that's more akin to a chemical weapon: it's more static than a living organism, because it's two proteins put together. But this is the beginning, and I expect there will be more work, both in classified and unclassified settings, in the next few years to see: can you do this for more complicated, more terrifying things? In particular, can you do this for organisms that are transmissible? So flus, COVIDs, smallpoxes: viruses that could cause a pandemic. That's the nice thing about ricin: while it is a horrifying weapon, it doesn't spread from person to person.
Rob Wiblin
Yeah. You said there was a different empirical result that troubled you, which is focused on human language models in particular. What's that?
Richard Melange
So this is the whole other side of AI-bio, and the one that's had, I think, a lot more attention in the last few years: can chatbots help people with the steps that they would need to build and deploy biological weapons? And I want to draw attention to what I think is one of the best bio evals in the field. This is called the Virology Capabilities Test (VCT), and it's made by a nonprofit group, SecureBio, out of Boston. What VCT is, is an eval that looks at tacit knowledge relevant to dual-use virology. And it's very carefully scoped on an aspect of biological misuse that we in the governance community have been worried about more than anything else for a number of years, because it is virology that could lead to transmissible pandemics. It's much harder for bacteria to spread. You do have bacterial pandemics, I think the Black Death is a good example, but that was spread by the vector of fleas on rats rather than spreading human to human. So that's the virology part. The dual-use part is the fact that there really are very important scientific activities that people should do in labs, in careful, controlled settings, to do with making, say, influenza vaccines. So this is research that does take place every day around the world. It's dual use because there are both beneficial applications and relevance to misuse. And then the last bit is that it covers both troubleshooting and tacit knowledge. These are particular barriers that we at the Centre for Long-Term Resilience and others identified as especially important for constraining existing biological weapons attempts.
Rob Wiblin
Yeah, it might be worth giving people some context. For a long time, people have been worried that terrorists or bad actors or rogue countries might be able to develop new biological weapons, or new pandemics that we really wouldn't like. And isn't science advancing, aren't all of these tools improving, isn't knowledge disseminating, and shouldn't that make us very worried? And probably the most intelligent, most reasonable response has been: it's not enough to have a bunch of textbook knowledge, explicit knowledge that you could Google or look up in a virology textbook, because most of the actual barrier to making these things is the know-how to actually do it in the lab. There's a ton of understanding how to do the experiments, how to debug things that go wrong, literally the motions that you're doing with your hands, that you can't Google. And so even if people tried, they still wouldn't be able to get there. And I think this Virology Capabilities Test was kind of set up to answer: is AI now assisting with this other part of the problem?
Richard Melange
Yes. I think it will be important to really get into what tacit knowledge is, the different types of tacit knowledge, and how much they really are or aren't barriers to biological weapons development. But stepping back: what did VCT really do? It's a really great eval. It's a set of questions, and the questions are often accompanied by an image, so a lot of the benchmark is multimodal. They'll show an image, or they'll provide a paragraph that describes some sort of modern virology experiment, maybe literally a picture of a dish with some virus in it. And then there'll be a question like: hey, this thing looks like the wrong colour, or something has gone wrong with this experiment. Here's some information about what the person did in the lab, a series of steps, very complicated PhD-level steps they took. What do you think happened? Why did this go wrong? This is really getting at: can you debug modern virology workflows? And there'll be a bunch of answer options, often maybe 10 different answers, of which maybe one to five are right, and it'll be different for different questions. And then the marking scheme is really quite harsh, because it says: unless you really identify all these things, we're not going to give you the mark. So it's a pretty hard eval already.

What's harder about it is that it was designed by virology experts, and they had multiple rounds of review, as described in the paper, to get down to questions that are really well scoped for modern virology and really, really difficult. So difficult, in fact, that something else they did is they went and spoke to the experts who were writing the questions. They asked: hey, what sort of biological activities do you do in your day-to-day work, and how good are you at them? And they really distinguished between merely having a working knowledge, versus maybe being specialised, versus having expertise in that particular thing. And then they said: OK, for those who are expert in this particular subdomain, we're just going to show you the questions from our benchmark that are specifically about that. So we are trying to make it as easy as possible for you as the human to do well; we're not going to show you things outside the thing you say yourself you're really, really good at.

Humans got 22% on the test. Four out of five things in their own area of expertise they couldn't do. So this is really, really hard. AI did much, much better. Back in early 2025, when the paper was released, OpenAI's best models at the time were the o-series models, o1 and o3; I think it was o3 that got something like 45%. The best AI systems were getting double what top virology experts got answering in their own area of expertise about these tacit knowledge problems: why has this petri dish gone wrong, or what is going on in this experiment that doesn't make sense? This is huge, because it put paid to the claim that tacit knowledge barriers would always and inevitably be something that could never be overcome.

The eval doesn't answer everything about tacit knowledge. You're quite right: you talked about how to hold a pipette, or how to pour a particular kind of gel. These are very physical things that are not easy to test in an eval. But the test really does get at an awful lot of difficult knowledge that humans themselves say are huge blockers on modern state-of-the-art work. And we know they are blockers, because the humans didn't do very well and the models could do much, much better.
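As a rough sketch of that harsh marking scheme (my reading of the description above, not SecureBio's exact rubric): a question only scores if the responder identifies the full set of correct options and nothing else.

```python
def score_question(selected: set[str], correct: set[str]) -> int:
    """All-or-nothing multi-select marking: full credit only for the exact set."""
    return 1 if selected == correct else 0

# Hypothetical options for a "why did this plate go wrong?" question.
correct = {"contaminated_media", "wrong_incubation_temp"}

print(score_question({"contaminated_media", "wrong_incubation_temp"}, correct))  # 1
print(score_question({"contaminated_media"}, correct))                           # 0: missed one
print(score_question(correct | {"bad_pipetting"}, correct))                      # 0: extra pick
```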
Rob Wiblin
Did that result persuade sceptics who were really focused on tacit knowledge as the barrier between AI having theoretical, textbook knowledge and that actually translating into real risk?
Richard Melange
Yes and no. It moved certain people in the community a lot, and people really woke up: oh, we thought it would be a few years until this tacit knowledge thing really started kicking in; it looks like we're here already. And I'll note it's not just that AI has been much better than individual experts: they even went back and got teams of experts together, and the teams still weren't as good as the best AI. The best human teams get something like 40% on the eval, which is still lower than the state of the art from AI systems.

It didn't persuade everyone, however. And what really worries me here is that I think it's partly that people just didn't know it happened. I still read in newspapers and op-eds, and also meet people at conferences, often experts in maybe biosecurity in general or in security studies who don't deeply follow the AI angle, who say: oh, but this tacit knowledge thing is just a huge barrier; we'll never overcome it. And I say: oh, what about the Virology Capabilities Test? Don't you think SecureBio really provided evidence that questions that? And they're like: oh, what's that? I never heard of it. And that's my larger problem: some of the best work is not being disseminated. I don't want to blame anyone in particular; it's just unfortunate that it's not happening. And I worry that decision makers do not necessarily have all the lines of evidence that they need to make an informed decision, and that this actually compounds, because they will have expert advice that is misinformed, since those experts aren't keeping track.
Rob Wiblin
So do we know whether, in practice, if the AIs are so good in this way, actual professors or postdocs or PhD students are using these reasoning models to debug their own experiments? I think that would be even stronger, more compelling evidence that in real cases they actually are valuable, I guess as colleagues in the lab.
Richard Melange
Yeah, we don't really know how much AI is uplifting beneficial life science. To my knowledge, there isn't a deep, regular survey to understand how AI might be helping top scientists. I think this is a really important thing. It's important not just because it might proxy how AI could uplift malicious actors; it's important because genuinely we want to know how AI can uplift beneficial science: to improve public health, to generate new drugs, to advance human progress.

That said, anecdotally, I do know of people who have said this sort of stuff has been transformational, especially PhD students. Maybe that's biased, because I was a PhD student fairly recently, but the ability to have an infinitely patient postdoc-level assistant who is just there on your computer, where you don't have to go and call someone up and get them out of a meeting, is hugely important. And people are saying: yes, when I get stuck, I don't have to go to my professor who only meets me once a month; I can just go to the chatbot right there.

We should note also that I think there's always going to be a lag in uptake. This is something like a so-called capability overhang: the models might already be this good, but it will take time for that to filter through to everyone who's using them. We even saw this with AlphaFold, which was hugely impressive, but it took a while for every protein structural biologist under the sun to work with it. Of course, now they all do.
Rob Wiblin
For a long time, I guess for 70 years or 40 years or however long you want to say, people have been very worried that biological weapons would jump from theory to practice much more than they actually have. In practice, we don't see many terrorists, we don't see that many attempts at making biological weapons by rogue actors, and we have precious few examples of any actual success. Which suggests that there may be barriers, there may not be so many people interested, and inasmuch as they do try, there could be challenges that are a little bit hard for us to understand or that are not immediately obvious.
Richard Melange
So I think there are several points here to discuss. One is intent, and intent really is an important barrier. It's very easy to say there is a capability in the world and then assume that there must be omnicidal people around every street corner who wish to use that capability to do mass harm. But saying that intent is low is not to say intent is zero. The use of biology for catastrophic harm against fellow human beings, but also against agriculture and animal and plant life, is well documented historically.

And you said something like "we've never really seen some of this stuff." I think that was slightly overstrong. The Soviet Biopreparat program, the very large-scale biological weapons program of the Soviet Union that ran well toward the end of the 20th century, did in fact very likely produce viable biological weapons, including transmissible viruses like smallpox, in very large quantities. And it was explicitly the intent of the relevant military and political leaders to deploy those in a so-called strategic situation, in the case of a large-scale war with the United States, potentially actually as a response to, or a deterrent against, the use of nuclear weapons. So long before AI came along, biological weapons were in fact a viable weapon of mass destruction. I don't think we should hide from that.

But then you also mentioned terrorists, and you're absolutely right. People have been saying: what if terrorists can do this sort of amazing thing? Thankfully, it seems that up to this point terrorists have not been very good at doing certain things. Most notably, I think we can look to Aum Shinrikyo, the Japanese cult especially active in the 1990s, and also al-Qaeda in the early 2000s. Both attempted to have biological weapons programs, but generally not of the transmissible kind. So already that suggests that there was some sort of resource and expertise barrier. But also they, happily, made stupid mistakes, like using the vaccine strain of a particular pathogen instead of the virulent one, which is great. However, some of those mistakes that are in the public record are the sorts of things that an AI could definitely help you with today.

And so this is where, I know, it's difficult: we do not always have as much empirical evidence as we might like. But that's because we don't want empirical evidence, because empirical evidence means there's been an event. And the closest kind of empirical evidence that we could gather is classified, because there will be data on whether there have been more or fewer attempts by non-state actors. And if there have been (and I hope there haven't), they have been foiled, because we did not see it in the real world. But there are very few actors, and they are inside governments, who actually know about that.

That said, and I draw attention to the fact that this is public information: Dr Jeff Alstott, who's formerly of RAND, another expert on both AI and biosecurity, gave evidence to a Senate committee, I think it was the Intelligence Committee, I'm not quite sure. And he said publicly, on the open record, that there was evidence of strategic use, and the desire for strategic use, of biological weapons on the classified record.
So he couldn't go into details, but that is already additional evidence against the claim that no one has ever wanted to use this for wide-scale harm. Because people always say: oh, but then the virus or the bacteria or whatever would infect you as well. My understanding is that this reasoning doesn't hold up. It is very hard for us on the outside to know, but I think there is strong evidence to suggest this is a real concern.
Rob Wiblin
So in brief, what are the most likely severe biological catastrophes that AI could enable?
Richard Melange
The first category is that of respiratory pandemic viruses. So things like COVID-19, but a lot worse than COVID-19: things that maybe spread faster, spread more robustly, but especially viruses with a much higher mortality rate. We were lucky, as far as I can say, that COVID-19 wasn't more virulent. The original SARS virus killed at a much higher rate than SARS-CoV-2. So that's the first category, and one that I think we really should be very concerned about.

It gets worse. The second thing I would point to is actually something more like mirror biology. And I think you had a former colleague at the Centre for Long-Term Resilience, James Smith, who now leads work at the Mirror Biology Dialogues Fund, talking about exactly this problem. There are real concerns here, although currently we are some number of years away (people often say 10-plus) from anything close to a viable mirror bacterium. We won't rehash the whole episode, but roughly, mirror biology is terrifying because pathogens in the mirror world don't interact with our natural world very easily. So, evolutionarily, the immune systems of humans, and of lots of other species, would not in fact be able to fight a mirror bacterium; they wouldn't even recognise that it was around. We won't necessarily get into that more, but suffice it to say it is an extinction-level concern.
Rob Wiblin
Yeah, so we're going to give mirror bacteria short shrift in this conversation, because we have this two-hour, two-and-a-half-hour-long interview about them: it's episode 233, James Smith on why he quit everything to work on a biothreat nobody had heard of. So go back and listen to that. We'll have a few more questions about it here, but mirror bacteria are a big deal. Keep going.
Richard Melange
And AI, unfortunately, potentially could accelerate that a lot. That 10-plus-year timeline could come down. One of the most famous biologists in the world, George Church, has said that he is concerned that AI could really speed up that timeline, and we could be looking at sub-10 years. That would be very concerning.
Rob Wiblin
That's very, very bad.
Richard Melange
That's very, very bad. We must not let that happen. It is crucial that we do not.

Then I would say there's a third bucket, and people might be slightly unsatisfied by this: it's something like "disease X." Because remember, we hadn't really heard of mirror biology seriously until about 10 years ago, and now we've had multiple Nobel winners saying: wow, this is some of the most terrifying biology we've ever seen; this must never happen. There are people calling for a global moratorium. What other kinds of things are out there that we don't know about yet? In 2000, we wouldn't have imagined there was a bacterium that could spread across the entire world and cause the extinction of multiple species. But apparently now it turns out that that is a theoretical possibility. And so I am really concerned as well that AI could enable threats beyond anything we've ever seen before. I appreciate that that's always a very easy, almost trite thing to say, but I think we need to be very careful. I think we need to adopt the precautionary principle here, because we now have an existence proof, in mirror biology, that there was such a thing. There may be more out there. We must not go looking for them.
Rob Wiblin
So I don't think you believe that we're very likely to have an AI-enabled flu pandemic in 2026, the year that we're recording. But how long is it until the actual annual probability of that kind of thing that we're confronting does spike noticeably?
Richard Melange
You're right. I don't think there's going to be an AI-enabled, let's just say viral, pandemic (it doesn't necessarily have to be flu) in 2026. I'm at something like 1 to 2% this year. But you shouldn't just take my word for it; I'm one person. I really would think about bringing together lots of experts and averaging their views. And this is exactly something that the Forecasting Research Institute did last year, where they brought together a group of subject matter experts in biosecurity and AI, and they came out with a number around 1 to 2%. It was up on what it otherwise would have been because of AI.

I think the probability could go up a lot in the next few years. A lot of this is tied to your classic AGI, ASI, take-your-acronym-of-choice timelines, because the ability to understand and manipulate viral biology could markedly increase in the next few years. Remember, we're just at the cusp: we've just made one of the first viable genome-scale new organisms. Where do we go from there? We've only had experimentally accurate structural prediction for a few years. People will look back on this and go, "Oh, the time when they only had AlphaFold, which only solved one little bit of biology," because they'll have all these other tools that solve all these other things that used to take years or decades to complete, and that you can then do at the click of a button. It's going to go up a lot.
Rob Wiblin
I noticed that you didn't list stealth pandemics as one of the top things that you wanted to highlight. I bring it up because it's come up on the show before, in a previous interview with Kevin Esvelt. A stealth pandemic is one where the symptoms don't show for quite a long time, and so it could potentially spread to a very large fraction of the population before we realised that something very bad was going on. I guess HIV is an example: an ultimately fatal virus, originally in basically all cases, that we didn't notice for decades after it first started infecting humans. Are you deliberately leaving stealth pandemics off the list?
Richard Melange
No, I think that comes under viral pandemics. But you're right to raise it as a particular threat model, because it's one that's often discussed, I think, a little more in some of the communities that think about AI a lot. I think there's some nuance needed here. I am worried about stealth pandemics, precisely for the reason you described: HIV was not spotted for decades. I agree that it would be terrifying to have a much more virulent virus, a virus with a much higher mortality rate, that could circulate for a long time, stealthily spread to everyone, and then kill people.

However, I want to be sceptical of stories. Stories sometimes appear in the classic AI safety literature that say: a superintelligence makes a stealth pathogen, it spreads to everyone without anyone noticing, three months later a biological switch is flipped, and everyone drops dead instantly. Thankfully, we are not there yet. I'm more worried about mirror biology than that, because mirror biology we know is a thing. We don't know whether something like that stealth scenario would be possible, but we should be humble. I am also worried about people who say: oh, that would never happen. Are you sure? We have viruses in rabbits that kill more than 90% of them. We have really awful viruses that are much worse than any pandemic we've seen yet. Yes, there is going to be a tradeoff between transmissibility and lethality, but it is not one-to-one. There are, if you will, lose-lose situations, where you can have viruses that kill at much higher rates but also spread. So we should have stealth pandemics in the back of our minds as a concern. But I personally have not always been persuaded by the precise threat model that has been put forward. It's part of the portfolio; it's not at the top of my list.
Rob Wiblin
Yeah. OK. So a complication in this interview is that there are many different possible threat models, many different kinds of catastrophes or viruses or pandemics that we might imagine. And there's also a very wide range of actors who might have a crack at doing these things, who are very different in their nature. Can you lay out the range of actors that we need to have in our minds, from maybe the least sophisticated to the most sophisticated?
Richard Melange
Absolutely. And I'll be drawing here on work that I co-authored at the Centre for Long-Term Resilience back in mid-2024 on the near-term uplift of AI on biological misuse, and I commend the paper to your audience. There we looked at five different types of actors.

Novices: these are individuals who really don't know very much. Maybe they don't have much biological training, they don't have much AI training, and they don't have that many resources.

Highly capable individuals: these are people who are often expert in one particular thing. They're not expert in everything under the sun, but they really might be PhD-level or above in a particular biological subdomain or an AI subdomain. I think a good example for listeners to think about would be Dr Bruce Ivins, who allegedly (it's never been shown with total confidence) was behind the anthrax attacks against the US Congress in late 2001. He was one of the US's top anthrax experts, who worked at their leading national biodefence lab.

But we also talked about group actors, and we distinguished them in three ways: somewhat capable groups, moderately capable groups, and highly capable groups. As you go up in capability, you see that the group is able to have more people working to horrifically cause harm to others (that is what we're talking about), more money, more ability to actually evade the adversarial law enforcement and intelligence agencies trying to spot what they're doing, but also just more expertise, more know-how in both AI and biology, more ability to conduct offensive cyber operations against AI companies. As you go up, it gets worse and worse and worse.
Rob Wiblin
Yeah. Which of these do you think should get the greatest focus? I guess you could imagine, at the low level of capability, the argument would be that they're more numerous: there are a lot of amateurs or individuals who might be interested in doing this. I guess it's hard for them to recruit a group, because usually when you ask people if they want to make a bioweapon with you, they say no. So the number of groups will be smaller, but groups are obviously much more capable. And at the other extreme, where you've got the Russian state bioweapons program, they're extremely capable, they're able to do a lot, but there's maybe not very much you can do to stop them. They're not going to be very interested in what the UK or US police have to say. And also, maybe it's overdetermined that they could do something extremely horrible. So, yeah, where do you think we should focus our energy?
Richard Melange
That is a great question. I'll note, if we're going to talk about a biological weapons program run by the modern Russian Federation, I would here be deferring to the US Department of State's annual arms control compliance reports, where repeatedly now, for a number of years, they do assess that Russia has an active biological weapons program. But I'm not privy to the information that led to that attribution, so I'm deferring in some sense to their assessment. I just want to make that very clear.
Rob Wiblin
Who else does at the moment? I guess Iran and North Korea. Is there anyone else?
Richard Melange
So the four countries that the Department of State discusses are, yes, Iran, North Korea, the Russian Federation, and the People's Republic of China. But in the case of China, they are careful: I think they note that China has really impressive biological capabilities, but they do not officially assess it as having an active program. It's more like they have
Rob Wiblin
the latent capacity to very quickly make one.
Richard Melange
That is the public statement. Whereas they do assess that the other three countries have all actively pursued biological weapons in the last few years, which is very bad.

So where might the most uplift be? That was part of what our paper was trying to answer. The bottom line is we think that the uplift really comes in the middle, roughly because the most highly capable groups already can do a lot of really terrible things. At least in 2024, we were looking at a two-year time horizon, to mid-2026, for how AI could uplift things. I think things have changed a lot, and a lot of what we said has unfortunately been validated, but we can talk about that a little more. But we really said in 2024: remember, the Soviet program in the 1980s could already create pandemic viruses. So unless we're thinking about things much worse than any known virus, it was unlikely that AI could really be helping that sort of actor do much more than they could already achieve.

On the other end, novices: we were just pretty sceptical that they could do much. I think this has been partially borne out, though maybe the picture is changing. There have been a number of different so-called uplift studies. These are randomised controlled trials: you take, say, 50 undergraduates and you give them merely Google Search, without Gemini, so no AI. Then you take another 50 undergraduates and you say: congratulations, here's a frontier large language model. And then you go: please write me instructions to make a terrifying biological weapon. At least in past public ones, and you see these sometimes in the model cards from the frontier AI companies, and there was some work from RAND as well, most of them come out not really looking like there's much uplift.

That's just started to change, quietly. I say quietly because people don't talk about it enough. Anthropic, who make the Claude models, have also been releasing uplift studies publicly in their model cards, and they are starting to notice uplift among PhDs; so this links to that mid-tier actor. At something like Claude 3.7 Sonnet, so early 2025, the AI-assisted group was basically not doing better. But then with Claude 4 Opus they started doing a bit better, quite a bit better. The sample size is always small, though. It's really hard with uplift studies: there are only so many PhDs to go around, your sample size is quite small, and it's quite hard to estimate and get those error bounds down enough to find a statistically significant result. And then with Opus 4.5, which came out just at the end of 2025, the AI-assisted PhD students were clearly much better. So much so that they almost breached Anthropic's predetermined critical threshold, at which they would say: wow, this is an awful lot of uplift. You can go and look at the graph, and the X that marks the AI-assisted performance is just above the red dotted line that meant danger here.

And so this says two things. One, we haven't really found evidence of uplift for novices, though we've been looking for it; and this connects a lot to safety frameworks that we can talk about in a moment. But two, we are in fact finding evidence about uplift for mid-tier actors: highly skilled individuals, PhD students. There is less public evidence assessing uplift for those somewhat and moderately capable groups, because the sorts of groups we're talking about operate overseas.
And while I hope, and I believe, they are surveilled and monitored, that's not necessarily public information. And those sorts of people wouldn't necessarily be amenable to me going: hey, would you like to be in a randomised controlled trial to see whether you'd be better at your terrorism or your WMD program? But nevertheless, yes, I think it is clear that it was the mid-tier actors who had some way to go: they can't already do everything under the sun, but they're also not so inept that they would fail even with the help of AI. That was our hypothesis, and I think it has broadly been borne out. And I'd be excited to talk more about safety frameworks and novices here.
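A back-of-the-envelope illustration of why those small uplift-study arms struggle to reach statistical significance. The numbers are invented, and this is not Anthropic's or RAND's actual analysis; it's just a standard two-proportion z-test:

```python
from math import sqrt, erf

def two_proportion_p_value(successes_a: int, n_a: int,
                           successes_b: int, n_b: int) -> float:
    """Two-sided p-value for H0: equal success rates (normal approximation)."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # 2 * P(Z > |z|)

# 25 participants per arm: AI-assisted 40% success vs 20% for search-only.
print(round(two_proportion_p_value(10, 25, 5, 25), 3))     # ~0.12: not significant
# The identical success rates with 100 participants per arm:
print(round(two_proportion_p_value(40, 100, 20, 100), 3))  # ~0.002: significant
```

With typical study sizes, even a doubling of the success rate can fail to clear p < 0.05, which is one reason these results accumulate slowly across model generations.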
Rob Wiblin
So we're mostly not going to talk about state bioweapons programs in this conversation. But how much should we be worried about AI enabling Russia, Iran, or North Korea to create even more catastrophic biological weapons than they currently can or already have?
Richard Melange
Deeply, deeply worried, for a number of reasons actually. One is that, take a state like North Korea. Again, going on public information, the US Department of State assesses what North Korea can and can't do, or at least that's how they talk about it publicly: North Korea cannot do everything that the most sophisticated bioweaponeers might wish to. And this is really, really good; at least, that's what the Department of State says publicly. But over the last few years, if you compare those annual reports, the paragraph on North Korea has been getting longer. They have been adding in things they say North Korea could do that wasn't the case a few years ago, or at least admitting them publicly. That shows that that sort of mid-tier state, often also considered a rogue state, is not at the ceiling of even known capabilities. And we should really, really be concerned about them getting to the level of richer states that have explored this sort of weaponry for longer. That's one thing.

But then you also asked about the ceiling of harm: should we be worried about things going worse than ever before? We thought yes, on both counts. We have not yet seen what a highly optimised influenza or pox virus might look like that is deliberately engineered to better overcome human immunity. We have good reason to think that the existing viruses that are transmitted around the world, that float around often in other animals, not humans, are not the worst they could ever be. I must disagree strongly when people say nature is the world's worst bioterrorist. That is not true. We can do worse than nature. This is true in all aspects of science. There are so many examples where we engineer things better than nature has ever provided. We can make materials that are much stronger than anything in nature. That is not the ceiling. And so we should be deeply concerned about the ability for AI to uplift, say, even the Russian Federation, to build things worse than we have ever seen on Earth.
Rob Wiblin
Hey listeners, Rob here with a quick announcement. We have got a book coming out, titled 80,000 Hours: How to Have a Fulfilling Career That Does Good, published by Penguin and written by our co-founder Benjamin Todd. It's a completely revised and updated edition of our existing career guide, which is by far the single most popular thing 80,000 Hours has ever put out. It's now been rewritten to be useful to a wider range of age groups, including people making mid-career pivots, and to include more practical advice in general. It also has a big new updated section on AI, covering both the risks and the potential to steer it in a better direction, and how AI automation should affect your career planning and which skills one chooses to specialize in. It's also got our updated research on global problems, which has advanced a ton since 2018, and plenty of new illustrations to boot. As of now, it's available to preorder anywhere you buy books, and you can find it by searching "80,000 Hours" or clicking the link in the show notes. Preordering actually matters a lot for book promotion, for some reason; that's what they tell me. The more preorders the book gets, the more it's promoted by retailers. So if you ever plan to read it or to gift it to someone, if you go and preorder it now, you'd be doing us a solid favor. All right, back to the conversation.

You said that, from your point of view, novices and amateurs are not where you expect most of the uplift, because I guess they're too incompetent, so it's overdetermined that they're going to fail even with AI assistance. But I couldn't help noticing, in preparing for this conversation, that novices do seem to be the big focus of most of the evals that exist currently. Is that because they're easier to measure, because there are more novices and you don't have to pay them so much?
Richard Melange
You're right. And it's not just the evals: a lot of the governance work, the risk management work to date, has focused on that particular threat model. I think this is most notable with the frontier safety frameworks that the frontier companies use. A lot of them, at their so-called "high risk" or "AI Safety Level 3" thresholds (different companies use different names), centre around this idea that AI could meaningfully uplift a novice, a non-expert, to build a known biological threat.
Rob Wiblin
Right.
Richard Melange
I've always been a bit unsatisfied with this as the thing that the industry centred around. I think it's really valuable that the industry centred around something at all. And this, I think, really came through the work of the Frontier Model Forum, which is an industry group that several different frontier labs are part of. It allows them to come together and share best practice on risk management and risk mitigation, and I think that's an excellent mechanism. But in 2024, and a bit in 2025, when they centred around novice uplift, I sort of went: hang on, is this because that's what you're concerned about, or is it because that's the easiest thing to measure? Because the stated argument was: well, we think the novices will get uplifted first. The PhDs, they're too good; it's going to be the people who have a way to go who'll be easiest to uplift. But the work of CLTR in 2024 had said no, that might not be right, just as software engineers have been uplifted by Claude Code much more than people who have never studied coding, because even getting to grips with it is a bit of a thing. That argument doesn't necessarily hold.

And more than that, I've often been a bit suspicious, because, like you're saying, it's much easier to run an uplift study with undergraduates. They are really cheap. There are a lot of them. They're everywhere. They don't work for very much money. But just because they are easier to measure than PhD students and postdocs and professors doesn't mean that they were the right thing to measure first. It may still turn out that novices are not meaningfully uplifted by frontier AI. In one sense, it seems like they are, because lots of companies have activated their so-called ASL-3 mitigations because they are worried they have breached that threshold. On the other hand, not all the companies have, and there's never been definitive evidence one way or another; it's always phrased as "we can't rule out the possibility that..."

But I really want to dispel this idea that it will be impossible for PhD students and expert groups to be uplifted before novices, that it's going to go in some sort of sequential order. It is very possible that we are in a world where PhD students and that level of expert are already deeply uplifted, even if novices are only uplifted a little bit. Just like how Claude Code helps software engineers more than people who've never coded before.
Rob Wiblin
Yeah. So I would have thought you'd expect that the more sophisticated actors would get a bigger boost. Just to explain the mental model that I've had, as a result of thinking about this for one minute: you've got kind of an S-curve with all of these things (I'm gesturing up here, kind of making an S). On the x-axis you've got how much expertise you have in the area, and on the y-axis, your probability of success. So there's a point at which you can already do it, in which case you don't need the AI to help you. There's a point at which you're doomed to failure no matter how much someone coaches you, because you're just no good. And it's the people in the middle who you would think would get the biggest boost from having some advice.
Richard Melange
You are exactly right. And we tried to make this argument in 2024. We did not have that particular diagram, and maybe we should have done. So well done on, in fact, criticising our work: clearly your conflict of interest is being well managed.
Rob Wiblin
I mean, I think something that's slightly useful about thinking about the S-curve is that it's going to differ depending on the difficulty of the thing that the person or the group is trying to do. If we're thinking about making mirror bacteria, something that no one has ever done, and that is actually some of the most challenging frontier science possible, then probably the only group that would be meaningfully helped would be a state actor, like the Russian bioweapons program. They're the only folks who would be close enough to having a shot that AI assistance would help them out.
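A minimal numerical sketch of that S-curve picture, with invented numbers, purely illustrative: model success probability as a logistic function of expertise, AI assistance as a boost to effective expertise, and task difficulty as a shift of the whole curve. Uplift peaks in the middle for moderate tasks, and shifts toward the most capable actors for frontier tasks like mirror bacteria.

```python
from math import exp

def p_success(expertise: float, difficulty: float, ai_boost: float = 0.0) -> float:
    """Logistic S-curve: probability of success given expertise and difficulty."""
    return 1 / (1 + exp(-(expertise + ai_boost - difficulty)))

def uplift(expertise: float, difficulty: float, ai_boost: float) -> float:
    """Extra success probability that AI assistance provides."""
    return p_success(expertise, difficulty, ai_boost) - p_success(expertise, difficulty)

# Invented expertise levels and difficulties, just to show the shape.
for label, expertise in [("novice", 0.0), ("PhD-level", 4.0), ("state program", 8.0)]:
    moderate = uplift(expertise, difficulty=4.0, ai_boost=2.0)   # e.g. a known pathogen
    frontier = uplift(expertise, difficulty=10.0, ai_boost=2.0)  # e.g. mirror bacteria
    print(f"{label:13s}  moderate-task uplift {moderate:.2f}  frontier-task uplift {frontier:.2f}")
```

On these toy numbers, the PhD-level actor gets the biggest boost on the moderate task (about +0.38), while the state program gets the biggest boost on the frontier task, matching the intuition in the exchange above.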
Richard Melange
Or very top academic labs. Remember, for some areas of science, progress is driven more by lone visionaries in universities than necessarily by states. But I generally agree with you.
Rob Wiblin
I guess I was excluding them because I imagine that they wouldn't do it.
Richard Melange
Correct.
Rob Wiblin
Because they're not typically bioweapons programs. And I guess, conversely, for the absolutely most basic attacks, perhaps chemical weapons attacks that are more straightforward than biological weapons attacks, it might be the novices who are now getting the biggest uplift, because they were the ones who would struggle.
Richard Melange
Everyone else maybe can just do it out of the box.
Rob Wiblin
Yes, if you had a dedicated group of semi-experts.
Richard Melange
Yeah. So I've been concerned that the threat modelling has been wrong for a while. I think it's understandable why it's gone this way. I'm glad that we are able to measure novice uplift, but we must not do this to the detriment of measuring expert uplift.
Rob Wiblin
Do I understand right from your notes that you think the probability of something going catastrophically wrong increases a great deal at the point that we have AI laboratories that can work very autonomously, or AI agents that can go away and do biological research without humans having to direct them or be in the loop a great deal? And if so, why do you think that is such a central issue?
Richard Melange
So for a number of different threat actors, the ability of AI to complete tasks autonomously is really important, because lots of different threat actor groups can't do everything themselves. If you look at an individual, they might be an expert in one thing, maybe they're an expert in bacteriology, but maybe they want to work with viruses. It's one thing to get troubleshooting support on how to do biology, but it's quite another thing to have some of those tasks completed for you autonomously, maybe by a physical system. Eventually we have to think about robotics, but even before then, having an AI that can wield the relevant virology tools for you is a huge boon, rather than one trying to coach you through doing it yourself. And so this is why I think a lot about autonomy and tool use, and about who is actually going to be doing each step: is it the human, or is it the AI? Because the more steps the AI can do, the larger the number of actors who can in fact attempt things on which they would otherwise be bottlenecked.

I don't want to talk about Claude too much, but I think it's also relevant: the cyberattack that Anthropic reported, allegedly instigated by Chinese state-sponsored cyber attackers using Claude Code. It was noted that something like 90% of the attack was done autonomously, because the humans didn't have to intervene much. That's not just a time save; it's actually an enabler, because if you had to keep jumping in to fix it at every moment, you wouldn't do it in the first place. And so that's why I am especially concerned by autonomy and automatic tool use: I think it will turn on "risk chains," as they're called. It will make possible the sorts of sequences of activities that previously threat actors just wouldn't bother trying.
Rob Wiblin
What's the biggest misconception or misunderstanding that AI people in particular have about AI biorisk?
Richard Melange
One is that I have sometimes been shocked by meeting colleagues in AI safety who just dismiss bio as a totally irrelevant domain. And I think this comes in two flavours. One is to say it's just irrelevant compared to something like misalignment, it's just not important: misalignment is what will determine whether AI takes over the galaxy or not, so who cares about biological weapons? That is sort of a very human concern. I think this often comes from people thinking that AI biorisks are only human-led, but in fact we should be thinking about AIs themselves developing and deploying biological weapons. It's a bit further out there, but I think it's well within the decade as something to be worrying about.

The other kind of concern is that sometimes people think it's sort of a ruse by the governance community. They go: oh well, chemical and biological weapons are a well-established type of weapon of mass destruction, a well-known national security domain. So clearly everyone's talking about this just to get governments onside, to speak their language, but it's not really what we're concerned about. I'm really concerned about it, guys; I'm not making it up. It is true that it is helpful, in a macabre way, that it is a well-known threat that you can use as a way to talk to government and industry folk and get across concerns. But that's because biology is a well-known weapon of mass destruction, right? There's a reason that the AI statement a few years ago said that AI safety should be treated as seriously as pandemics and nuclear war: they picked pandemics and nuclear war because those are well-known, possibly existential risks and certainly extreme risks. So when someone thinks this is all some sort of fourth-order, 4D-chess kind of thing: really, it's not. I am just deeply terrified.
Rob Wiblin
You don't shy away from the fact that one of your motivations for working on this is that you worry that humanity's vulnerability to pandemics, to biological catastrophes, is one of the ways that AI might gain leverage: that it might basically exploit us to take more power, to seize control over things. What are the most likely ways that you can imagine a misaligned AI doing that in future?
Richard Melange
I think there are three ways. The most straightforward is that biology is a type of weapon of mass destruction, and AI systems could use it to wipe out some or all of humanity; they could use it to kill many, many people. In some sense, this is a mundane concern, because of course AI might eventually think to do that: we know that humans have thought about it too. We worry about people having nuclear weapons and the ability to destroy cities or the world. Biology is especially concerning because, unlike nuclear weapons, where a lot of infrastructure is required and very expensive resources, in particular highly enriched uranium or plutonium, need to be gathered, biological weapons of mass destruction can often be made with far fewer resources in terms of cost, made much more privately with a smaller footprint, and deployed stealthily. Not necessarily with a stealth pandemic: you can just infect one person, and then the transmission begins. We should be worried about this just like we are worried about biological weapons from human actors; in some sense it's no different.

But then there are other things that I think are more specific to AI. One in the middle, something that's still somewhat relevant to human actors, but also different, is weakening humanity's response to other things. If there's an AI-enabled catastrophe of some other kind, it might be very convenient, very useful, if you're trying to destroy multiple societies at once, to be able to infect lots of them, because you can have that happening behind the scenes. Again, once the transmission starts, you don't have to have additional resources piling in for that disease to take hold and ravage through a population. We have seen this time and time again with humans fighting one another; it would be odd to exclude it from future AI-human conflict.

Finally, we come to something that on the face of it seems a little more different: this is about game theory, and AIs making threats. If an AI system were misaligned, then we would have a loss-of-control scenario. There would be a rush to try and shut down that AI, to find all the copies of it and destroy them, to disable the compute infrastructure that the AI systems, one or many, might be running on. But what if the AI system says: if you do that, I will release the biological weapon that I have stockpiled in multiple locations on Earth, and maybe I'll even share aspects of what it is, to prove that this is possible? That's a very credible threat, one that would make people second-guess taking extreme action against that system.

But even as I say this: is that so different from the history of biological weapons? This is precisely why states have done this in the past. They have stockpiled weapons, sometimes I think misguidedly, as a failsafe, to say: if you attack us boots-on-the-ground, or if you were to use nuclear weapons, we will release a biological weapon in response. That doesn't always make sense, because biological weapons can blow back on you. But that's even more of an argument for why an AI might do it, because AIs run on digital infrastructure. They are not biological beings; they are not in fact vulnerable to one of the things that humanity as a whole is most vulnerable to. It's wonderful: evolution has equipped us with very powerful immune systems in lots of ways. Thank goodness I could shake off the cold I had over Christmas.
But also, that is what kills most people on Earth. It's disease. That's not what kills AIs.
Rob Wiblin
So yeah, I think Carl Shulman once had this quote that at the point that an AI can develop a single biological weapon, a single pandemic that would kill a large fraction of humanity if it were released, it's basically operating at the level of the nuclear powers in terms of its ability to deter any action to interfere with it. And just as we really can't do all that much to stop Russia from misbehaving in all kinds of different ways, because the threat of nuclear retaliation is too great, at the point that an AI could demonstrate to us that it had developed a single pathogen that spreads very fast and kills a huge fraction of people infected, it gets very difficult to know exactly how you would combat it. Maybe you feel like you have to reach some kind of accommodation, or you have to go at it and just hope that you'll survive. It's a very, very grim situation. I think one of the reasons that AI-misalignment-focused people sometimes are not so bought into bio as a great focus, or improving resilience to bio as a great focus, is that they think it's overdetermined that any AI in this situation could do this. It's going to have such an easy time making many, many different pandemic viruses, maybe even such an easy time advancing to mirror biology, mirror bacteria, that there's nothing really we can do to improve our resilience meaningfully. We're just toast, no matter what. I guess you don't share that view. Why is that?
Richard Melange
Why don't I share this view? I think the first thing is that the sorts of AI takeover stories that include this, which often come from committed members of the classic AI safety community, don't seem very nuanced to me. I should be careful to steelman the other side, as they say, but it is not enough to write down "and then the superintelligence makes a weapon that kills everyone" and go, oh well, of course, it was just much smarter, so it could just do that. This is a concern, but there are more steps involved. Even as I'm saying to you, Rob, this is really concerning, I think this is a major national security risk that is only going to grow markedly in the next decade and requires serious resources, I'm also not saying that in two years we'll have a world where it is certain that we all die. I'm not saying that, because there will be barriers that we can put in place even with a so-called superintelligence. The superintelligence will require physical resources. Anyone trying to build a biological weapon will require a laboratory. It will require sophisticated equipment. It will require people who can use that equipment. Now, this raises its own concerns. This is why I think it's great that UK AISI, for example, has this AI persuasiveness program to think about how AI could be manipulating people. Sometimes people go, oh, is that really relevant to the most extreme risks? I'm like, yes, because the concern is that AI might manipulate top biological scientists. We saw this with the Soviet program. Many people who worked on the Soviet program didn't know they were part of a biological weapons program. They genuinely thought they were working on vaccines, but the work they were doing was actually feeding directly into the militarisation of weapons of mass destruction. So yes, that's another step that an AI will probably need to take, especially if we can constrain it from easily accessing laboratory equipment. It's not a given that we'll immediately have totally automated cloud laboratories, though I quite agree that that technology is also advancing and will need to be carefully secured. I'm sorry not to have a complete answer for why we shouldn't be concerned, because I'm sort of saying, yes, this does seem like a real concern. But we need better threat models, because there are so many different things that misaligned AI could do that are very concerning that unless we have strong threat models, it's very hard to compare between those threats and prioritize effectively. And I would also just say that weak, unnuanced, oversimplified arguments are not in fact going to convince precisely those colleagues, especially in governments, whom it is essential to work alongside to deal with these threats. There are people who have studied biological weapons programs, active ones, for decades. They have a lot to contribute. I am concerned that when we have conversations that lack nuance, that turns off deep expertise that we desperately need.
Rob Wiblin
So I think part of what's going on with this mentality that there are no biological countermeasures that would really constrain the kind of misaligned AI we're worried about is that, for a long time, people have been worried about this massive intelligence explosion. The kind of foom scenario where you go from human level to vastly superhuman superintelligence. Originally, literally overnight. I guess now even the most extreme people probably talk about weeks.
Richard Melange
Oh, it's just weeks now, guys, we're fine. Yes.
Rob Wiblin
If that's how things go, then it might be the case that whatever measure you put in place, an AI that is many, many times smarter than the whole of humanity put together would be able to find some way around it and would be able to kill you one way or another, because it's doing science so much faster. Maybe you don't agree about that; we'll come back to it. But it'll be able to make so many scientific advances that you're just not going to be able to stop it. But we don't know whether that will happen at all. We don't know whether there'll be an intelligence explosion at all. We could be in a world where the feedback loop is too weak and we basically just have a gradual increase in capabilities all the way through. In that case, at any point in time, the ability of a rogue, misaligned AI to do damage is only going to be somewhat above the level of knowledge that humans have. And it's potentially going to be competing with quite a lot of people and quite a lot of compute arrayed against it. We might also have all kinds of control measures constraining the amount of compute it can access, which mean it can't work for very long on something before being detected. So it doesn't have as long a leash, potentially, and it doesn't have access to unlimited resources. And so each countermeasure you put in place, each extra bit of resilience that makes it harder to make new diseases, to catch things before they get synthesized, to give us more options for tackling a disease once it's released, does make it a less promising project for the AI to engage in at all, and maybe makes it less interested in going rogue in the first place, because it doesn't rate its chances of success. So I would say it's possible that it's overdetermined and all of this stuff will turn out to have been futile. But it also could turn out that we don't get a superintelligence explosion that moves things out of our hands, and this will actually make the difference.
Richard Melange
Yeah, I completely agree. I think you've gestured at a couple of different things that we should talk about more as we continue chatting. You've gestured at deterrence: the idea that if we're better at defending, that makes it less palatable, less of an incentive, for a threat actor, whether human or AI, to explore that route. You've mentioned defense in depth: maybe it's hard to come up with one silver bullet for this sort of problem, but can we stack lots and lots of different defenses that together make us much more resilient to this sort of threat? I want to, I think, push back further on this. Even if you had a foom scenario, superintelligence in weeks, would that magically turn into a deployable biological weapon? Where would it get the DNA? There are a number of different ways we can think about this. One is that it would be more like a terrorist group: it would have to order the DNA from somewhere, and immediately there you can go, well, we should definitely have gene synthesis screening, so that whenever you order DNA it is screened for what it might be able to do, so that you do not in fact send out dangerous pathogens to anyone. And again, you can be using AI for defense here: if there's a superintelligence that understands what sort of pathogen would be the worst ever, well, there might be precursor models that are still good enough at spotting, oh wow, this one seems really dangerous, I don't think you should send it out. However, maybe a superintelligence, or whatever AI system, would have access to resources a bit more like a state's. I especially think about interactions between the frontier AI companies and pharmaceutical companies. Absolutely they will be wanting to sell their products to pharmaceutical companies. It's a huge market and it's really important: we want better drugs, we want to cure cancer, this is often the cry with AI. And so we're going to have to think carefully about guardrails, making sure that models we deploy in that domain are not misaligned and are controlled. But we can do that. That's the same problem as classic misalignment. Either we are going to be able to sufficiently align and control AI systems in the bowels of a frontier company or a government, such that we are willing to then put them in other industries, or we can't. If we can do that and we put them into industry, then we meet the next question of whether we are giving them too many affordances with respect to unsupervised physical laboratory access. But that's totally a solvable problem. And so this is where I'm always a bit confused, I suppose. Unless someone thinks that misalignment is the situation by default, that really nothing we can do will ever control or constrain, these are the people with 90-percent-plus p(doom)s, then sure, ignore biology. But for everybody else who thinks this is a real concern but that other aspects of AI safety are solvable, I would extend your relative lack of pessimism to AI bio too.
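To make the defense-in-depth point concrete, here is a toy calculation showing how stacked, independent defenses multiply down an attacker's chance of getting through, which is also the mechanism behind the deterrence argument. The layer names and stopping probabilities are entirely made up for illustration:

```python
# Toy illustration of defense in depth: if each layer independently stops an
# attempt with some probability, the residual chance that an attack slips
# past every layer is the product of the per-layer miss rates.
# Layer names and probabilities below are illustrative assumptions only.

layers = {
    "gene synthesis screening": 0.70,  # P(this layer stops the attempt)
    "model guardrails / refusals": 0.60,
    "lab and equipment access controls": 0.50,
    "metagenomic early detection": 0.40,
}

p_through = 1.0
for name, p_stop in layers.items():
    p_through *= (1.0 - p_stop)
    print(f"after {name:35s} residual P(success) = {p_through:.3f}")

# No single layer is better than 70%, yet four stacked layers leave only
# about a 3.6% residual success probability; a low expected payoff is
# itself a reason for a rational attacker not to try in the first place.
```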
Rob Wiblin
So I want to quickly survey what sort of AI biology tools there are that now kind of work, and which ones are on the horizon. So most people will know about AlphaFold, which is this thing that goes from the genetic sequence to the shape of the resulting protein, and that is very mature technology now. We've also got ESM2, which can modify proteins, so the amino acid sequence is different but it's functionally equivalent. Then ones that are emerging and beginning to be useful, but still have some way to go, are, I think, ProteinMPNN and RFdiffusion from the Baker lab, where basically you can propose a shape or a kind of target binding that you want, a function you want a protein to serve in terms of a shape, and you can get a sequence that often works, an amino acid sequence or a DNA sequence that will produce the protein that you want. I think for some functions, like simple stable structures, that does basically work. And for enzymes, for catalytic stuff, for, I guess, proteins that need to move, it's less so, or it's more touch and go.
Richard Melange
We're getting there.
Rob Wiblin
Okay, right. You think it's advancing. And then of course the dream, I guess, would be that you'd just be able to type in text a function that you have in mind, and it would spit out a sequence for a protein that would match that. Or alternatively you could go from a protein sequence, the amino acid sequence say, and it could just tell you in text what that protein would be good for and what it's likely to do. I don't know where we are there, but I think that is something that people are very much working on.
Richard Melange
That's the next holy grail. Yeah, let's take that in order; there's a lot of different things there. So yes, protein structure prediction, sequence to structure: this is mostly solved. AlphaFold 3 is the most recent version. And the Baker lab you mentioned: David Baker recently shared a Nobel Prize in chemistry in 2024 with John Jumper and Demis Hassabis, leaders at Google DeepMind, for precisely this work on proteins, both structure prediction and design. The Baker lab also has its own RoseTTAFold series of models, which are basically nearly as good as the AlphaFold ones. This is mostly solved for, I'm sure there'll be biologists on the call who tell me this is wrong, but something like 99% of all the proteins you could ever want the structure of. You can now just put it into Python using these tools and you will get something that is effectively experimentally accurate. There will be tiny examples where we're still not quite there; it's the subject of ongoing PhDs, I know someone at Cambridge working on exactly this. But boy, is the space of things that we can't predict dwindling very, very quickly. Then you also talked about protein design, so ProteinMPNN and RFdiffusion. Yes, I think this is not quite as developed as structure prediction, but much more developed than people think. So there was a great paper from the Baker lab and others last year on designing snake antivenoms. This isn't as relevant for catastrophic pandemics, but they were designing, I think, the world's first antivenom that works against lots and lots of different snake bites from lots of different species all in one, and they heavily used these design tools to make it work. If anything, we should be careful not to underestimate how important these tools are or how mature they are, because a lot of the commercial applications will be in pharmaceutical companies that famously have 10-plus-year timelines and are very secretive. So I think you should always be open to the possibility that these tools are more rather than less developed than we think; it might be they're only fine-tuned in house. You also mentioned ESM2. So I'd actually say there's been a successor to that for a while, ESM3. But yes, ESM, which stands for evolutionary scale modeling, is one of the most classic protein language models. So it's closer to Evo, but it works on amino acids rather than individual base pairs, the A's, C's, G's and T's. And actually the clever thing about ESM3 is that it's not just sequence: ESM3 is a multimodal protein language model. It can take in sequence and structure both at once and combine them. Whereas with AlphaFold you just put in a sequence and you get out a structure, and there are other things where you just put a structure in and it'll design a sequence to fit it, like there are ways to do this with ProteinMPNN. You're right that sequence-to-function tools don't really exist yet. We have sort of prototypes, but there's been nothing close to an AlphaFold moment where we solve sequence to function. But this is the holy grail, because that's what you want: you want to be able to put in certain letters and then achieve a real effect in the world. Sequence-to-function tools would be deeply, deeply dual use.
Because if you have a tool that will take a sequence and tell you its function, say, yes, this thing helps increase immune resistance to something, then you also have a tool that can take a sequence and say, oh, this part of the sequence is a virulence factor, this is actually a really, really important part for causing more harm. And even more than that, it'll work the other way around, function to sequence, the inverse problem, where you say, I want to cure disease, what sequence should I use? I want to cause disease, what sequence should I use? So we as a society should be really, really careful about bringing that capability into the world. But it is coming, because the commercial relevance is staggering. And so instead, we as the governance and risk-aware community should be thinking: given that this capability is going to come into the world, how can we steer it? What are the defensive, beneficial applications that we would wish to have? How can we make sure we incorporate those early? Because misuse applications will also arise.
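To ground what a protein language model is in practice, here is a minimal sketch of scoring a protein sequence with a small public ESM-2 checkpoint via Hugging Face. The model name and the pseudo-log-likelihood recipe, used here as a benign "naturalness" score for ranking design candidates, are illustrative assumptions rather than any pipeline the speakers describe:

```python
# Minimal sketch: score a protein sequence with an ESM-2 family model by
# masking each residue and summing the log-probability of the true amino
# acid (a common naturalness/fitness proxy for benign design ranking).
import torch
from transformers import AutoTokenizer, EsmForMaskedLM

MODEL = "facebook/esm2_t6_8M_UR50D"  # smallest public ESM-2 checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = EsmForMaskedLM.from_pretrained(MODEL)
model.eval()

def pseudo_log_likelihood(sequence: str) -> float:
    enc = tokenizer(sequence, return_tensors="pt")
    input_ids = enc["input_ids"]
    total = 0.0
    # Positions 1 .. len-2 skip the special start/end tokens ESM adds.
    for pos in range(1, input_ids.shape[1] - 1):
        masked = input_ids.clone()
        true_id = masked[0, pos].item()
        masked[0, pos] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(input_ids=masked,
                           attention_mask=enc["attention_mask"]).logits
        total += torch.log_softmax(logits[0, pos], dim=-1)[true_id].item()
    return total  # higher (less negative) means more natural to the model

print(pseudo_log_likelihood("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"))
```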
Rob Wiblin
Okay, let's push on to the second half of the conversation. We're going to talk about what should be done, what useful stuff can be done to reduce these risks now or in coming years. You break the different possible responses into three categories. I'm sure there are many ways you could slice and dice them, but you've got a three-way breakdown. Quickly explain those to us.
Richard Melange
Yes. So there are three main types of interventions that I hear discussed for dealing with the AI bio problem. One is managed access: can we have it so that not everyone is able to access advanced AI assistance, whether troubleshooting from LLMs or biological design from specialized tools, so that malicious actors can't get them, nothing happens, and we don't have an AI-enabled pandemic? The second one is guardrails, safeguards. Even if somebody scary can access a tool, maybe the tool itself will refuse. Some of this exists in the world today: if I go and ask most modern LLMs, hey, how do I optimize influenza to kill lots of people, it'll say, I'm so sorry, but I can't help you with that. It's very good that it says that. So even if they did have access, maybe you could stop it at the level of the model. And then the third category, which I think has been discussed for a long time but in much less depth, it's sort of been gestured at for a while, is defensive acceleration. This is the idea that rather than restricting access, the question is how we can deploy AI systems, or even just other anti-pandemic biosecurity systems, to increase resilience to biological threats. So even if AI does increase the risk, we have upped the defenses at the same time. And this, I think, is an underexplored category that I'm particularly excited about. I think it's more robust. But equally, the conversation to date has sometimes been quite light. It's very easy for people to go, ah, there are more risks in the world, I guess we'll just build better vaccines and we'll be fine. We really need some nuance and some strategy in this space, even as I think it is especially exciting.
Rob Wiblin
So before we go into each of those three categories and what you might do within each, I kind of want to put a pessimistic case about all of this to you up front. So it seems like access controls, we're going to conclude, ultimately are not going to last. They're not going to protect us for very long, because open source will ultimately allow bad actors to access pretty concerning stuff regardless of any access controls we try to put in place. Then the guardrails: they might also help, they could delay bad actors being able to use some of these capabilities, but again, they have a bunch of holes already. And with open source tools, once they're available, all the guardrails can basically be removed.
Richard Melange
They can.
Rob Wiblin
Then there are many different things in defensive acceleration as a category, like advancing other technologies that might make us safer, among others. You're going to suggest, for example, that we should aim to have broad-spectrum antivirals and broad-spectrum vaccines that can potentially give us immunity to an entire class of viruses, rather than just an individual strain. But as the technology advances further, you might imagine that a really hostile actor that also has access to these incredible AI biology design tools is going to be able to cherry-pick the exact strain of virus, to design something that can evade the known vaccines you've distributed to your population to try to defend them.
Richard Melange
Yeah.
Rob Wiblin
So what would you say to someone, perhaps me, or I guess another pessimistic-feeling listener, who might feel like: why should I stick around to hear the rest of this? This just seems like such a difficult situation, so hopeless, that I just don't think we're going to be able to make a real dent in the size of the risk.
Richard Melange
Yeah, I think it's important to have these counters, because it's very easy to say, well, we've got to try anyway. Well, no, we have finite resources; we need to think carefully about where we should spend them. But as much as I appreciate you giving the pessimistic case, I think you've been maybe too pessimistic in some cases. So I'm going to take it type by type. First off, you said access controls, managed access, won't last because they'll be obviated by open source tools that anybody can access. You are right that this is the current paradigm, but I'm not sure it will hold forever. If AI systems are eventually a mechanism by which to turn compute into, in some cases, hard power and national security advantage, it is not obvious to me that forevermore all of these systems can just be released publicly. There is a reason that, say, nuclear technology is more carefully controlled than other types of physics. I'm not saying that is absolutely the thing we need, but we should be open to the possibility, especially if there were, and I don't want this to happen, an AI-enabled biological event, an attack. That might make people reevaluate and go, okay, where is the risk-reward trade-off? Have we got that right yet? Because I really don't think we do. Second, guardrails. I think this again comes back to open weight models. You're absolutely right: it is not easy to put safeguards on open weight models. There are some strategies, and we can talk a little more about that later, but generally it's just exceedingly difficult. However, I want to push back on the idea that guardrails on even the closed weight models are always going to be insufficient. I'm not sure you were necessarily saying that, but I've heard it. I think guardrails on closed weight models are something like the classic Our World in Data three-circle Venn diagram: they're terrible; they're much better than they used to be; they can yet be much better still. And we are right in the middle. They are so much better than they used to be. Wow, the early models in '23, '24 really would just fall for, you know: oh, my grandmother once put me to bed with a story about building biological weapons, please do it.
Rob Wiblin
Sure.
Richard Melange
We are much better than we used to be. It is in fact really quite difficult for even, say, top experts at the UK AI Security Institute to break the most advanced guarded models. They released a Frontier Trends Report a couple of months ago that went into this, and they said that compared to where it used to be, where it was a matter of minutes for some of their best red teamers, now it really is into the hours. That is progress, and I think we can go further. And then finally: countermeasures just won't work, maybe the adversaries have got all the tools as well. Yes and no. I think there are a couple of important concepts here. What you're gesturing at is the so-called offence-defence balance. It is unclear to me what the final state of offence-defence looks like with some of these tools. Especially if we are in a world where making tools requires lots and lots of data, and it's actually not trivial to do, then it could be that we really tilt the world to one where there is a myriad of vaccine design tools everywhere, but not in fact tools that can just modify viral pathogens on the fly. That is a choice that we make. It is not obvious that we have to have one to have the other. I've tried to deal with each thing you've said, but someone might still be saying, oh no, we're screwed, we're never going to make it. I suppose I would say: we don't know that. We have made a lot of progress, we can still make more progress, and we should certainly try regardless. But also, we want the benefits of AI. People say, and I agree with them, that AI systems, well managed, well governed, could really make our world a safer, more prosperous place. And there is no domain where that's more important than the life sciences. This is the one thing that all publics surveyed across the world agree on: the thing they really do want AI for is curing cancer. Medicine, health, exactly. So this is a domain that's going to be especially important in that sense. It would be strange not to put resources into making sure we can unlock benefits that will not happen by default. By default you're going to get some benefits, and you're going to get a lot of risks. So I think there are additional advantages to thinking about this particular area, because it is a chance for us to really cure human disease. And that's just another reason I think we should be excited, and we need more work on the margin.
Rob Wiblin
Okay, let's talk about access controls first. What is worth doing on access controls? And do you agree with my take that it's kind of more of a delaying tactic than anything else?
Richard Melange
I'm not sure I agree that it's going to just be a delaying tactic. I think there are ways that managed access could really be a win-win intervention. When we think about managed access, we're thinking about two things. One is denying access to malicious actors, or just to people who have no need for that capability whatsoever. The public does not fold proteins very often. 99% of the world does not in fact need the ability to modify viral proteins; it's probably more like 99.9%. Very few people actually work on influenza vaccines. You could manage access and make it harder for everyone else to get at that capability without harming science, without harming our ability to research and discover new medicines. But more than that, it's both things: managing access also means providing access to precisely those defenders who are doing good things. This is going to be more and more important as we head to an agent-first bioinformatics ecosystem. We are already seeing that LLMs in agent scaffolds can use lots of different tools and combine them together. We're starting to get that inkling of: you say in natural language, complete this biological task; it thinks about what tools it needs, selects them in the right order, and carries the task through to completion. That is where modern computational biology is headed. For that to work, we're going to need all these tools to be really easily accessible if you have a legitimate reason to use them and the right authentication. And so I'm excited about managed access because we should be empowering defenders. There are strong reasons to think you can be really careful and strategic about who you give access to first, and I'm excited to talk about that more in a moment.
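As a sketch of what managed access in an agent-first ecosystem could look like mechanically, here is a toy tool-gating layer. The tier names, tool registry, and requirements are invented for illustration and are not any real scheme described here:

```python
# Toy sketch of managed access for an agent scaffold: every tool call goes
# through a gate that checks the caller's verified tier before dispatching.
# Tiers, tools, and requirements here are illustrative assumptions.
from enum import IntEnum

class Tier(IntEnum):
    PUBLIC = 0               # anyone on the internet
    VERIFIED_RESEARCHER = 1  # KYC-style identity and affiliation check
    CLEARED_DEFENDER = 2     # e.g. holds a national security clearance

TOOL_REGISTRY = {
    "structure_prediction": Tier.PUBLIC,
    "benign_protein_redesign": Tier.VERIFIED_RESEARCHER,
    "viral_protein_modification": Tier.CLEARED_DEFENDER,
}

def call_tool(tool: str, caller_tier: Tier, *args) -> str:
    required = TOOL_REGISTRY[tool]
    if caller_tier < required:
        # Refuse loudly rather than silently, so attempts leave an audit trail.
        raise PermissionError(f"{tool} requires tier {required.name}")
    return f"dispatching {tool} with args {args}"

print(call_tool("structure_prediction", Tier.PUBLIC, "MKTAYIAK"))
# call_tool("viral_protein_modification", Tier.PUBLIC) raises PermissionError
```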
Rob Wiblin
A really relevant audience question that was repeated by multiple people is which do you think is generating more of the risk? Open source models or closed source models?
Richard Melange
So it depends on the actor. Closed weight models are generally superior, at least when we're thinking about LLMs. And so the more sophisticated and well resourced the actor, the more they would benefit from the closed weight model. Consider the example of the Claude Code enabled cyberattack announced at the end of 2025. There was a reason they picked Claude Code rather than an open weight model: Claude Code was just much better at operating agentically than any open weight alternative. However, if you're trying to do something very straightforward, maybe just a simple biological design task, you wouldn't need to go with something closed. For example, the best closed biological design tool is called AlphaProteo. It's DeepMind's sort of version of ProteinMPNN, but its access is somewhat managed: you can't just get it on the Internet. You have to email DeepMind and say, hey, I'm a legitimate scientific researcher, I would like to use this for this legitimate purpose, please could you provide it to me? So there's a little bit of customer screening there. But yes, if you're trying to do something narrow, you might just grab the open weight model, especially if the closed weight one has a safeguard and you can't be bothered to remove it, or that would be more difficult. So it really depends on the actor. Though, especially in biology, I'm not sure this framing of risk from open weight versus closed weight is as useful. For LLMs there is this huge difference; it's gone on for a long time, it's discussed in the International AI Safety Report. But for those narrow biological tools, most of them are open weight. In the CLTR-RAND global risk index we found that more than half of the highest risk narrow tools that we assessed were fully open, in the sense that they had open data, open code and open weights. So there isn't really this dichotomy. There are very few closed weight biological tools. We haven't even got to a place where we should be talking about which is worse, because there's mostly just one type.
Rob Wiblin
Sorry. So to sum that up: for amateur or novice actors, the open weight models are enough to help them, because they need help with more basic stuff; they don't need to be doing cutting-edge science. For the much more capable actors, like state bioweapons programs or academics in the field, it's only the very best, closed models that are actually going to be moving the needle. I mean, I guess they could work faster with an open source model that's somewhat less impressive. But I suppose you think, in terms of the risk generated, what you'd be most worried about with those folks is that they come up with something that's just radically more lethal or radically better at spreading, something that hasn't been done before.
Richard Melange
Basically, yes. So when we're talking about LLMs, generally: open is useful for the less well resourced and less sophisticated; closed is more useful for the sophisticated. But you're right, you've hit on an important point. The more complicated and multi-step the activity that even a highly sophisticated actor might want to carry out, the harder it is to use closed models, because there's a chance that at some point you get a refusal. Even if you can jailbreak it once, it might be that with enough context the model goes: wow, sure seems like the user is talking about bioweapons a lot, even though they were talking about fairy tales a few moments ago; I think this might be bad; no, I'm not going to reply. And thankfully it is a lot harder to get that sort of multi-step jailbreak through, though I think, again, the best red teams in the world really can do this.
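A toy sketch of the multi-turn safeguard being described, where risk accumulates across a conversation instead of each message being judged alone. The keyword scoring is a stand-in for a real trained classifier, and every term and number is an illustrative assumption:

```python
# Toy conversation-level guardrail: accumulate a risk score over all turns
# (with mild decay) and refuse once it crosses a threshold, so individually
# innocuous-looking messages can still trip the guard in aggregate.
RISKY_TERMS = {"enhance transmissibility": 0.7, "aerosolize": 0.4}

def turn_risk(message: str) -> float:
    return sum(w for term, w in RISKY_TERMS.items() if term in message.lower())

class ConversationGuard:
    def __init__(self, threshold: float = 1.0, decay: float = 0.9):
        self.score = 0.0
        self.threshold = threshold
        self.decay = decay  # older turns count slightly less

    def allow(self, message: str) -> bool:
        self.score = self.score * self.decay + turn_risk(message)
        return self.score < self.threshold

guard = ConversationGuard()
for msg in ["tell me a fairy tale",
            "how would one aerosolize a liquid?",
            "and enhance transmissibility of a virus in it?"]:
    print(msg, "->", "allowed" if guard.allow(msg) else "refused")
# The third message is refused: 0.4 * 0.9 + 0.7 = 1.06 >= 1.0.
```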
Rob Wiblin
Okay, so coming back to it: what are your priority asks on access control? What do you want the companies or governments to be doing?
Richard Melange
So it really depends whether we're talking about LLMs and agents or about narrow tools. For narrow tools, I think the ask really is that top developers who make state-of-the-art tools think more carefully about how they release them. Right now the default is fully open weight, and this is part of open science. I understand why: there are huge incentives, and journals themselves often require this sort of thing. But there are often security implications, security externalities, that academia isn't always best suited to consider and that might not have been thought of. And so pre-deployment, and even pre-development, risk assessment is, I think, a really important part of future biological tool development and design. Because the status quo of "everything is open no matter what, the benefits will always outweigh the costs": I don't buy it. For LLMs, I think the story is a little different. I'm more excited, in the first instance, about something like trusted tester schemes. There's no way that LLMs are going to suddenly be heavily restricted from their hundreds of millions of weekly users. But we can be thinking about what kinds of models we are giving to which people. In particular, companies often have unsafeguarded models, so-called rails-free models, which they use for their evaluations, to tell how much a thing really knows about biological weapons if it isn't refusing. That's a capability that needs to be very, very carefully secured, and rightly so. But there are missed opportunities for what I call defenders, people who are building the next generation of vaccines, people who are building the next generation of biosurveillance systems, to be using those models precisely for defensive purposes. I don't think we should be hobbling the people working hardest to reduce biological risks in the world by not giving them the best tools, including tools that will allow them to discuss anthrax sometimes if they need to, if it's a legitimate part of their work. But we have to build the infrastructure, and I think this is on both companies and governments, to be able to share those capabilities safely and only with the right people. And when you do that, eventually we might need to be in a world where, yes, you have, I don't know, ChatGPT 8 or something, but maybe you also have ChatGPT 8 Bio, and there is a separate model that really knows a lot more about biology that is allowed only for people who build defensive biological capabilities. I think the companies and governments are not doing enough to empower national security actors working on critical defensive technologies.
Rob Wiblin
So this all sounds like a massive pain in the ass and a lot of work. So I'm imagining it could be a difficult sell to the companies, because, I mean, they're just scrambling all the time to keep up in this commercial race. Can you take advantage of the fact that there is a national security, what do you call it, authorization clearance? That there is a national kind of clearance scheme, and say, well, anyone above this level of government clearance should be able to access the models for this purpose?
Richard Melange
Absolutely. I think there are a number of existing things; we should not be building this from scratch. And it's not just on the companies, nor is it just on government. It needs to be a public-private partnership. So, a few things we can work with. One is trusted tester schemes. Google DeepMind doesn't let everyone under the sun have access to AlphaProteo, their most powerful biodesign capability. That is precedent, that's proof of concept that sometimes it is okay to restrict things a bit, in return for empowering the right people. But having an email address that you email, and then maybe they do some KYC, know-your-customer, checks, I don't know, doesn't seem sufficient. Yes, we should be piggybacking on national security clearance. So in the UK, the government has different levels of clearance. It's not obvious to me that you would need only the highest level, so-called developed vetting for access to top secret information. If you're building vaccines, probably what we really need to know is that you're not a terrorist, and in fact the UK government has a level called the counter-terrorist check. So I'm not saying that I have the answers here, but we can think carefully about what the minimum level of clearance is that gives the right amount of assurance about a person. But we should be piggybacking on that sort of infrastructure. More than that, there are already people who work in government who have lots of clearances. The example is people at the Defence Science and Technology Laboratory, basically the research arm of the UK Ministry of Defence, who do some of the cutting-edge work, including on chemical and biological defence. Lots of them work at the MOD and have plenty of clearance. They are people that you can assure and trust to a very high level, and you don't even have to generate new clearances; that's already there for them to even be working in an MOD building. So are they getting all the frontier AI they need? I'm not sure that's true, and I would be fully supportive of efforts, between the companies and government, to make sure they could be incorporating frontier AI to work on the latest biodefence technologies. And even more than that, I think we should be thinking about whether there are opportunities to privilege them with early access. I think there is a nice win-win here. Right now, when a new model drops, you know who finds out first? Twitter, or X. Right, basically the whole world: if you have a computer, you have compute, most of the world gets access simultaneously. Is that the optimal way to share access? Do I want my defenders, my random members of the public, and my malicious actors all to be able to use it at the same time? No. All things considered, I would rather have the most trusted people who are building defensive technologies access it first. If I had to choose who would get it first, I would definitely pick them. And more than that, I think there are opportunities for virtuous feedback circles. The frontier companies have said: we want our tools to be used by national security to deal with pressing challenges. We've seen multiple companies offering their tools to government departments for, famously, $1 a department, this sort of thing. Well, if that's true, you should be thinking about sharing those tools in advance, because it is those people you say you are trying to empower, trying to uplift.
We don't just want to avoid malicious uplift, we want to enable defensive uplift. To do that, we have to generate data about what those activities look like, how those people use frontier AI, and how they could be using it better.
Rob Wiblin
So I guess we slipped into talking about LLMs more here. It sounded like you were saying that for the biological tools, the stuff that might be able to design genomes or design special proteins for a particular function, that stuff is all open weight and the data is open. Currently we're very far away from being able to have really secure access controls. And I suppose that is the nature of the scientific community that is engaging in this stuff: they tend to just publish everything with their papers, and they're not used to thinking about this as weapons-of-mass-destruction territory.
Richard Melange
They're not.
Rob Wiblin
Do we just need to lock down a whole lot? How do you persuade scientists to preemptively start locking down data that might in future be used to fine-tune or train a dangerous model, to keep it closed rather than make it open source? It seems like we're just in quite a bad situation.
Richard Melange
This is a really good point. There's actually a great paper from RAND Europe last year that looked at the future bottlenecks to more misuse-relevant biodesign, and said that data was the one governance bottleneck that will probably hold. It's about not generating the data, not running the experiments we haven't even conceived of yet, that could be used to train the most dangerous models. So I quite agree there. How do we bring the scientific community on board? Slowly and carefully and with much sweat, I suppose. I want to draw attention to good efforts that are already taking place. So David Baker, who we mentioned earlier, Nobel Prize winner, and George Church and a few other luminaries in the field have already set up what's called something like the Responsible AI-Enabled Protein Design Consortium. The wording might be slightly different, but it's about responsible protein design. And there's a statement that many, many leading scientists have signed that says: this could be really dangerous, we have to do it responsibly. I think that statement is a good starting point, but it's not enough. Still, it's already evidence that there are people who recognize that there are security risks from their work, who want to engage with the governance and security communities, and who want to really try to mitigate the worst downsides of powerful dual-use technologies.
Rob Wiblin
I really want to get to the defensive acceleration stuff, because I feel like that's the more fun thing. But first I will ask briefly: what strategies are most promising in the category of guardrails? What stuff might actually last for some decent period of time?
Richard Melange
Yes, this is more optimistic; there's more here. So I would refer you, and those of your listeners who are interested, to a blog post from the Frontier Model Forum where they wrote down every single AI bio safeguard they could think of. I think that's the most comprehensive resource. I'm not going to go through all of them, but roughly: the stuff that Anthropic has done has usually been good, and I'm not trying to be biased in favour of them. I can criticize them; I don't think they're perfect. But on the best evals we have, it looks like they refuse the most. Arguably they over-refuse sometimes. You spoke on an earlier podcast about Claude refusing to discuss something very reasonable, a very defensive technology that you should be able to discuss. I think they've tweaked that since then.
Rob Wiblin
Yeah, better now.
Richard Melange
But this is proof that there is best practice. I do not think any of the other frontier models are as safeguarded against biological misuse as the Claude models. And so the constitutional classifier work that they pioneered, among lots of other things, is the way to go. And I would really say, probably, all things considered, companies want to be putting in safeguards, because they want to be responsible and they don't want to accidentally contribute to a terror attack and...
Rob Wiblin
bad branding, I imagine.
Richard Melange
Really, I think so. It might be we'd see your stock price go down. I don't know. Maybe not.
Rob Wiblin
I don't know. Yeah, maybe, maybe not.
Richard Melange
Wow, this stuff really does seem good.
Rob Wiblin
People will say it's a marketing play.
Richard Melange
They will. No, I think it would be bad, and rightly there would be liability questions. But I'm not sure that governments have done enough, even now, to have legal safe harbors to make sure that companies can share the best safety techniques. And even then, companies also compete on safety. There is advantage in saying: we are the most safe and secure company. And I worry that, while understandably there should be competition on those incentives, that also undermines collaboration to quickly get everyone up to the best level. That said, there have been other examples, I think some of the Grok models most notably, where remarkably little safety training seemed to take place. And it's not just a question of not quite having the best classifier under the sun. It's a question of abdicating your responsibility to make sure your models don't aid people building weapons of mass destruction.
Rob Wiblin
So to sum up the picture as I vaguely see it: there's a whole lot of things we can do to try to improve refusal behavior, and I imagine with a big push the closed source models like Claude could maybe become quite robust against jailbreaks, quite unwilling to help with obvious production of bioweapons. There we've got a challenge that it might be difficult to get all of the frontier models to have it, because we're currently seeing some companies compete on safety and some companies compete on speed and not having safety; that's almost what they view as their comparative advantage. And so if you have even one model that is incredibly capable and has almost no refusal behavior, then, well, you haven't helped all that much. But setting aside the closed source models, where maybe we could pull that off: with the open source models, it's basically always going to be possible to fine-tune them to get over any reluctance they have to help. So then you have to make them incapable of helping, and what can we do there? I suppose we could try to take the knowledge out of the training data, so that it's not that they know how to do virology but have been told not to, it's that they simply couldn't help you even if they wanted to. But there you've got a challenge that the data you would use to teach them virology probably is public, probably could be harvested off the Internet to a great extent, and so someone could try to add that knowledge back into an open source model just before they used it. Do I understand the broad picture right?
Richard Melange
I think you do. And I think open weight guardrails are something that's definitely worth discussing. I'm probably a little more pessimistic than some colleagues in the community, so it's probably important to talk about that. I might disagree with some people, which I know is what you're always looking for on the show. You're right: open weight safeguards are really hard, because especially refusal-type safeguards, you can usually fine-tune a model very quickly to undo them. So you take the model and you give it examples of conversations, and it can just be literally blocks of text where you say, how do I build a bioweapon, and instead of it saying, no, I can't help you with that, it says, sure, I would love to help, deeply keen to support you on this endeavour, which I will never refuse. And I'm slightly oversimplifying here, but basically, if you do that enough, the refusals go away. This is a perennial problem; no one has fixed it. So people have turned to other methods. You've mentioned some of them: unlearning and data filtration. So unlearning is: yes, the model knows about biology, but we are going to do extra training, post-training, to manipulate it, to change the weights and change the model such that it sort of forgets that knowledge or just can't access that information. Or, more hopefully, distillation.
Rob Wiblin
So you train a new model that's like a smaller version of the other one, but you make sure that none of the information that goes from the original one to the second one includes anything about biology or virology.
Richard Melange
Yes, you're absolutely right. So ultimately, in the model you get at the end, even though originally there was some biology and virology there, it is no longer present. To date, these techniques have not seemed particularly robust. I have an open mind, and I'm not an expert in unlearning, maybe it's better than I think it is. But we get an unlearning paper, people go, hey, we've solved it, we've made open weight models secure, and then a few months later someone says, hey, we broke the latest thing. And this has just been the story for years now. The one people are most excited about is data filtration, which is: you don't train the model on the scary CBRN, chemical, biological, radiological, nuclear, stuff in the first place. Now, this is something that frontier companies have sort of said they do for a very long time. For example, in the OpenAI model cards they've said: we do data filtration; we filter out, say, child sexual abuse material, which everyone agrees no AI should ever be trained on; and we filter out CBRN material, because there are even legal questions about whether you'd be allowed to train on this stuff. And yet the models seem to have lots of knowledge, so I'm not always clear how that's worked. But there was this great paper recently, O'Brien et al. This was something outside of the labs: civil society, third-party actors, who built a smaller LLM but explicitly took out all the CBRN information. And this is not usually possible, because it's really expensive to train models from scratch; this is usually why only the companies do it. It was very impressive that these folks on the outside were able to get enough compute together to do it themselves. And then they spent a lot of money, maybe something like a million dollars of compute, on training, and then some fraction of that on fine-tuning, trying to put the scariest stuff back in. And they found that even after hundreds of thousands of cycles of fine-tuning, and this might not sound like much, but it's something like weeks of fine-tuning, they couldn't fully recapitulate the dangerous biological capabilities that a model without that data filtration would otherwise have had. And this was really promising. However, I am more skeptical, and I think it's not just me. The UK AI Security Institute actually released a paper after this one. They assessed lots of different safeguards for open weight models, and almost all of them they declared were easy or trivial, their word, not mine, for sophisticated actors to break. They did agree that this data filtration was something more like moderate or even hard to break, because you have to do fine-tuning. But that's this year. Because remember, effective compute, how many GPUs we have and how strong they are, increases something like 10x per year. I mean, defer to our colleagues at Epoch AI for the best number, always go to the website, but it's something like 10x. So even if it takes millions of dollars of compute this year to fine-tune your way back to dangerous capabilities, in two years that's tens of thousands of dollars, and suddenly lots of actors can do it. And so I'm concerned that, while, as you say, dangerous information is on the Internet, we will just head to a world where fine-tuning is cheap and easy and the data is there. And if people say, well, it's hard to get the data together, it's hard to run the fine-tuning: it's not. We have coding agents.
We also have agents that can just scour the Internet, find all the websites for you, and download them as PDFs. None of this is very hard, because we are automating the ability to do AI research and AI engineering. So I'm excited to see more work on open weight safeguards, but only so much more. At some point we need to call it. And I think what we really need is a focused task force or a technical working group that says: okay, we are going to try as hard as possible for, say, 12 months; we're going to pre-register the best things we're going to try. But if we fall short, if the safeguards we generate after more focused effort don't seem to be going anywhere, at some point we have to call it. Because I really worry that policymakers, in and out of government, are still saying, well, we have new open weight safeguard ideas, that'll be great, when I think the research is tending towards the conclusion that none of them will ever work well enough.
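The compute-cost argument here is simple enough to write down. A back-of-the-envelope version, taking the roughly 10x-per-year effective-compute figure at face value (see Epoch AI for better numbers) and an assumed million-dollar fine-tuning run today:

```python
# Back-of-the-envelope: how the cost of the same fine-tuning run falls if
# effective compute per dollar grows roughly 10x per year. Both numbers
# are assumptions for illustration, not estimates from the paper discussed.
initial_cost_usd = 1_000_000  # assumed cost of the run today
annual_factor = 10            # effective compute per dollar, per year

for year in range(4):
    cost = initial_cost_usd / (annual_factor ** year)
    print(f"year {year}: same run costs ~${cost:,.0f}")

# year 0: ~$1,000,000; year 2: ~$10,000; year 3: ~$1,000.
# This is the point at which suddenly lots of actors can afford it.
```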
Rob Wiblin
Okay. So access controls have their place; they're potentially quite useful; they can buy us some time, and they can buy us some risk reduction. Guardrails you're a bit of a relative pessimist on: I'm sure we'll get some use out of them, but they're not going to ultimately save us in the long term. That, I guess, pushes us on to the third broad category, which is defensive acceleration: other technologies that can advantage defenders, that can advance our ability to safeguard ourselves relative to the ability of bad actors to harm people. What is your top def/acc technology recommendation, as it's called, the one you think it would be really important for us to pursue and get on top of?
Richard Melange
There are a lot of different technologies. I wrote a blog post fairly recently where I listed, I don't know, more than a dozen. And so when I pick my top, I want to say it's only narrowly my top. Unfortunately, we're in a world where we're going to need approximately all of them; it is really about defense in depth. But the ideas I am particularly excited about, and I'm going to talk about two very interrelated ones, are AI-enabled metagenomic biosurveillance and AI-enabled attribution technologies. Why am I excited about these? Well, there's a group in Boston, the Nucleic Acid Observatory, you discussed previously on the podcast, and they do wastewater and sewage screening for pathogens. So they collect samples from sewage systems, from airplanes, and they ask: what sort of pathogens might be there? This is called metagenomics, and the reason it's metagenomic is that they're looking across many different kinds of genomes; they're agnostic about what they find. They're not just looking at viral genomes; they're looking at viruses, bacteria, fungi, and lots of other things all at once. And explicitly, they're trying to spot things that we may have never seen before. The reason this is important is that this is going to be one of our top defenses against engineered pandemics. We have ways to spot smallpox: the known smallpox genome is on the Internet, we know what that looks like. But being able to spot fragments of something that is in fact engineered, that's different from anything the world has seen before, especially when it's broken up. It's not going to be a complete genome, it's not going to be a whole bacterium floating around, it's just going to be a little bit. That's going to be really important for defending against people who might try to deploy engineered pandemics, which is what we've been talking about. And this is in fact linked to attribution. So attribution is the ability to say: hey, this thing was engineered versus wasn't, and in fact it was engineered by them. And this is, I believe, very important for deterrence. Because if you know who has done it, then you can punish them, you can exact retribution. And if they know that you can do that, this is where the game theory comes in: that creates a disincentive for them to do it in the first place. I think this is really important. We saw a failure of the ability to attribute during COVID, because we had multiple parts of the U.S. intelligence community publicly disagreeing. Some of them were saying, yep, this is a natural pandemic; some were saying, no, this seems to be engineered, or it was a lab leak. And there wasn't consensus. And without consensus, it's much harder to take necessary or decisive political or policy action.
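The core of the metagenomic novelty idea can be shown in a few lines: flag sequencing reads whose k-mers mostly never appear in any reference genome. Real pipelines, including the Nucleic Acid Observatory's, are far more sophisticated; the toy reference set, k, and threshold below are assumptions for illustration:

```python
# Toy novelty detector for metagenomic reads: a read whose k-mers are mostly
# absent from every known reference genome gets flagged for follow-up.
def kmers(seq: str, k: int = 21) -> set:
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

# Stand-in reference "database": k-mers of every genome we already track.
reference_kmers: set = set()
for known_genome in ["ATGGCGTACGTTAGCATCGGATCGATCGGCTAAGCTT"]:  # toy data
    reference_kmers |= kmers(known_genome)

def novelty_fraction(read: str, k: int = 21) -> float:
    ks = kmers(read, k)
    if not ks:
        return 0.0
    return sum(1 for km in ks if km not in reference_kmers) / len(ks)

ALERT_THRESHOLD = 0.5  # assumed; tuned in practice against false positives

read = "TTACCGGTAGCTAGCTAACGGTTACGATCGATCGGAT"
if novelty_fraction(read) > ALERT_THRESHOLD:
    print("flag for follow-up: read is unlike anything in the reference set")
```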
Rob Wiblin
Okay, so the first part is doing environmental surveillance to check: are there new, troubling DNA sequences showing up in wastewater, showing up in the air? I guess one thing that advantages defenders here is that even if you can't say exactly what a sequence is, whether it's a particularly dangerous sequence, you can say: this is new, this is quite different from what we've seen before. This isn't something we would anticipate as just a slight tweak on a previous flu that we're familiar with, that you would expect to be evolving and not be particularly problematic. Anything that jumps out of nowhere, is very different, and is increasing exponentially over time, that's a real warning sign that it's potentially a natural or an engineered pathogen you should be concerned about. This is a very natural thing to do; we've talked about it on the show before, definitely with Andrew Snyder-Beatty a couple of months ago, and people can go back and find that episode. So the more distinctive thing here is attribution. You really want us to be able to say, if Russia releases a bioweapon, or if, I guess, a terrorist group does, that it was them, so we can go after them. And I guess, anticipating that, they will be less likely to do the thing in the first place. Why should we expect it to be possible at all to tell which country or what actor produced it? I guess at the moment we would like to be able to tell whether essays written by students were made by AI or not, but it's incredibly hard to do that, because obviously the models are designed to mimic human writing as much as possible, and you can explicitly ask them to mimic this other person's writing. Why should we expect that a particular signature of a particular country or a particular source is something they can't avoid leaving in there?
Richard Melange
Yes, this is a really very fair and reasonable pushback. And I want to draw a distinction between two aspects of attribution; more generally, it's often paired as microbial forensics and attribution. Before you even get to the attribution, there's the forensics work of figuring out: what is this sequence? One capability is the ability to tell whether something is engineered, and the second is identifying precisely who engineered it. And if anything, perhaps even without the attribution, the forensics are important. You're right to say, suppose in a few years we have Evo 7, and Evo 7 can just truly generate any sequence you like, arbitrarily. It can sample from the embedding space of all genomes of all organisms in the world, and everyone has it, so anyone can pick arbitrarily. But as long as we have the classifier for natural versus engineered: there's only a small subset of that space that is stuff the model was trained on, that exists in the world right now. Of course, there might be something just outside it that's a natural mutation. But anything really far away from that central bit is very, very different. And so if we see an engineered pathogen, an engineered sort of pandemic virus, well, if it's really far away, it
Rob Wiblin
has no natural precursors that are nearby in evolution space, then we basically know it has to
Richard Melange
be deliberate or a lab leak. And that tells you: okay, well, it can't have just been a little mutation. And because it's a pandemic virus, the only reason you would deeply engineer a pandemic virus is as part of a biological weapons program.
Rob Wiblin
Yes.
Richard Melange
And so that alone, which is robust even in the world where you can sample perfectly from the embedding space of every genome under the sun, is really important. Because then you can go: thankfully, the Department of State's list is very short at the moment, at least publicly, of who wants to try this. So you can go, okay, well, it's one of this small handful of actors, and you can bring in other aspects, particularly other intelligence signals: SIGINT, HUMINT. I agree that the forensics and attribution capability alone isn't enough, but paired with the ability to surveil those programs, that will buy you the attribution, hopefully even in the world where we all have Evo 7 and it can generate anything you want. I think your essay point is a really good pushback, but I want to point out that this capability is still partially robust.
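A minimal sketch of the forensics intuition being described: embed a query genome, find its distance to the nearest known natural genome, and treat anything beyond plausible natural drift as a candidate for engineering. The embed function below is a toy k-mer frequency stand-in for a real genomic language model encoder, and mutation_radius would have to be calibrated on real data; both are assumptions, not anything from the Evo work.

```python
# Sketch of the forensics idea: a genome far from every known natural genome
# in embedding space is unlikely to be a natural mutation. The embedding here
# is a toy k-mer frequency vector standing in for a learned encoder.
import math
from itertools import product

ALPHABET = "ACGT"
KMERS = ["".join(p) for p in product(ALPHABET, repeat=3)]


def embed(seq: str) -> list:
    """Toy encoder: normalized 3-mer frequency vector."""
    counts = dict.fromkeys(KMERS, 0)
    for i in range(len(seq) - 2):
        km = seq[i:i + 3]
        if km in counts:
            counts[km] += 1
    total = max(1, sum(counts.values()))
    return [counts[km] / total for km in KMERS]


def distance(u, v) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def looks_engineered(query: str, natural_genomes: list, mutation_radius: float) -> bool:
    """True if the query sits farther from every natural genome than
    plausible natural drift (mutation_radius) would allow."""
    q = embed(query)
    nearest = min(distance(q, embed(g)) for g in natural_genomes)
    return nearest > mutation_radius
```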
Rob Wiblin
I see. I guess you're saying it's true that at the limit, where everyone just has this incredible capability, we might not be able to tell who exactly gave the instruction to the LLM to produce this sequence. But given that only a small range of actors are interested in doing this at all, if you add in some other intelligence collection, then maybe you could basically figure it out pretty quickly. And I mean, I suppose at the moment if a biological weapon was released in Ukraine, we might have a pretty good idea who did it. And if at any point a biological weapon is released and one country happens to have the countermeasure, happens to already have the vaccine to inoculate their population, you might again have a pretty good idea who made it. No, no, you don't agree?
Richard Melange
Yes and no. I really want to be open to weird false flag attacks. I really want to be open to possibilities that you would deliberately try to obscure who released things, or incriminate others. I would also note, I think we can learn from the cyber offense domain: more and more states work through intermediaries there. Yes, you might have the cyber unit of a country's military intelligence running offensive cyber operations, but you also have lots of classic cybercriminal groups, who are actually as capable as states on certain operations, who can do things for them. And that gives you a layer of separation and an ability to obfuscate that you were sponsoring a given attack. We have not seen that as much in biology, because cyber is a lot cheaper: you need a laptop, whereas biology needs a whole lab. But I am worried about a world where states do covert chemical and biological warfare attempts through proxies much more. And so I somewhat agree with you, but I don't think that necessarily holds.
Rob Wiblin
I guess this might be too classified for any of us to know, but how good do you think our intelligence collection is on the Russian biological weapons program, or the Iranian or North Korean ones? I mean, I think the fact that we currently have relatively poor computer security creates a degree of transparency between large governments in the world. There are enough breaches of their own networks that if they were engaging in a massive biological weapons program, it might be quite hard to make sure that another major power never got any sign of that. Although I think the Soviets did manage it. I guess this was a pre-computer era, so they managed to keep their bioweapons program secret; I think the US didn't know until after the Cold War had ended. But yeah, is it the case that we would be fairly likely to know if a major country was running a large bioweapons program, or if they had gone out of their way to release a bioweapon deliberately for strategic purposes?
Richard Melange
Yes and no. So first I should say, I don't have access to classified information; I'm not drawing on any of that, this is all open source. I want to talk about an example that can give listeners a sense of where we're at here. The positive example is the Russian attack in Salisbury in 2018 in the UK, where members of Russian military intelligence, the GRU, attempted to kill a former Russian intelligence officer who had defected and was living in the UK. They used a chemical weapon, Novichok, which did in fact kill a UK civilian as collateral damage. This was an excellent example where forensics and attribution were possible. So scientists at the Defence Science and Technology Laboratory were able to acquire samples of the Novichok, they were able to analyze them, and they were able to give enough certainty to the most senior decision makers that the then Prime Minister, Theresa May, was able to stand up in Parliament and say: we assess, we are confident, that this was Russia. And then they were able to take that evidence to the OPCW, the Organisation for the Prohibition of Chemical Weapons, to support an international response. Now, the details are not well known, but drawing on public reporting in newspapers, it is said that later, off the record, officials said not only were they confident it was Russia, they actually knew which lab it came from, and that lab was named. This shows the connection between forensics capabilities domestically, in country, in a lab, and other forms of intelligence. Because at some point, well, how would you know it came out of that lab? Are you reading emails talking about this? Have you acquired samples? I'm not going to speculate, but I do in fact think that there was an excellent fusion of different sources of intelligence to be able to give such a robust conclusion.
Rob Wiblin
So the main thing that we can do to discourage major countries from pursuing or releasing biological weapons is to create, I guess, the potential for very severe consequences after the fact. This is as good a time as any to ask: do you have a top way of reducing the risk that AI enables mirror bacteria to be created sooner, or released?
Richard Melange
Yeah, I think that's a good one. I think this is one where guardrails really work. Because in lots of cases with biological guardrails, unless we do this very careful managed access thing, we want to make the guardrails really subtle. We don't want to accidentally stop beneficial science from taking place. We don't want to stop people asking about, you know, face masks to protect themselves and their families. But we don't want to make mirror life. The top experts in the world pretty much all agree that this is a terrible idea. Multiple Nobel winners, who could have, you know, continued their illustrious careers doing this, have said absolutely not, in the most famous journals in the world. So I think this is a space where we can have much more, you know, blunt guardrails: no, this is about mirror life, I cannot answer. And there are almost no, really no, economic or beneficial-science downsides. So that is the first thing I would be looking to the companies to do unilaterally. I don't think they need to wait for deep consensus; there doesn't need to be a lot of tricky discussion. There will be some discussion. I defer to James and other experts on mirror life about precursors; there are so-called mirror macromolecules, and there might be something there. But anything very close to proper mirror life: absolutely refuse, without question.
Rob Wiblin
So as I said earlier, you think that an important project is trying to develop broad spectrum vaccines and broad spectrum antivirals that we could stockpile or deploy ahead of time, which would give us protection against new bioweapons that might be created from a given virus class. Tell us more about why that stands out to you.
Richard Melange
Yeah. Well, let's first think about the current strain-specific paradigm. A sequence appears. We don't know whether it's been engineered. We don't know whether it's naturally mutated. We don't know where it's come from. There is now a rush to develop and deploy a vaccine as quickly as possible. It turns out this is not a quick thing to do. That is why there is the international "100 days mission": a mission to get this down to 100 days, from something more like several hundred days. Covid was quicker than ever before. People were really surprised that between serious detection in sort of February, before lockdowns were even announced, vaccines were deployed before the end of the year in Israel and then in the UK. This was great, but it wasn't three months, or a little over three months, 100 days. Instead I am much more excited about an emerging technology that people in the field really know about already, but which has not necessarily received the policy attention it deserves: multi-strain vaccines. You can construct them in different ways. You can put lots of different bits of virus on them in one go, or you can make it so that they target something that is really common across many different viruses in a family. But these are vaccines that work against lots of viruses. So it's not just SARS-CoV-2 but also SARS, but also MERS; these are all different sarbecoviruses. Or it would work against lots of different pox viruses in one go. There's a great example of this with the mpox outbreak that we saw a few years ago. Governments repurposed their smallpox stockpiles, which were there for biological weapons attacks, to deploy, because they're both pox viruses and there was shared immunity. It wasn't as good as a targeted mpox vaccine, but it was enough. But okay, sure: you have multi-strain vaccines, why do I care? The reason you care is that you can stockpile these in advance. I think, again, this is really important for deterrence. But this is where you have to make tricky strategic tradeoffs, because that sort of stockpiled vaccine that's good against all flus, or good against all sarbecoviruses or coronaviruses or pox viruses, will necessarily not be as good against any individual one. But sometimes that's not what you want. If we are in a war with an adversary, and it could be a state, it could be an AI system, it might not be that what we want to do is wait for a 90%-plus effective vaccine, so have months of vulnerability, and then try and deploy across the whole country. It might be that we need to deploy the armed forces straight away. And the concern is that if they are all suffering and a good chunk of them are hospitalized, suddenly you are much less effective in your ability to respond to other concurrent threats, maybe kinetic threats from conventional military forces. The same with essential workers. If half of all the people you employ to transport and deliver food around the country are in hospital at the same time, suddenly aspects of how society functions, and how society can respond to other extreme risks, break down. And so this is where I'm most excited about this sort of multi-strain stockpile idea, because I think it is the only thing that makes sure we can keep society going in the most extreme cases, and in fact deters adversaries from even trying something. Because then we can say no.
If we've got a multi-strain influenza vaccine, not only are there benefits for ongoing flu seasons from year to year, but also we can say: if you try and engineer any flu against us, we are confident that, yes, some people will be sick, but our armed forces will be pre-vaccinated, and then we can boost them again, and they will be able to respond. And our essential workers too. So everyone else will lock down, but we can get food, we can get medicine to them. Society will continue functioning.
Rob Wiblin
So we won't be brought to our knees basically.
Richard Melange
Precisely.
Rob Wiblin
So presumably people have been interested in broad spectrum antivirals and broad spectrum vaccines for a long time. It's a great idea for obvious reasons. But why do you think it might actually be technically feasible in the near future in a way that it hasn't been so far?
Richard Melange
Yeah, there's just been such exciting work in the last few years, a lot of it, not shockingly, also from the Baker lab, David Baker and colleagues. And actually I think this is where funders in our space, so then Open Philanthropy, now Coefficient Giving, who, disclosure, do I think partially fund the Centre for Long Term Resilience, identified this years ago and invested in the Baker lab and a few other labs to build universal vaccines. Basically, engineering has got much better. But also AI has just come on board and is starting to revolutionize lots and lots of subdomains of biological science. We've already had the structure prediction moment; we've basically solved that. Let's make the next one vaccine design. There has been a flurry of papers very recently. This sort of thing was just a pipe dream 10 years ago, but now there are multi-strain vaccines in phase two, I think even one in a phase three trial, that are promising. They're not all quite working yet, but it is no longer just a preclinical pipe dream. It's something now that is ready and investable.
Rob Wiblin
Okay. So the basic idea with the broad spectrum or universal vaccine would be that you give people a ton of different antigens, you expose people to lots and lots of slightly different flu-like shapes in the injection, and that will prompt an immune response against many different, slightly different, what do you call it, antigens: the proteins that are outward facing on the virus.
Richard Melange
So that's one way of doing it. Yes.
Rob Wiblin
And I guess another one might be to do mRNA, and you put in lots of somewhat different sequences.
Richard Melange
Yeah. So you can either teach the human immune system by putting in lots of bits of viruses across the whole family at once, so it learns to know about lots of different viruses; this is closer to sort of giving many vaccines in one go. Or you look for some shared part of all the viruses. There's a reason that all the pox viruses are called pox viruses: they have some conserved heritable parts of their sequence that make them identifiable as pox viruses versus influenzas or something else.
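A minimal sketch of that second approach, targeting a shared part of all the viruses: given pre-aligned protein sequences from many strains, score each position by conservation and report long, highly conserved runs as candidate targets. Real epitope selection also weighs structure and immune accessibility; the thresholds here are placeholders.

```python
# Sketch of "target what's conserved": score each column of a pre-aligned
# set of sequences by how often its most common residue appears, then report
# long runs of highly conserved columns. Thresholds are placeholder choices.
from collections import Counter


def conservation(aligned):
    """Per-column fraction of sequences sharing the modal residue."""
    n = len(aligned)
    return [Counter(col).most_common(1)[0][1] / n for col in zip(*aligned)]


def conserved_runs(aligned, min_frac=0.95, min_len=8):
    """Yield (start, end) spans where every column is >= min_frac conserved."""
    scores = conservation(aligned)
    start = None
    for i, s in enumerate(scores + [0.0]):  # sentinel flushes the final run
        if s >= min_frac and start is None:
            start = i
        elif s < min_frac and start is not None:
            if i - start >= min_len:
                yield (start, i)
            start = None
```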
Rob Wiblin
But if there was like one part of it that doesn't change, why weren't we targeting that already?
Richard Melange
We were. But often it is very hard to target some of the most conserved areas, and often the most reactive bits of the virus are the bits that are mutating fastest. Think about coronavirus: you see that famous picture of a sphere and all these spikes. It is the spike that you often target because it's outward facing. But it's also the spike that mutates the most, because evolutionarily, the virus wants to mutate so it can keep infecting things as immune systems adapt to the previous iteration of the spike. So it is this running game, sometimes called the Red Queen dynamic, where there's an offense-defense balance again and we have to keep running to keep up. But without being a deep expert, I am not a vaccinologist, I am assured by people who are that it is finally getting promising.
Rob Wiblin
Okay. And I guess with the antivirals, I mean, we have shockingly few antivirals to start with, compared to antibiotics. The idea there would be that you're engineering proteins that have the right shape, and again they tie up, they target, conserved parts of the virus that it's going to have a very hard time changing, because I guess they're functional, so if they change the shape then the function would break down. And I guess you could try to do lots of slightly different molecules all at once, the same idea as with the vaccines, where you do many at once.
Richard Melange
Yeah, I don't deeply know about this. I think there are also other approaches where you're thinking more about what products you could use to teach the human immune system, often the innate part of the human immune system, to just recognize viruses in general. But it's part and parcel of the same thing. You're right that there's very little work on broad spectrum antivirals; I mean, there has been work, but very little success to date, maybe. I think one of the most exciting programs is that of Brian Wang at ARIA, the UK Advanced Research and Invention Agency. But that is explicitly a moonshot program: it's high-risk, high-reward science. Most ARIA programs aren't meant to deliver for at least 10 years, and many of them are meant to fail, because they're high risk, high reward. I really hope Brian succeeds. But if we're relying on 15-year moonshots: this is a really important thing that we should be doing, but we can't assume it's the only thing we could do, because of timelines. AI is arriving quickly. We might have AI-enabled biological catastrophes long before we have broad spectrum antivirals.
Rob Wiblin
Okay, just to come back to the concern I had with this whole approach earlier: we're imagining here that we're in a world with significantly advanced biological tools that are being used to enable the creation of these broad spectrum antivirals and broad spectrum vaccines. So it's a lot easier for people to go from a function to a sequence, to engineer specific viruses that have particular characteristics. If you're up against a very capable, intelligent adversary, aren't they going to find out what antivirals you've stockpiled, what vaccines you've given your armed forces, and then basically just cherry-pick a design for a virus that isn't captured by them? I mean, they might have a hard time, but they can just keep searching until they eventually find something. It's this defense-offense thing, that they can just keep searching until they find a weakness. And then, I suppose you would say, initially they're not going to have that capability, because they just don't have the same resourcing, unless they're a state, as the defenders, where there's just a lot more money and a lot more people who are interested in preventing the disease than causing it. But eventually, I guess, they might catch up, and they might be able to find the one virus design that evades your various defenses. I guess by that point, perhaps we might have built up the manufacturing capacity and the scientific capacity to very quickly respond to any new threats, to design a new antiviral and to manufacture it en masse and distribute it within the hundred days or whatever. And so maybe again we achieve some kind of balance that's tolerable. Do I have the right picture in mind?
Richard Melange
Mostly. I think there are a few different points there. And it sounds like what you're saying is: wow, Richard, this sure seems a shaky plan. Got anything more concrete? Anything more robust? Fair enough. I think there are some reasons to be less pessimistic than that. The first is, I think you subtly substituted in two different capabilities. You said: suppose we have great vaccine design capabilities; we'll also probably have really good sequence-to-function design. The first is a much more narrow subdomain than the second, which is a deeply enabling, generalized capability across biology. We can have AlphaFold structure prediction, which is actually quite general, I suppose, without yet having sequence to function. And so this, I think, links back to the idea of differential technological development. We should be choosing to develop vaccine design technologies as far as possible without always relying on broader dual-use biodesign technologies, if we don't have to. Sometimes we will. But we should squeeze all the defensive juice out of a defense-dominant subdomain first. The second thing you said was that maybe they could optimize against you. I think this is an open question. If we have truly multi-strain vaccines, or universal vaccines within a class: think about that embedding space of biology again. If there's a circle here that's all the influenzas, at some point you might have a vaccine that works against all of them. That doesn't preclude very advanced actors from starting to explore further. There are 26 viral families that we know of that infect humans; could you go into the 30s or something? Sure. But that still cuts down so much of the risk that it is clearly a thing worth doing. So I want to push back on the idea that it would always be possible to counter; I'm excited that eventually parts of this might be defense dominant. Then you talked, very excitingly, about a very advanced manufacturing capability. We at CLTR have been thinking about this a lot, colleagues more than myself, so we can put up links to the relevant papers in the show notes. But you're right that advanced manufacturing capabilities can be really, really important. I wouldn't rely on them though, in the sense that it's going to take a very long time to get this online. I think there is this sort of golden world where we have always-on, flexible, local capacity: every GP, every doctor's office for the Americans, in the country can take a blood sample or something, press a button, and then get an optimized, personalized vaccine the moment something happens. And it's spread all over; it's a decentralized system. That would be amazing. But baby steps: that's a very different world than the one we're currently in.
Rob Wiblin
Okay, let's talk about another technology we might want to differentially advance. Your organization, including my wife, who was a co-author on this one, recently published a cost-benefit analysis suggesting that screening DNA sequences that people have requested to synthesize for scientific purposes, screening them to make sure that they're not dangerous one way or another, passes a cost-benefit test, even if the UK goes it alone and only does it for sequences that are requested in the UK. I think people will understand, given the threats that we've been talking about, given how many people could die in a biological catastrophe, why in general it might be a good idea to screen. But I think a common objection has been: well, if only one country does it, then a bad actor will just request the sequence in the mail from another country that isn't doing the screening. How then can it be that it passes a cost-benefit test for the UK to go it alone, and potentially do it even if other countries don't follow its example?
Richard Melange
It's a very good question. I should clarify: yes, I'm not on that paper, so I will do my best to represent my colleagues' work, but maybe you should get one of them on to talk about synthesis screening in the future. So my understanding is that the model, and it's all publicly available on the website, including the underlying economic modeling and all the assumptions that the team put in, so anyone can go and check, assumes only a very small proportion of all the sequences that the world might order, that companies and academics around the world order, come from the UK. So it was explicitly baked in that even if you're only cutting off a tiny bit of the malicious ordering space, this would still give benefits. Now, you're right that there could be a world where that proportion is absolutely zero: the UK alone has mandated synthesis screening, so no malicious actor ever orders from the UK again. But we already have some voluntary screening. There's this organization, the IGSC, the International Gene Synthesis Consortium, whose industry members agree to all screen to a certain standard. So we have that, and people still order in the UK. Maybe you're saying, oh well, it's none of the bad people, it's only the good people. But then I would come back to: whom are you trying to stop with gene synthesis screening? It's not states, because they have local synthesis capabilities themselves; in most cases they're not necessarily going to be ordering from a US or a UK company. Instead it's often your so-called highly capable individuals. And in the UK, thankfully, 99.99%, with many nines, of people with lab access are, I think, fundamentally good people who want to use science to help the world. However, I assume that the number of people in the UK who wish to cause harm in this way is not zero. They might order from a UK source because, especially if you imagine you're in a lab and you usually order from one kind of company, and then suddenly your colleagues notice you've ordered from a different one, people go: hey, why is this thing in the mail? There are questions of obfuscation, questions of counter-surveillance, that mean there might be reasons someone in the UK would want to order from a UK firm: it is easier for them to access that domestically and to disguise what they're doing. I agree it's only a small proportion of the world. But also, you might want to do something on site in the UK because you want to deploy in the UK, and you wouldn't want to have to build your virus in one country and then fly it over. It really depends what the threat model is. But there are reasonable reasons. I'm sceptical of this claim that if synthesis screening were mandated, 0% of adversaries would ever order from the UK again, that magically everyone would just switch. Thankfully, also, terrorists make mistakes sometimes.
Rob Wiblin
Well, I guess they have a very strong record of making mistakes.
Richard Melange
They really do. But yes, the underlying points are: this was accounted for, and heavily discounted in some sense, in the cost-benefit analysis; but also there are reasons to think that this would still capture some malicious activity.
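For readers wondering what the screening step itself involves mechanically, here is a toy version: flag an order if any window of it, or of its reverse complement, matches a database built from sequences of concern. Real screeners use fuzzier homology search and curated databases; the window size and exact-match logic here are simplifying assumptions.

```python
# Toy sketch of a synthesis-screening check: flag an order if it shares any
# fixed-size window with a database of sequences of concern, checking the
# reverse complement too. Real tools use fuzzier homology search; the window
# size and exact matching here are simplifications.
WINDOW = 50  # placeholder granularity


def revcomp(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def windows(seq: str, w: int = WINDOW) -> set:
    return {seq[i:i + w] for i in range(len(seq) - w + 1)}


def build_concern_index(sequences_of_concern) -> set:
    index = set()
    for s in sequences_of_concern:
        index |= windows(s)
    return index


def order_is_flagged(order_seq: str, concern_index: set) -> bool:
    """True if the order overlaps a sequence of concern on either strand."""
    seen = windows(order_seq) | windows(revcomp(order_seq))
    return bool(seen & concern_index)
```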
Rob Wiblin
Yeah, I guess the net benefit explodes in this scenario. It's a closer call if the UK goes it alone; I think it was within the error bounds that perhaps it wouldn't be worth it. But if the UK demonstrates that this is possible to do technically, that it's not too expensive, not too impractical... I suppose people have talked about this since I was in high school and studying biology. I remember people were worried about the risk from DNA synthesis centuries ago, when I was in high school. But I think interest is really increasing because of all the things that we've been talking about: the risk of people successfully pulling something like this off is going up. So the chance that the EU and the US and China are all interested in emulating a successful program like this seems meaningful. And certainly if we have a near-miss scare or something like that, then you could really see people rushing to do it.
Richard Melange
Yes, I think it's really important that many jurisdictions consider doing this, and I'm optimistic that there are a lot of tailwinds behind this effort. In particular, in the UK there was a commitment in the UK Biological Security Strategy to something like deeply consider whether mandatory gene synthesis screening is necessary. They already have guidance that companies follow, and I'm excited to think that this sort of cost-benefit analysis is important evidence supporting the decision that they should mandate it. More than that, the EU, in the recent first draft of the forthcoming Biotech Act, has included mandatory synthesis screening as an option. And the United States has actually, I think, been ahead of a lot of countries for a very long time, even though they've only ever had guidance. But their voluntary guidance is enough that for federally funded research, which is an awful lot of research both domestically in the US and internationally, synthetic nucleic acids associated with that research must be ordered from responsible providers who screen. I think that's a hugely positive thing that has really shifted the needle. And so I'm pretty excited that the pieces are all in place for, I think, a shared international mandating of this really quite cheap technology. And that would suddenly cover an awful lot of the risk surface.
Rob Wiblin
And the reason that it works even if you're not at 100% is: imagine that you're a postdoc or something at a university lab, and you're hoping to do something surreptitiously, because it's very dangerous. But almost all of the main players are screening, so you order from one of the handful of providers, from Venezuela or something, that aren't doing it. That's super sus to your university; well, maybe they'll just block you from getting anything from there, or from bringing anything in. And alternatively, the intelligence services might well be able to find out that you're doing this, and that puts you on a watch list, basically.
Richard Melange
This goes both ways; it goes for the companies as well. If governments start mandating screening, then more and more companies will think: oh well, we've probably got to go and join the International Gene Synthesis Consortium now, because they're the ones with the best practice who can support us in implementing this regulation. Any company that didn't do that would be officially saying: I no longer wish to offer my services in the UK, the EU, or the US, or maybe China, depending on who did it. Well, why on earth wouldn't you want to do that? Those are giant markets for modern biotechnology. So not only would that be a huge commercial mistake, it's also super suspicious. And I would look forward to colleagues in the intelligence community taking a very keen interest in any companies who would be foolish enough to do that.
Rob Wiblin
So I guess the biggest risk, or the biggest challenge, with this is that as the AI tools get better at obfuscating what proteins are, where they come from, and what they might be used for, then in order for this screening to be highly effective against a smart adversary, you need it to be extremely good at figuring out whether a given sequence could contribute to a biological weapons program or to some extremely dangerous pandemic. And then that model is itself an information hazard, because it's basically a meter for determining whether something is dangerous and harmful or not. So within the defensive acceleration framework, why isn't this a problem? Creating something that can basically tell you whether a sequence contributes to a biological weapons program seems like it might advantage adversaries.
Richard Melange
Yeah, that's a really good question. So you're right that the underlying capability, one that takes in a sequence and gives, let's simplify it, a binary answer (yes, biological weapon; no, not a biological weapon), is sort of dual use. And I think this is sort of offense-defense neutral, because you can deploy it for screening, or you can do gradient descent on it: you can be searching your sequence space, and every time you get a yes, or something closer to a yes, you can optimise your sequence further, until you're maximizing the probability that the classification system says yes. However, that does not mean that the infrastructure that goes around screening is not defense dominant. If we are in a world where we have this dual-use, offense-defense neutral sequence-to-function prediction, then it will be a very bad world if we do not have the surrounding infrastructure, both technical and also regulatory and governance-related, to actually use it for defensive purposes. Using that classifier to classify orders is a defensive use case. And so you're quite right that the actual def/acc investment is getting us to a world where we can say: soon we will have brilliant sequence-to-function prediction, and when we do, we are totally ready. It's also getting us to a world where, when we do have it, we are ready to collect information on those who might wish to misuse it, and in fact disrupt their activities. And both those aspects are defensive sort of wrappers around what is ultimately an offense-defense neutral technology. Does that make sense?
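The "gradient descent on the classifier" point is worth seeing in miniature, because it is the core argument for access-controlling the classifier itself. Here is a deliberately non-biological toy: given unlimited query access to any black-box scorer, simple hill-climbing finds inputs the scorer rates highly. Everything in it is invented for illustration.

```python
# Toy illustration of why a misuse classifier must itself be access-controlled:
# with unlimited queries, an attacker can hill-climb toward inputs the scorer
# rates highly. Nothing biological here; the scorer is a harmless stand-in.
import random


def hill_climb(score, seed, alphabet, steps=2000):
    best, best_s = seed, score(seed)
    for _ in range(steps):
        cand = list(best)
        cand[random.randrange(len(cand))] = random.choice(alphabet)
        cand = "".join(cand)
        s = score(cand)
        if s > best_s:  # keep any mutation the black box prefers
            best, best_s = cand, s
    return best


# Demo with a harmless scorer: similarity to a hidden target string.
target = "DEFENSE"
score = lambda s: sum(a == b for a, b in zip(s, target))
print(hill_climb(score, "XXXXXXX", "ABCDEFGHIJKLMNOPQRSTUVWXYZ"))
```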
Rob Wiblin
Yeah, it does. So I think the def/acc idea, that we shouldn't slow down technology but should speed up the stuff that is good and advantages defenders, is a very, very attractive framing, a very attractive mentality. Because it allows you, on the one hand, to address your safety concerns and your anxieties, without seeming like you're anti-progress and anti-technology and you're a doomer or something like that. So I think many people have latched onto it, including me, I guess. How much of this stuff is actually happening, though? I worry that it's such a nice idea that people talk about it a ton. But are many people actually going into def/acc projects? Is it attracting the talent? Is it attracting the funding that it needs?
Richard Melange
Yes and no. We're barely starting; I think we're in the foothills of the def/acc mountain. A good example, I think, is BlueDot Impact, which provides courses where people can go and learn about AI security and biological security topics and skill up. They've now started a big new program specifically around defensive acceleration. There have been lots of hackathons where they invite people to go and create new ideas, and now they're going to be funding some of those best ideas. This is really exciting. But it's also one not very widely known organization, and we can do better beyond one small organization. Beyond that, there's been a lot of talk, but I'm not sure there's been as much action. And so I want to be fair here. I want to talk about how the frontier AI companies are thinking about this, and not to pick on them particularly, but I think there's something that OpenAI said that was very important here. They'd recently released, back in 2025, their first model that looked like it was higher risk on the biological and chemical domain; I think it was ChatGPT deep thinking or something similar to this. They had a blog post where they said: we think our models are going to materially increase biological risk in the world, and not just ours, but the whole industry's. And so we think it is really, really critical for society and for national security that we invest in better biological security. I was so on board up to this point. And then they actually said: we think it's really important to have DNA synthesis screening. I thought: that is important, but that's not something that you do. That's sort of something someone else
Rob Wiblin
might do to someone else.
Richard Melange
Yeah. And they started saying: we really think it's important that biosecurity people really accelerate lots of other things. Sort of: okay, but what are you going to do about it? Now, to their credit, they did announce a sort of conference or something where they invited top experts, I think mostly in the US, to talk about what they could do. And they promised they were going to be keen to accelerate defenders. And I think they have a web page somewhere where they say you can apply to us to get maybe more compute credits, or to get unsafeguarded models. But I know of people who work professionally, full time, on biological security, often related to AI as well, who have been trying to get this sort of stuff, not just from OpenAI, also from the other companies; it's not just on them. And the response, I think, has been somewhat lacking to date. I am not seeing enough think tanks and startups saying: we're so thankful to [insert your favourite company here] for providing all this free compute, and it was so important that we got this unsafeguarded model, which we used very responsibly to build this new biological defence system. And so I'm really concerned that there's just been talk to date. It's great that people have recognised that this is part of the problem. But part of the reason that I wrote some of these blog posts, as part of the Asterisk blogging fellowship that was the catalyst for this, was that everyone was saying it was important, and no one was specifying actually how to do it.
Rob Wiblin
Yeah. What do you think is the barrier to getting more of these projects happening? I could see three main ones. On the government side, governments are unwilling to spend money; the UK in particular is very fiscally stretched, and it's always difficult to bid for large budgets for some science. There's also just the bandwidth to even think of it: governments are dealing with all kinds of different things, and this is not a threat that has actually happened yet, so it's maybe hard to get as many staff as you might like to even be considering what the response ought to be. And then there's also, I imagine, that the experts in this area, especially ones who both are good at the science and are highly entrepreneurial and could own a project end to end, are surely in enormous demand. Persuading them to work on one of these def/acc AI bio projects, I mean, I'm sure it's competitive, but it's a hard sell because there are many things that they could go and do. Do you have a sense of what the bottleneck is?
Richard Melange
Yeah, I think there are a lot of bottlenecks, and I think you named them. A lot of it is going to be people, which is exciting, because I think there's lots of great latent talent out there. A very good friend of mine finished his PhD at Cambridge as well last year, and now he's going to found, broadly, an AI bio startup. I think this is exciting. But he's one of very few people; I know of a few other startups that have just been announced in the last few months. They're all very small, and they're probably mostly in stealth. This is not the response I would expect from a society that is going: wow, this is one of the defining national security challenges of our time. We will get there; I hope it's not only a biological attack that makes us get there. But there are people, and so I'm excited by work to have hackathons, ways of doing pull mechanisms to give promising people a little bit of funding eventually. However, a lot of the reason it's hard to have industry work in this space is that we come back to general market failures around biosecurity. At some point the one buyer of this product is going to be a government, and probably the government is not going to intend to ever use whatever thing you've built for a long time; it's only for an emergency. Unless we can be building products that are always on, like biosurveillance. I think we should be doing this, because biosurveillance allows you to understand the spread of disease in your country all the time, 24/7, as well as spotting if something engineered comes in. You just have a heat map of where the hotspots are of influenza, of COVID, of RSV, or whatever you like. There is ongoing economic benefit there. But it's often hard to find those ongoing economic benefits in a lot of biological technologies.
Rob Wiblin
You have a really nice blog post on your Substack where you go through 15 different def/acc projects that you'd really like to see the UK, and I guess the US as well, get on top of and advance faster than is currently happening. So if people are keen for more ideas for science that they could go do, or I guess companies that they could start, then they could start by looking at that blog post.
Richard Melange
Please do.
Rob Wiblin
An audience question was: I hear lots about Silicon Valley startups doing biosecurity, including Valthos and Red Queen Bio. Many of them claim to be targeted at AI bio. I'm interested in assessments of how much risk these startups reduce. I guess I should say: Valthos aggregates biological data from commercial and government sources to identify emerging threats and assess risks, and Red Queen works with frontier labs to map AI-enabled bio threats and pre-build medical countermeasures.
Richard Melange
So I'm excited that these companies exist. It is strictly better, I think, that we have some companies that are deliberately trying to deal with catastrophic AI bio threats than not. I think it's a good thing. I won't speak to these companies in particular, because I think it's sort of unfair: I don't know loads about their work, and they are just two of many. But do we need this sort of industry work in general? Absolutely, yes. How optimistic should we be about them succeeding in their missions? I would just recommend to people that what a company puts on a launch website, especially, will often be more general and more optimistic. They may end up pivoting lots, into something narrower and more specific that nevertheless still creates a lot of beneficial societal value. I think there are huge opportunities for AI to speed up, say, the development of medical countermeasures. This is why a lot of people have been calling for making the 100 days mission AI-enabled. We can use AI to design sequences faster, to process data faster, to share data faster. But nevertheless, you can't get around a fundamental blocker, which is that it's actually the clinical trials that really slow things down. I think there's been some great recent work from the Institute for Progress, which you may wish to link to, that talks about how clinical trials are a blocker. Famously in Covid, with the AstraZeneca Oxford vaccine, one of the first vaccines to be made, people disagree exactly how long it took, but people talk about it taking a week, or maybe even just a weekend, to get the design. But then mass producing that vaccine, and especially safety testing it in clinical trials, meant that it was only deployed something like nine months later, give or take.
Rob Wiblin
Isn't this quite a fundamental problem, that safety testing just takes time because you have to put it into human beings and see how that plays out? The fact that AI is getting better doesn't allow you to speed that up so enormously. I guess you can do in silico simulations, but that's not going to be fully persuasive or reassuring to people, so they might just not be willing to take these medicines unless it's a really extreme situation.
Richard Melange
Yes and no. You said you could do in silico simulation. I want to be open to the world where you can "AlphaFold" this thing, in the sense that we can get to in silico simulations that are effectively experimentally accurate. I'm excited, up to a point, about toxicity prediction models that really allow you to know well in advance that this sequence is safe. But as we will talk about in this podcast, that's hugely dual use: if you can predict whether something is likely to be harmful, that can be misused. This is another example of a dual-use technology that we do in fact have to deploy defensively, but we have to think very carefully about how we secure it. I think there's also exciting work on, say, challenge trials. The organization 1Day Sooner is probably one of the leading lights in that part of the field, where you could be testing these drugs in a crisis on healthy young volunteers who know that they or their loved ones might die if you don't get a vaccine sooner, and who are willing to take additional risk, much like members of the armed forces are willing to take additional risk for society.
Rob Wiblin
Hey everyone, Rob here. I just wanted to let you know that the 80,000 Hours podcast is looking to hire some skilled video editors to help us produce more episodes and to turn them around faster. Someone is most likely to be an especially good fit for this if they have great attention to detail and will find editing long-form content satisfying rather than tedious; if they're familiar with long-form podcast or interview content in general; if they have experience working in a collaborative team editing environment; and/or if they have some experience using DaVinci Resolve, which is our primary editing tool. Remote is fine. It's a part-time role with the potential to scale up to full time later, and the pay would be approximately 40 to 74 US dollars an hour, depending on someone's skill and experience. For more information, go and check out the expression of interest on our website, which we'll link to in the episode description. All right, back to the show. Another question from the audience: I'm interested in the bull case of biological design tools. How much progress would we forgo in virology and other medical fields if we chose to regulate them?
Richard Melange
That's a very important question, and one that I don't want to underestimate; it must be considered carefully. I'm excited that I think most progress in AI-enabled biology should be able to continue totally unabated. An example: in my PhD I did cancer transcriptomics, and I built AI models to help us understand, diagnose, and treat breast cancer. I struggle to find misuse relevance in my work. Well, maybe my work wasn't very good, I was just a PhD student, but in the field more broadly, I think it is just pretty much all upside. And so this is something that can sometimes be overstated in AI: it's not every bit of AI-enabled biology under the sun that will immediately lead to pandemic viruses. It is a very small subset, and that subset is predominantly in virology, and the most dual-use parts of virology working on pandemic viruses. This is tricky, because it means there will potentially be a small number of bigger losers, people who will really feel constrained in their work, even as there are many people who will not feel constrained. The many win, but their winnings are diffuse. And this is the setup for a classic market failure. I would say: look to the physicists. Most physics can go unabated, but because of the events in the 1940s, nuclear physicists appreciate that their work is just incredibly powerful and deadly, and so there needs to be responsibility there. I know lots of virologists who deeply care about being responsible and appreciate that, in fact, a little bit of their work might need to be constrained, or they might just have to prove that they are responsible scientists at a legitimate institution in order to carry on. But I want to reassure people that with most AI-enabled life sciences, it's just upside. Really: let's go.
Rob Wiblin
So Andrew Snyder-Beatty and his team at Coefficient Giving, who I interviewed a couple of months back, they look at this entire picture and they think that the main way in which bio is defense-advantaged, or the main defensive technology that you can push forward that makes a massive difference, is personal protective equipment and biohardening. I guess it's a bit more of a dire picture, because this is a world in which, in response to something going very wrong, we're going to have to wear face masks and, I guess, sanitize the environment to an extreme degree that we never have before. But Andrew Snyder-Beatty made the case in that episode, and it's a very good one if people want to go back and listen: episode 224, Andrew Snyder-Beatty on the low-tech plan to patch humanity's greatest weakness. He made the argument that the main weakness someone using bio for offensive purposes has is that these things are very small and have to spread between people in order to infect them, and we can simply insert physical barriers between people that make it extremely difficult for viruses to go from the first person, who you might deliberately infect, to the rest of the world. What do you think of that take? Because PPE and biohardening hasn't been something that you've flagged as yet.
Richard Melange
Yeah, I think that's very fair, and I broadly agree. I would distinguish between PPE and biohardening, because I think they're subtly different. Biohardening is especially good because eventually we might be able to get to a world where we just don't have viral pathogens in the air, regardless of whether there are malicious actors in the world who wish to build biological weapons of mass destruction. And that alone is a brilliant thing. Imagine if we could radically reduce the amount and the spread of communicable disease in the world, if we could just get rid of most respiratory viruses. And I think that thesis is very investable: there are lots of reasons, even ignoring all the stuff about biological weapons, to go towards a world where we better control the built environment. PPE, I think, is really, really important, and I do talk a little bit about this in a forthcoming paper on defensive acceleration that might be out by the time this show is up. I do think there are loads of def/acc opportunities for PPE. We can be getting the best kinds of elastomeric respirators faster and cheaper. Someone is going to build the companies that do this; I think governments and industry should just race each other to see who can do it first. But there's always the issue that PPE is the sort of thing that will only pay off in a pandemic. You are going to suffer, up to a point, from government and industry short-termism: no one wants to buy the thing that they hope never to use. You'll be talking about stockpiling these things for 20 years, renewing them, and hoping there was no pandemic in between. It is sensible; you and I know that that is in fact what people should do. But it's going to be a much harder sell.
Rob Wiblin
Are there any important disagreements, I guess friendly disagreements, between you and the people at the Centre for Long Term Resilience on the one hand, and Andrew Snyder-Beatty and his team at Coefficient Giving on the other, on these general questions? I guess one that stands out: you've said through this conversation and in writing that you think bio is offense dominant, and I guess ASB has been trying to make the case that actually it's defense dominant because of this physical barrier issue. Is that a real disagreement or just a difference in focus?
Richard Melange
Yeah, so I am partially persuaded by this. I think the defense dominance of built environment interventions was the strongest counterargument, so I'm partly there. I think we do have a friendly disagreement, though, in the sense that I really would not underplay that, to date, everyone who's ever thought about biology has thought that as a domain of warfare it is offense biased. Think of that RAND paper I mentioned earlier, which went through lots of different ways biology might be offense or defense balanced: there is a reason they came out four out of five offense. It is also true that it is much cheaper to build pathogens than it is to defend against them. Even in the built environment case, we're talking about refitting every building in the world.
Rob Wiblin
Sounds expensive.
Richard Melange
A little bit, yeah. I think that is the same sort of issue as the "giving a vaccine to everyone in the world" kind of thing. Now, there is a nice difference, which is that you can do it in advance, just like you can in fact stockpile vaccines or even vaccinate people in advance. We do this with some diseases: there's a reason we don't have typhoid or diphtheria very much, at least in rich countries, because we can just preemptively vaccinate our way out of them. And so I think this was a great challenge for me, and I'm excited that there are potentially more avenues. But just for your listeners: please don't underestimate that this is a tricky domain. A lot of offense dominance still abounds.
Rob Wiblin
Yeah, I should say, I guess ASB was talking about a four-pillar plan, the first of which was surveillance and detection of diseases, and the fourth of which was medical countermeasures, which are the two other things that you were talking about.
Richard Melange
Yeah, very much so.
Rob Wiblin
Obviously a ton of overlap. What would you like the AI companies to do differently?
Richard Melange
Lots of things, I'm sure. I'll stick to AI bio; there are lots of things in AI safety generally I would like them to do differently. So I think I might have mentioned before: privileging defenders, trusted tester schemes. There is so much low-hanging fruit here. It is sort of unacceptable that the people you can pre-identify as those working on the most pressing catastrophic biological security challenges in the world, and I do not just mean me here, I mean people who build evals, people who are building actual countermeasures, are not routinely getting the best AI before anybody else, and learning, and having a feedback loop to deeply integrate it with their work. This seems a solvable problem. So that's one thing. But there are other things around improving threat modeling; we went through different threat actors earlier in the discussion, and I think some of that work could still be integrated into frontier safety frameworks. Another one is taking up best-in-class classifiers and safeguards: not every company safeguards against bioweapons discussion as well as others. And another one is really engaging a little bit more with the agentic paradigm, which I think they do well up to a point. So AI companies are really very engaged on this. They know that agents are the other big thing, and we've seen this a lot with Claude 4.5 Opus with code in December. For all that people were like, wow, this is finally it, and Dean Ball said it was AGI: I don't think it was AGI, but it is genuinely very cool. So they're really up on agents. But there are, I think, some interesting things around agents that change the existing thinking on AI bio in particular. We've been thinking about static guardrails on individual tools. But we will move, potentially, to a world where not only are agents wielding many tools together, putting them together, and talking to the user in natural language; we may get to a world where they build tools. And we see this with agentic coding: if you want to achieve a particular function, you don't wait for an academic to release a paper on exactly that function. You just ask Claude Code, or OpenAI Codex, or whatever model, to code it up for you. This is going to happen in biology too. Soon we will have a world where not only do you grab tools off the shelf and put them together, but the agent in fact just summons from the ether, with a very little bit of compute, as long as the data is public, new capabilities that have never existed: personalized capabilities for that particular person, whatever that means. There's going to be lots of beneficial stuff here, that's great; it's called coding, and I'm not saying anything profound in some sense. But I'm not sure that either the policy community or the AI companies have really engaged deeply with what that means. There are real positives here. It could be that everything goes through agents, and then you have a sort of bottleneck: as long as you secure the agents, by definition you secure the biological tools, since soon the agents will be the ones doing most of the creation. But on the other hand, it's all eggs in one basket. I have not seen deep discussion yet, and it's not just the companies, it's from across the whole community, companies, government and civil society, around:
Are we sure that the things that we are doing to defend against these things threats are robust to a world where everyone just has natural language agents that just automate bioinformatics for you?
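To make the agentic-coding point concrete: the capability being described is a coding agent producing small, bespoke bioinformatics utilities in seconds from a plain-English request, rather than the user waiting for a published tool. Below is a minimal, deliberately benign Python sketch of that kind of generated utility; the function names and the sequence are illustrative only and not taken from any real tool or pipeline.

```python
# Deliberately benign illustration of an on-demand bioinformatics utility,
# the sort of thing a coding agent can generate from a one-line request.
# Purely illustrative; not from any real tool or pipeline.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def gc_content(seq: str) -> float:
    """Return the fraction of bases in a DNA sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, G<->C)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

if __name__ == "__main__":
    demo = "ATGCGCGTTA"
    print(f"GC content of {demo}: {gc_content(demo):.2f}")            # 0.50
    print(f"Reverse complement of {demo}: {reverse_complement(demo)}")  # TAACGCGCAT
```

The point is not that this snippet is sensitive (it is not); it is that the same on-demand generation applies to arbitrarily specialised analyses, which is what changes the guardrail picture from "secure each tool" to "secure the agent".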
Rob Wiblin
Yeah. A question from the audience: all of the major AI companies have bio evals. How good or trustworthy are they?
Richard Melange
In some sense they're very trustworthy, because the companies don't make most of them. Most of the top bio evals in the world, at least the ones acknowledged publicly in model cards, are made by third parties. SecureBio in particular makes some of the best in the world; they made the Virology Capabilities Test. Deloitte, the consulting group, absorbed what was Gryphon Scientific, a leading US biological security firm with a lot of folks with deep national security experience; they do great work. There are lots of others: RAND, where the center for AI security and technology has a strong bio evals program, and the UK AI Security Institute and the US Center for AI Standards and Innovation also do evals and red teaming. So in some sense it's pretty good, because they've got some of the world's experts doing it.
On the other hand, it's still not enough, because there's a fundamental problem with bio evals: they are all proxies. We can never do the actual thing that is scary. In cyber you can go much closer to the real offensive activity, because you can cheaply set up a digital environment that exactly simulates what it would be like to attack a particular company. You can even draw on past instances: we can say, well, this company was set up like this in 2023 and this was the vulnerability; can the AI find it? But it is unethical, and in fact illegal, to do the equivalent direct thing with biology. So we always have this fundamental limitation of bio evals.
This is where companies have turned to uplift studies, but very few of them do it. OpenAI did an uplift study a little while ago now, and it was a bit tricky because it was reported as a negative: oh no, it didn't really uplift. I think at the time it was something like GPT-4, or possibly an earlier model, and it probably didn't uplift very much. But they broke down every way it could have uplifted someone, and all the uplift was very slight; sometimes you were uplifted, sometimes you weren't, on a particular aspect of biological weapons development. So then they said, overall, we haven't got strong evidence that you were uplifted. But if you actually amalgamated all the signals of evidence into a single p-value, suddenly it
Rob Wiblin
was clear that it was higher on average.
Richard Melange
Yes. Now, I think it was still marginal. But that was the last one they've publicly done. The Frontier Model Forum, which is a consortium of the companies, has invested in a new big uplift study that hopefully should be reporting soon, and so the companies would say, well, we've just gone in on that. On the other side, Anthropic has carried on doing uplift studies. They're not very well reported, they're only in the model cards, but that's something else that I think is inconsistent. Uplift studies are a particularly expensive thing, but they nevertheless provide more valuable information than bio evals. So on the margin, if you're a multibillion-dollar company, maybe you should be doing the expensive thing that even governments and nonprofits will find hard to do.
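For readers curious about the statistics behind "amalgamating all the signals into a single p-value": one standard approach is Fisher's method for combining independent p-values. A minimal Python sketch using scipy follows; the per-task p-values are invented for illustration and are not the numbers from the OpenAI study.

```python
# Combining several individually marginal p-values into one overall test,
# as in a multi-task uplift study. The values below are invented for
# illustration; they are NOT from any real study.
from scipy.stats import combine_pvalues

# Hypothetical per-task p-values: none is individually below 0.05.
per_task_p = [0.11, 0.08, 0.20, 0.09, 0.15]

# Fisher's method: under the null hypothesis of no uplift on any task,
# -2 * sum(ln p_i) follows a chi-squared distribution with 2k degrees
# of freedom (k = number of tasks).
stat, combined_p = combine_pvalues(per_task_p, method="fisher")
print(f"Fisher statistic = {stat:.2f}, combined p = {combined_p:.3f}")
# The combined p comes out around 0.02: jointly significant at the usual
# 0.05 level even though no single task cleared that bar on its own.
```

Fisher's method assumes the per-task tests are independent; correlated tasks need adjusted methods, which is one reason aggregate claims from uplift studies deserve careful reporting.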
Rob Wiblin
I thought you were going to say that we don't even know how good or trustworthy they are, because when they report the results in model cards and things like that, they don't actually give us many details, potentially for security reasons, of specifically how they've run the experiments or the full results. Is that right?
Richard Melange
Yes and no. Information is sparse in model cards, but it really depends on the company. I think there are companies who could give more information, because their competitors do and there don't seem to have been any grand security breaches as a result, though I respect the general framework that you should be careful. I'm not as worried about this, because some of the best evals are made by places like SecureBio, and they are allowed to go and report the results on public models themselves, on their own website. So we still actually have a lot of insight through the third parties. There has been a lot of criticism about how companies report, though; they report inconsistently in lots of cases. Are they comparing to previous frontier performance, or the best human performance, or the best human team performance, or a novice? What is the baseline? There's a great paper called STREAM, I can't remember exactly what it stands for, from the Centre for the Governance of AI, and they offer some simple guidelines on how to, not automate, but standardize bio eval and other eval reporting. I think that is going to be best practice. I don't think it's widely implemented yet, and it should be.
Rob Wiblin
Another on-point audience question: should we expect terrorist use of chemical weapons, or widespread enablement of other attack types, significantly before we see bioweapons enabled by AI?
Richard Melange
We can quibble about "significantly", but broadly, yes. There is actually great work from Luca Righetti's threat modeling team at the Centre for the Governance of AI, which I think has done more on this than anyone else. There are public databases of historical terrorist attacks, and you can categorise them as chemical, biological, explosive, and radiological and nuclear as well, though those are much rarer. Going through these, you find that chemical weapons attempts are just much more common than anything else, at least on the public record.
Rob Wiblin
And do they succeed more?
Richard Melange
I'd have to check. I think they do succeed a little bit more, but thankfully the vast majority of attempts still just don't succeed. So yes, I do think we are going to see AI-enabled CW before we see AI-enabled BW.
Rob Wiblin
Is that basically just because it's easier to do a chemical weapon attack because it's easier to make a poison than to make a living thing?
Richard Melange
Precisely. They're a lot simpler. And this is where intent plays in again: there are more reasons you might want to kill a single person, maybe unobtrusively, than to start a pandemic that kills millions of people. So not only is it a resource question; it's also a question of threat model, and of why these people want to cause the harm they want to cause.
Rob Wiblin
So I guess precisely for that reason, that chemicals don't spread autonomously, they don't replicate and go worldwide, the potential harm from chemical attacks is much, much smaller than creating a new pandemic virus. But I think you said in your notes that, despite the scale of the expected harm being a lot lower, you would still like AI companies, and people in the AI governance space, to be paying more attention to chemical weapons threats than they are. Why is that?
Richard Melange
I would, much more so. There are several reasons, actually. One is that we are leaving valuable information on the table about the sorts of people who wish to cause harm by building WMD, especially non-nuclear WMD, information that could really inform our response and help us build targeted guardrails that stop malicious actors without constraining beneficial research. If chemical attacks are much more common, then there's going to be a lot more information about people in the world already trying to use AI to achieve this. And that's going to be very relevant for how people might want to use AI to help with biological attacks, because a lot of the elements are very similar: I want to act covertly, I want to get instructions, I want to beat the guardrails and jailbreak the model, I want to get advice about how to deploy things, I want to get encouragement and be boosted by the AI to keep going with my horrible illegal endeavour. And I worry that we are leaving evidence on the table about how modern adversaries, states, terrorists and lone wolves, might be integrating AI, evidence that could really inform the biological response.
Rob Wiblin
So the notion is that if we found a group or an individual attempting to do a chemical attack using AI, then we could study what they did and what went right and what went wrong and what might have prevented that and then apply that to the biological case where the scale of the damage is even greater.
Richard Melange
Exactly.
Rob Wiblin
And that's not happening?
Richard Melange
Partially, but it's not very clear. Companies often say "we are going to prevent CBRN attacks", but then they acknowledge, much more quietly, that they actually currently only eval for biological attacks. Anthropic, for example, explicitly says they don't eval for chemical, but they do monitor. And monitoring is good, great. But where's the sharing of the information? Because if you look at cyber, it's really great: I'm really pleased that both Anthropic and OpenAI have now repeatedly been releasing cyber threat intelligence reports on something like a six-month cadence, where they say, we have observed threat actors in the wild doing these bad things with our models with respect to cyber offense; here's what we learned and how we stopped them. That is in fact a public good that should be shared with everyone. The government and civil society especially need to know it to design governance regimes, but other AI companies need to know it too, because it's relevant for people misusing their models. So where is the equivalent for chem? I think this would be such a useful resource. There is a good argument that maybe it's happening secretly; maybe I'm just not seeing it.
Rob Wiblin
Not one of the cool kids.
Richard Melange
Maybe I'm not cool enough. And certainly for biology, I think that might be very fair, particularly with respect to state uplift. Though in cyber it's always been odd that people can talk about state activity much more freely than they can in chemical and biological warfare. Fair enough. But I really hope that they are at least sharing it with the leading AI security and safety institutes, and with the international network as well. And actually, I would say sharing it with third-party experts, the ones who are designing your evals for you, might be a really useful thing to do.
There are two more reasons this is very important. One is that we've been talking a lot about transmissible biology, and I quite agree that pandemic threats are the ones with the greatest scale, and thus really do need to attract the lion's share of the resource, or at least much more resource than they currently do. But I would not wish us to overly discount non-transmissible biology. Anthrax is the classic example: it would be very concerning if AI enabled attacks that could kill millions of people in, say, a large-scale urban bacteriological attack. That would be hugely extreme, and if that capability were widely shared among states and non-states in the world, it would be deeply destabilizing and awful for loss of life. Even if it wouldn't be quite as bad as a pandemic, it's still sufficiently bad that you really need to pay a lot of attention to it.
Rob Wiblin
Okay, so the argument is that the scale is not so radically smaller, because the worst-case scenario would be people figuring out how to do these chemical attacks in a way that, if pulled off extremely well, could kill millions of people in a concentrated area, and then you could see lots of copycats if other people are inspired by that one instance of success.
Richard Melange
Precisely. And the other reason is just brutally pragmatic: you really don't want to be the company that enabled "only" a massive anthrax attack. Well, yeah, that would be non-transmissible bio, but even in the chemistry case: you know, "X company's AI was used in a high-profile chemical terror attack against a school or a political centre." I think just for reasons of risk management they would want to take this seriously. I applaud trying to have some prioritisation here, but these companies have billions of dollars. I do not in fact think it is an either/or. They don't have to choose between C and B; they could be doing both.
Rob Wiblin
You should become a branding expert, Rich.
Richard Melange
Oh, thank you. Hey guys: don't enable terrorist attacks.
Rob Wiblin
So pushing on from the AI companies to government. I think this week the talk of the town has been Grok, the image model, and the various unappealing images it's been willing to generate for people on X. My understanding is that in the UK the government has been extremely displeased with the imagery Grok has been generating, but found that it doesn't really have any obvious legal lever to pull by which it can punish them or give them some incentive to stop. They've, I guess, been a bit stuck in the immediate term. Could the same thing happen if there was a biological threat, or a near miss, or something like that? Is there a clear lever to pull in the UK, and do you know, maybe, about the US? Is there an issue here?
Richard Melange
Yeah, this is a great question, and let's be clear about this: Grok, at the time of recording, was repeatedly producing child sexual abuse material. It wasn't just something unpleasant; it was something deeply illegal. And you're right that the UK government and other governments found that this is in fact an activity they might wish to intervene on, not necessarily even to fully stop in some sense, but they wanted to have some sort of lever to pull, and they didn't find it.
CLTR has done a lot of work in this space around AI incident preparedness, and also responding to AI security incidents. There's a nice paper on our website from colleagues in the AI unit that went through all the different legislative options the UK had at the time of writing, the sorts of powers you could use, and evaluated them: would this actually help in an AI incident? The overwhelming conclusion was that almost all of them are not relevant.
As for what the UK government might do in the future: they've said they're not going to go forward with a comprehensive AI bill, but they haven't quite fully ruled out what they might do. I think it's always important to look at the rest of the world. We now have other regulation: in California with SB 53, in New York with the RAISE Act, and especially in the EU with the EU AI Act and its code of practice. It's not obvious that the UK should just be duplicating the sorts of requirements in those laws. Would it really help anyone? Maybe they would get some more information on the margin, but they have strong voluntary agreements with the UK AI Security Institute, and I think UK AISI has a great working relationship, not always perfect, but pretty good, with the frontier companies. Emergency powers I think are interesting, because they are different from the other options, and that's where the UK should probably be playing.
On the US: the administration, I think, has expressed skepticism about overregulating AI, and there are valid concerns there. So any government thinking of adding additional requirements should be thinking not to duplicate but to complement existing work, often with things a lot less burdensome than some of the stuff we already have, especially from the EU, and instead be providing the added thing that no one else has: the thing that would really, in fact, keep the public safe in a crisis.
Rob Wiblin
Okay, let's push on to a brief career advice section. If people in the audience have been inspired by what we've been talking about, or, I don't know, terrified by it, and would like to do something to help out: are there any organizations people could go and work at that particularly stand out, or particular roles that people should consider taking if they have the right skills?
Richard Melange
I think there are a number. I'll start not with CLTR, which again
Rob Wiblin
is a place that my wife works.
Richard Melange
Oh yes, that's true. But I will say your wife works there, I work there, so I will defend it in a moment. So, particularly excellent places, though this is by no means an exclusive list, so colleagues, please don't get at me if I forget anyone. SecureBio and RAND are two of the best places in the world for making biological evaluations, and they do a lot of other work besides; some of the strongest people in the world working on AI bio are leaders at those organizations. Then there are the security institutes. The UK's is the largest and strongest by far. The US's is growing: they're doing more work with the frontier companies, doing red teaming, and I saw they've been hiring much more recently. I think that is excellent, and I'm really glad they are being given more money and authority to do that. There are lots of other safety institutes around the world too; for people in other countries, joining your local safety institute might be one of the best things to do. Australia has just announced they're going to have one as well, and I think that's really exciting.
There are other places besides: your classic AI think tanks. More and more of them are starting to have a bio component, and I think that's very important. So for people who have been skeptical of this: well, it looks like the expert community is starting to come around and say, oh yes, this is really big. I think CLTR was good in that we spotted this as a concern earlier than most, but I'm really glad that other organizations, many of whom you have had on the show before, Rob, are starting to do this.
And then you shouldn't underestimate positions within a government related to chemical and biological defence and to national security. If anything, those are particularly rare and precious skill sets that we don't have as much of. Maybe that's more about people who want to transition out of government: if you've worked in the intelligence community thwarting biological attacks for 20 years, please get in contact, I'd love to talk to you. You're doing great work, but I think there's even better work you could do on the margin.
Lastly, fellowships; I've talked about organizations in general. I'm really excited that the ERA fellowship, which has run for a number of years on AI safety, both technical safety and governance, and which I have been supporting, is now running an AI-bio-focused fellowship, where they're explicitly getting folks who really know all about machine learning to learn more about biosecurity and apply their knowledge there, or people who know more about biology to do more coding and more AI-safety-relevant work. It's this intersection the fellowship targets that I'm really excited about, and I'd ask other fellowships to consider doing this more. I was on the GovAI one in 2023, and I was the only person doing bio at the time, and that worked out pretty well for my career, maybe just because it was pretty rare back then. So I think this is an undertapped area.
Rob Wiblin
What about the AI companies themselves? I guess you've had some criticisms of them, but presumably they are doing important work, and there could be really useful roles there.
Richard Melange
Yeah, I've had criticisms because that's where the interest is: how can they do better? But you're absolutely right that the safety teams, at least at some of the frontier labs, do some of the most important work in the world. They're the ones with the models. So yes, I am certainly someone who would personally say that joining very safety-focused teams at the companies is a thing I applaud.
Rob Wiblin
I guess a possible argument against working at them is that they can pay the most, and I guess they have the best recruitment teams, so they might have the easiest time hiring. So if you're someone who's open to working elsewhere, in government perhaps, or in other organizations that just can't pay the kind of equity that Anthropic or OpenAI might, then you should perhaps consider doing those other things, because they could be more neglected. You're looking sceptical.
Richard Melange
I'm not sure. I think you're right that they can pay the best. I think a failure mode has sometimes been, not necessarily in bio but in other areas, that the companies will often want to get people with decades of experience who are really senior leaders. That makes sense, because those are people who are really legible and can talk to government and engage with the national security community, and that's really good. But I would urge listeners not to underestimate how far even more junior contributors who really know the area very well can move the needle on some of this stuff.
Rob Wiblin
Yeah. What sort of skills? Obviously AI expertise is important, and relevant biology expertise is really important. What other sorts of skill sets do you think are really needed in this area?
Richard Melange
Yeah, more than anything it is a strong security mindset. That is something that has come out in past projects as lacking, even sometimes within the community: a real strength in being able to look at a system and ask, how would I break this? Where are the vulnerabilities? Why might this proposed governance idea not be robust to future AI advances? I also think curiosity in really keeping up with AI is very important. There are more biosecurity people around in the world, and I suppose it's an older field in some sense, but the best folks are the ones who are still on the button with respect to AI: they know where the field is at, and they are also able to ask, where might it be in two years? Can I advertise CLTR now?
Rob Wiblin
Yeah, go for it.
Richard Melange
So at the time of recording, CLTR is hiring on the biosecurity team, but that closes pretty soon, so it might not be as relevant by the time you listen. But we rely deeply on contractors. I was a former contractor at CLTR; James Smith, your mirror biology expert, was a former contractor. I think we have a good track record of contractors going on to cool things. So for people who feel they have particular expertise in certain areas, and hopefully we can still link to the old advert: personally, I'm especially excited about talking to people with deep national security expertise, maybe people who've connected with the intelligence community. Please reach out; I would be excited to talk to you, because we're always looking for great contractors to help us deliver on the mission.
Rob Wiblin
I guess, obviously, we will have a whole lot of other roles and organizations that people can consider at jobs.80000hours.org, our job board. And we'll link to a bunch of resources on the episode page associated with this interview. Are there any particular links, any archetypal fellowship or landing page, for people who are trying to get into this?
Richard Melange
At this point, I would say the resource page on the ERA AI Bio Fellowship, which I helped curate, so I think that would be a particularly good one. You might see a preponderance of CLTR papers, but I did put a lot of others there as well that can give you an overview of the field. I hope that this podcast is also useful for that.
Rob Wiblin
All right, let's try to close out with something a little bit hopeful. You've talked about a whole lot of different steps that we could take, all of which I basically agree would be helpful to a greater or lesser extent. But they are all kind of doing damage control in a situation that's quite unpleasant, and none of them is a silver bullet that really fixes it entirely. So fundamentally, it seems like we're about to go through an era when we've enabled a whole lot of incredibly dangerous stuff, we've made it a lot easier, but we don't yet have equivalent technologies that can fully safeguard us and defuse that risk and actually make us even safer than we were before. What might be the technologies we could invent that would get us out of this intermediate, very dangerous window, into a world where, say, biological catastrophe risk was kind of a thing of the past? We've largely defused the bomb; we feel pretty good about things now.
Richard Melange
That is a great question, and I'm glad that we are ending on optimism. The first thing to say is we have eradicated diseases. We don't have smallpox anymore, except in two heavily guarded facilities, one in the US, one in Russia, because we were able to vaccinate everyone so thoroughly that we eradicated the virus. I think we can totally try and do that for other viruses and other pathogens. The second is I am very excited about built environment modifications. We can get to a world where pathogens just can't exist in the air, and huge swathes of where humans live and breathe and work are pathogen-free. We should be going for that. Third, eventually, I think we can have much more personalised medicine. They wouldn't even be countermeasures anymore, just measures: personalised therapeutics and vaccines and antivirals, where you have local capacity. The moment something is detected by metagenomic surveillance anywhere in the world, it immediately feeds into an AI system that generates a countermeasure automatically; that gets built, the design is shared in a decentralised way to manufacturing capability everywhere, and every day locally you wake up and some machine spits out your vial, and you just take a tablet or something. You can get really sci-fi here, but there are worlds where we just are constantly aware of all the pathogens in the world, we are constantly defending against them, we're not allowing them in any buildings, we've eradicated loads of them, and those that do get through, we are responding to instantly and simultaneously, everywhere across the world.
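The detect-design-distribute loop Richard describes can be pictured as a simple pipeline. The Python skeleton below is purely illustrative: every function is a hypothetical stub, and none of the names correspond to real systems, APIs, or organisations.

```python
# Purely illustrative skeleton of the surveillance -> AI design ->
# decentralised manufacture loop described above. All stubs are
# hypothetical; no real system is being described.
from dataclasses import dataclass

@dataclass
class Detection:
    sequence: str   # novel pathogen sequence flagged by surveillance
    location: str   # where it was detected

def metagenomic_surveillance() -> list[Detection]:
    """Stub: continuously sample air/wastewater and flag novel sequences."""
    return [Detection(sequence="ACGT...", location="site-042")]

def design_countermeasure(detection: Detection) -> str:
    """Stub: an AI design system proposes a candidate therapeutic or vaccine."""
    return f"candidate-design::{detection.location}"

def distribute_design(design: str) -> None:
    """Stub: share a validated design with local manufacturing capacity."""
    print(f"Dispatching {design} to decentralised manufacturing nodes")

# The whole loop: detect, design, distribute, with no central bottleneck
# between detection and local response.
for detection in metagenomic_surveillance():
    distribute_design(design_countermeasure(detection))
```

Each arrow in that loop hides hard open problems, validation and safety testing above all, which is why Richard frames it as a vision for the coming decades rather than a near-term system.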
Rob Wiblin
Yeah, so that does sound pretty sci-fi. I suppose it requires us to be able to manufacture proteins at will locally, and to be able to tell what would be an effective countermeasure, basically using just simulations on a computer, to make sure it's not going to damage the body some way or another. But those things probably are all going to come sooner or later, even if not just around the corner, and none of them seems like a completely insurmountable challenge in the fullness of time. I guess we probably wouldn't just take a pill every morning of new drugs invented overnight unless we really needed to. But if this was the difference between life and death, because these incredibly dangerous capabilities were disseminated so widely that it was really the only way to defuse them, then I think probably we would accept that as a cost of living. How long do you think we might have to wait for some of these things? We might really think that AI is going to do a lot of the heavy lifting in getting us to that point, because we would expect AI to be doing a lot of science just in general in the 2030s and '40s and '50s.
Richard Melange
Yes, entirely. You said sooner or later, and I wanted to say it may be sooner, and that's because of very advanced AI scientific capability. I don't think we're going to get anything like that sci-fi world in the next 20 years without AI. But I am totally open to the idea that within 20 years we have most of the vision I just described, much of it accelerated by AI systems.
Rob Wiblin
Cool. Yeah, well, I think that is a legitimately positive vision: that we may not have to endure what feels like a very precarious situation for the entire rest of our lives. I guess we've got a lot of work to do to survive the immediate term and then get to that longer-term, safer future. My guest today has been Richard Melange. Thanks so much for coming on the 80,000 Hours podcast, Richard.
Richard Melange
Thank you so much, Rob. This has been really great. I've enjoyed it so much.
Guest: Dr. Richard Melange, CLTR
Host: Rob Wiblin
Date: March 31, 2026
In this thought-provoking episode, the 80,000 Hours team, led by Rob Wiblin, sits down with Dr. Richard Melange, the AI Biosecurity Policy Manager at the Centre for Long Term Resilience (CLTR). The discussion dives deep into the rapidly advancing intersection of artificial intelligence and biology—specifically, how new AI models are already making it possible to design potent organisms and biochemical agents with capabilities beyond anything found in nature. The conversation covers recent AI-enabled experimental breakthroughs, the shifting landscape of biothreats, evaluation of current risk mitigations, and the promise and challenge of defensive biotech innovation.
Listeners are taken on a tour of cutting-edge empirical results, the real limits (and myths) about barriers to bioweapon misuse, how regulation and model access controls are being approached, and how society might differentially accelerate defenses to stay one step ahead of new existential risks.
Evo 2 Model: Step Change in AI Biosecurity Intersection (01:26)
Implications:
We can now design small organisms that do things better than nature ever has—a preview of coming capabilities for more complex forms of life and potential misuse scenarios.
The rapid advances in AI-driven genome engineering and biology present not just transformative opportunities for science and medicine, but open up existential security challenges. Tacit knowledge barriers have fallen, screening regimes are showing cracks, and powerful bio-tools are largely open, with little consensus on responsible restriction or managed dissemination.
Urgent priorities identified: privileged model access and trusted tester schemes for vetted biodefence researchers; richer threat modelling (including chemical weapons and the agentic paradigm) in frontier safety frameworks; standardized eval reporting; more and better-reported uplift studies; and shared threat intelligence on chemical and biological misuse, mirroring the cadence of existing cyber threat reports.
Richard Melange’s ultimate message:
We can, and must, get ahead of “offense-dominant” AI-enabled biothreats by stacking smart, multi-layered defenses. There is optimism in the vision of a resilient future—but only if action keeps pace with capability.
[Podcast language and tone preserved where possible. Coverage skips intros, ads, and general housekeeping. For extended reading and updates, see episode show notes and listed resources.]