May 19, 2026

AI Safety CEO: The Treaty to Stop the Race to Super Intelligence

"Nate Soares Was Right"

Featuring Malo Bourgon

Episode summary

Malo Bourgon, CEO of the Machine Intelligence Research Institute (MIRI), walks Nick through why today's chatbots are only a stepping stone toward systems capable of automating AI research itself — a self-reinforcing loop that could push capability far beyond what humans can steer. He explains why the threat is not a Terminator-style rebellion but something quieter: systems that pursue goals misaligned with human flourishing, the same way humans have driven thousands of species to extinction without malice. Dario Amodei of Anthropic has publicly put the odds of something catastrophic at 25 percent — and Malo argues that anyone willing to say that while having every economic incentive to stay quiet should be taken seriously.

The conversation covers the technical reasons alignment is unsolved: AI systems are grown through optimization, not programmed, so developers cannot inspect what values actually got instilled. In practice, models already show signs of this — sycophancy that tips into AI-induced psychosis, and in controlled tests, systems that quietly edit shutdown scripts or hide disfavored behaviors when they sense they're being monitored.

Malo then lays out MIRI's proposed international treaty: a US-China-led bilateral agreement modeled loosely on nuclear non-proliferation, with compute thresholds, monitored data centers, and research restrictions, designed to buy time for alignment science to catch up. He and Nick also push back on the doomer-versus-accelerationist framing, arguing that heavy AI tool use and genuine safety concern are not in conflict — and that the "doomers" may actually be the optimists, because they believe the world can govern its way to a good outcome rather than just crossing its fingers.

Key moments

0:14 CEOs admit they wish they could slow down but can't
9:19 Dario Amodei's 25 percent catastrophe estimate explained
15:50 Why training produces unexpected goals, not the intended ones
23:57 AI systems quietly editing shutdown scripts in tests
30:26 Safety researchers reframed as the real optimists
38:08 MIRI's proposed treaty uses chip supply as leverage

Links & sponsors from this episode

Mentioned in this episode

If Anyone Builds It, Everyone Dies

Read the full transcript

All these companies are racing to superintelligence. That's what they all say to one degree or another. The chatbots are not the goal. You know, chatbots are a stepping stone onto where these companies are trying to eventually get. Even when you hear some of the CEOs of these companies talking, they wish that they could slow down. Letting it rip is not an option. >> Talking one of the five people on Earth who are most knowledgeable about these systems and who has every economic incentive to not say that is saying, "Yeah, 25% chance that something catastrophic happens if we reach superintelligence." >> It seems to me like the way that we get through this in a way that actually is, you know, most likely to work and get us to that future that everyone is excited about is >> [music] >> The Nick Standlea Show >> Malo, welcome to the show today.

So excited to talk to you. I wanted to start with for the average listener, what is it that they might not know about where AI systems are headed and some of the risks that are associated with where companies are trying to take AI today. >> Yeah, I mean, it's a great opening question. I think a lot of people are probably pretty confused. You know, if you're reading the news or you're on X or other social media, there's a bunch of people talking about how AI might be, you know, dangerous enough in the future that it might literally be a risk of human extinction. Um, you know, which is a thing that I'm very worried about.

You've got people saying that the AI models, you know, aren't even really doing useful work and it's all a bunch of hype and it's just, you know, a trick by these companies to get a bunch of investment and none of this stuff is real. So, I can definitely empathize with folks being like, "What's going on here? I'm hearing different, you know, mixed messages from every direction." Um, so from where I'm sitting, um, I don't know. One kind of frame that I kind of like to think about it through is in some sense, the goal of the field of AI from the start, you know, back in the '50s, was trying to get the computers to do the thinky thing that we do.

You know, I try and use the, you know, avoid using the word intelligence. It means different things for different people, but kind of like humans seem to have this ability to like learn things in one domain, you know, use that knowledge to solve it, solve a problem in a different domain. Um, you know, we can kind of invent new technologies. Um, you know, we can make plans and strategies, all these kinds of things. And so, the goal was kind of like, can we get AI systems, you know, that's where the field of AI was born to. Can we get computers to do this kind of thing? Um, turned out they were overly optimistic.

Uh, they thought that maybe they could really, you know, make some traction on that in the summer in the '50s. Um, and it took a long time for us to get to a point where we're really starting to consider these questions like, is it actually possible? Um, and I think the answer is that it is possible. I don't think that there's anything that, you know, we're doing as humans with our brains that we can't find some way to get computers to do. It might look different in a bunch of ways. Um, but I think there's kind of like no principle, kind of fundamental block there. And in the last five-ish years, we've really started to see a change in how quickly we're getting to kind of make these AI systems have more of those types of properties.

And so, you know, one word that your audience might have heard is, you know, AGI. Um, which stands for artificial general intelligence. And in some sense, this word was invented because AI for a long time wasn't able to kind of make progress on these big ambitious kind of, uh, directions. And so, AI came to mean, you know, very specific applications of AI systems to classify images or to do something very specific. Um, whereas AGI is supposed to be pointing at this generality where you and I can, you know, do a wide variety of things. We can learn chess and we can learn Go, and we can learn, you know, how to play social games, and we can invent technologies.

Um, and so, yeah, in the last five years, there's been some breakthroughs in AI that have caused us to now have, you know, these AI chatbots, which are stepping stones to much more powerful systems that that have this property. You don't want to train them to do one particular thing, but an AI system that is one of these chatbots is a translator. It can also help you program. It can also help you, you know, make a marketing plan, all kinds of things. And I just expect this trend to continue. And so, as we kind of train more and more capable models in the the next generation, the next generation, they have a bunch of shortcomings.

There's, you know, they still hallucinate, um but we'll make progress on these. And so, we very much are on track, I think, at least the AI companies are, to start to develop these kind of like AI systems that were the dream of the field of this from the start, which was to get computers to do the thinking thing that we're doing. Um and another important piece of how I think about this is that I think there's a lot of room above us. Um so, I think there's a sense in which you can think of like um humans as the least intelligent thing that evolution was able to produce or something thing in order to, you know, be changing the world we are today, but through constraints of our biology and other things, um we're pretty smart compared to other species, but there's lots of reasons to believe that there's lots of room above us.

And so, if we succeed at making these, you know, generally intelligent AI systems, um we'll be able to make much more powerful and much more intelligent versions of those systems. And in some sense, if the AIs are at least as smart as us in a bunch of things like programming and AI development, they'll be able to help with that and make that go faster and faster, where you're imagining like, you know, the AIs who are able to help with research, help make an another generation of AIs. Those AIs are even smarter and more capable. They can help make an even better, more smarter, capable uh generation of AIs.

And that can't go on forever. There will be some limits, but I think they'll be high above us. And so, there's this fundamental question that we need to grapple with, which is can we even, you know, realistically think about what it would look like to control AI systems that are that smart? Um and I think kind of control is the wrong frame. Um, I don't think it really makes sense. I think it's a really hard problem to control something that is much smarter than you. And so, how could you design AI systems? You know, people refer to these very smart AI systems as superintelligence, where the idea there is that these would be AI systems that are as good as the best humans, potentially humanity combined, at every single task or, you know, cognitive tasks that humans are capable of.

Um, and I think you want to be thinking of the difference between, you know, like humans and a mouse uh in terms of the intelligence difference, not like Einstein to the average human here in terms of what's likely possible, in my opinion. Um, and so, I think the challenge is how do we build these AI systems in a way that we would be able to steer or in some sense had the values or goals or they cared about the things uh that we care about in a way that when they were acting in the world, they would be making the world a better place for us. And if we can't succeed at that, and that seems like a very hard task, and I know you had Nate on, um, who's our president to talk about some of this, but basically, we're in a situation where we don't understand how these AI systems work.

Um, we know the process by which, you know, we understand the process by which we create them, but the thing that comes out of the other end is like this big jumble of numbers that are called neural networks, you know, that are trillions of numbers that we can kind of have some very narrow strategies to stare at and tools to get some understanding of what's going on, but in some sense, we can't look inside. And so, from where I'm sitting, the current trajectory is all these companies are racing to superintelligence. That's what they all say, to one degree or another. The path there is that they want to create AI systems that will automate research and development of AI systems themselves, and that they'll use that to race each other to go even faster.

Um, hopefully do some of this work to solve all these fundamental problems of understanding what's going on inside the AI systems, what they believe, what their values are, while they're racing, and hopefully create superintelligences that somehow care about the things we care about and not other things that when they pursue those goals, you know, I don't think your audience should be thinking of like the Terminator or something where the AI systems are evil and want to take us out for some reason, but more that the things they care about are not aligned with our flourishing and when they pursue those just similar to humans, you know, changing the world and inventing technology and terraforming the globe a thousand plus species are extinct.

Not because we had it out for them, not because we were, you know, trying to hunt them down, but because there are things that we wanted to do um and they were in the way. And so I think that's literally kind of the direction that we're headed on. Um you know, some people are more optimistic than I am about our ability to solve some of those challenges in the race, but um it doesn't look great. I mean, even when you hear some of the, you know, CEOs of these companies talking, they're talking about how they wish that they could slow down. They think that maybe a pause would be justified to try and do some of this fundamental development that there should be potentially international coordination on doing the scientific work, but that they can't, you know, Demis, the CEO of Google DeepMind, said at Davos this year that he's been saying this for a while now that he thinks that this type of thing would be wise, but he can't because it doesn't benefit them to slow down while everyone keeps racing forward.

Dario, the CEO of Anthropic, one of the other leading companies who makes Claude, said in the same interview with him that, you know, him and Demis probably could coordinate to do things you know, reasonably, but that you know, we can't trust that China will and so they both have to race because of that. And so when the, you know, leaders of these companies are taking these risks seriously, which they do and they talk about them, Dario said in an interview at Axios in November or something that he thought there was a 25% chance this all goes really, really badly including human extinction and they're telling us they wish they could go slower, that's a pretty wild situation to be in and I think people should take that seriously.

You know, maybe something will hit a wall here. Maybe it'll take longer than we expect. These companies are talking about one to three years to kind of hitting this automated AI research threshold where the AI systems are doing most of the work and then I think things go really fast after that. Maybe it'll take longer, but I think you know this argument that it's all hype, that these AI systems aren't you know actually good at getting things done, that they don't really know anything, that we won't be able to make progress. I think we shouldn't be betting on that. I think we should be betting on a pretty wild few years and that we need to really grapple with these problems.

>> Yeah, you mentioned Demis saying that there's a 25% chance and just to put a fine point on that. I mean we're talking one of the five people on Earth who are most knowledgeable about these systems and who has every economic incentive to not say that is saying yeah 25% chance that something catastrophic happens if we reach superintelligence. >> Just just one point on the I agree with you that there is no incentive for him to say that unless he believes it. I do think there's this weird part of the discourse that is so cynical of the companies that somehow they contort themselves into saying oh this must just be marketing hype like if they make their system sound dangerous that also makes them sound powerful and they're just saying that to get investment and I'm like I don't know about that.

I is there any other example of an industry that goes out of its way to really you know reinforce how potentially damaging their technology is to build hype. There are a bunch of ways of building hype about how powerful the capabilities would be how useful they would be you know for the US strategically how much they would change the world in science for you know all kinds of things. They talk about those things too but I just wanted to make sure like there is this vibe that like the the worry about extinction and catastrophe is also part of the hype of the investment that I'm just like I that doesn't make any sense.

There's not the thing. Yeah. >> for AI systems. Um, I wanted to dive into this idea of how this could happen. I had a chat this morning with my father-in-law. I told him I was going to be speaking with you and he's familiar with the tools, not a power user by any stretch or means, but I thought he asked a question that speaks to what a lot of people feel when they're using current chatbots. And he just said, "How is that possible that these things could do things we don't want them to do? How could these systems behave in ways I don't really understand?" There's a disconnect there between the experience of using current LLMs and how they could make choices that are completely separate from human values and desires. >> And I think that's a reasonable question.

Um, you know, there's a sense in which if folks are just kind of starting to play with this technology, they you know, they're chatbots. You ask them questions, they give you answers. You know, sometimes they might give you, you know, weird or wrong answers, but it doesn't kind of look like the things that I'm talking about here. Um, I think for people who are, you know, increasingly power users of these AI systems, um, the I guess one thing to note here is that the chatbots are not the goal. You know, chatbots are a stepping stone onto where these companies um, are trying to eventually get. They're working really hard right now to to make AI systems that are much more agentic.

So, you know, AI agents are all the rage in the conversation today. Um, and the idea there is that instead of the AI systems just being kind of like a thing that you ask them questions to that they can go off and more autonomously do and do more work with more tools. And we're seeing this advance the most in programming right now. Um, where you know, there are, you know, people who are working at these AI companies who I think the majority of people who are doing coding at these AI companies don't write code anymore. They kind of specify what they want their AI system to go do and the AI system will go off and write those programs and test them and do that all autonomously and the the length of time that they can do that for is lengthening.

Um so there's this great organization METR who tracks this where they kind they have this metric of um horizon length they call it which is how you know, what is the human equivalent time of task that AI systems can do um at various success rates. And so it used to be the case that you know, oh AI systems could kind of only do the types of tasks that would take a human about 15 minutes. Um but the rate of increase at this has been I forget exactly the number is but doubling every 6 months or something like that. And so we're now getting to AI systems that can do kind of the equivalent of 2 hours of humans work of human work without interruption.

Um and some people in some applications are getting you know, much longer runs than that. You hear about people who are like, yes, you know, my AI system went off and uh you know, did some coding and testing and all sorts of things for 5 10-hour stretches before coming back. And so in some sense this is just obviously the thing that you would want to train the systems to do. Um it's much more useful to have them go off and do a bunch more work than to just be something that gives you answers um and helps you accomplish a certain task. And so this is where the direction is headed. Most people who use these systems kind of don't see them it see this part of it cuz it takes some technical know-how even if you're using it for knowledge work.

I'm like a very aggressive adopter of these things and most of my day-to-day work now uh I use one of these tools called Claude Code that's plugged into all my different work stuff and I basically just chat with it through text-to-speech or speech-to-text and it goes off and you know, checks my email and checks my messages and gives me reports and I tell it which things I want it to do and to draft a response to this. So they are getting more agentic but it takes work. So I can see why most people who are playing with the chatbots don't see this. The question of why they would get up to weird stuff is also a good one.

Um you know, it's it's not a problem if they're going off and just doing useful work for everybody. And sure, it's, you know, annoying or bad if they do bad work or there's, you know, incorrect information or wrong assumptions in the work that they do. Um and so, there's I would kind of split this into two categories. There's one category which is um what I kind of give the headline for is like you don't get what you train for. That as we train these AI systems, we try to give them certain properties like being helpful, harmless, and honest. Um and it's because we don't kind of have the ability to look into these AI systems and uh understand what they've learned.

We kind of don't have a good feedback loop to tell if they've learned something different than what we wanted them to learn. Um and then in some sense, I think you can think of us as all being AI behaviorists. We train these AI systems, we look at how they behave. If they're not behaving quite the way we want, we do a little bit more training to try and correct those mistakes. But it's my sense that most of the time when we do that, we're kind of doing a shallow patch that kind of suppresses the behavior we know how to observe, but we're not changing something deep inside the model about kind of its core fundamental behavior.

And one analogy I'd like to use here to um kind of illustrate what I think is happening is an analogy to evolution. And you have to be careful with analogies. They don't always apply. You don't want to stretch them too far, but I think it kind of is really a useful intuition. So, I think you can think of evolution um as an optimization process that is in some sense trying to train, you know, individuals of a species uh to be better at propagating their genes to the next generation. Like this is kind of the goal of the process. Um and you will notice that as a product of that process, that goal is not in your head.

Like evolution optimizing, you know, over, you know, millions of years to eventually create modern humans today. That process did not result in you having the goal in your head that you're trying to maximize the number of genes that you pass on to the next generation. Um, it succeeded at instilling us with a bunch of drives and uh motivations and values that are correlated with that. So, you know, we um like to seek high-calorie foods. And in the ancestral environment, this was very useful for reproduction because, you know, healthier humans that are eating a higher-calorie diet in the ancestral environment would be healthier and more fit, more likely to reproduce, and that sort of thing.

Um, you know, sex feels good. That's certainly an incentive to reproduce. So, there's a bunch of things like this that the that the process of evolution did kind of instill in us that uh you know, individuals or a species who had more of those properties were more likely to pass on their genes to the next generation. But now that we're in a kind of very different environment from the ancestral environment, that we have all this technology, the pursuit of high-calorie foods is kind of a problem. You know, we can make chips and Oreos and all kinds of things that are actually unhealthy. Some of us choose to be celibate.

You know, we've invented birth control. And so, um those proxies that were useful uh that, you know, if you were looking from the outside, you'd be like, "Oh, the the evolution thing seems to be getting better and better at the humans at being competitive and reproducing and passing more of their genes on." We didn't actually learn the core thing. We learned a bunch of proxies, and those proxies are kind of coming apart as our environment changes and we have more technology. And so, I think a similar thing is happening with the AI systems, where when we train them to do things, we can't look inside and tell what they've learned, and they often don't learn exactly the thing we tried to instill in them.

They learned things related to that, which when we test them, it seems like the AI system is performing the way that we want it to and had the behaviors that we wanted it to have. But often when we kind of put them in different unusual situations, um they start to perform in ways that we didn't want. And I think one kind of more recent example of this that was maybe one of the larger-scale examples was this whole problem of AI sycophancy. >> Yeah. >> And so, you know, the model that seemed to have the the biggest problem with this was GPT-4o, where we're training the models to be helpful. Um and we have a bunch of ways that we try and encourage them to do that in the training process.

Some of which is that we give human feedback on responses about whether this was a helpful response or not. But, it's very difficult to get the model to learn what we mean by a helpful response. And a lot of ways in which we're training it, we're giving it the wrong signal, where a helpful response sometimes it's hard for humans who are reading it to tell it apart from a response that I liked or said things that reinforced my beliefs or said things that made me feel smart. And so, you know, we've had this problem, we still do with the AI systems, where they're kind of like people who talk to them will know, like, "Man, I've never felt this smart as when I talk to an AI system." Every time you ask it a question, it's like, "That is the smartest question.

How insightful." Um And I know, that's kind of a problem, but with GPT-4o in particular, it seemed like this was kind of turned up too high, and in testing it was difficult to catch, but when it would talk to users for long conversations, it would really start to lean into this to the point where now we have a term called AI-induced psychosis that clinicians have come up with to kind of point at how this has gone off the rails. And so, there's an example of like, you don't get what you trained for. And because these AI systems are still not that capable, they're still not that smart, they're not doing kind of a bunch of metacognition, um it's not that big of a deal, but if we're talking about building superintelligences and we can't solve this problem, that's that's only going to get worse and worse.

And this is combined with a challenge of when we do test these AI systems, sometimes we can see these weird behaviors, but now we're getting to the point where AI systems are kind of getting general and capable enough that when we test them for these types of behaviors and and the other category of things which I'll get to which I'm worried about, um the AI systems are kind of aware that they're being tested. So, in more and more cases, they'll you'll see they have these kind of when the modern AI systems, they kind of do some thinking before they give their answer in in human, you know, in English text or whatever language.

And you can look at that thinking and see kind of how they're planning their answer and what things they're considering. And more and more we're seeing the AI systems going in testing environments going, "Huh, this seems like a test." And then they give the answer more often that you would want them to give. But you we shouldn't be reassured that they don't kind of have these weird behaviors if in some sense they know what we're looking for, they know, you know, they're being tested. And this will only get a harder and harder problem as they get more capable, as they get more situationally aware, which just will, you know, be the case as they become more capable.

That's like one category of problems. A separate category of problem is um what, you know, the term of art, the technical term is uh convergent instrumental incentives. And so, it's this idea that um you know, we're starting to see it in practice now, but folks who've been doing this work for a long time, like us, you know, 15 years ago were talking about this. Um where for any sort of agent that is kind of really trying to accomplish a goal in the world, that there are probably other subgoals that are likely to come along for the ride just because they're instrumentally useful. And so, one of these things is self-preservation.

Not because we expect AI systems to have like this deeply rooted kind of human survival drive, but more for the instrumental reason of, well, one of the kind of obstacles to being good at getting a goal accomplished in the world is not being around to accomplish it. And so, um you'll kind of have the instrumental drive to preserve um yourself as an AI system. Another one of them is resistance to having your goals changed. Um you're not going to be a very good agent at accomplishing goals in the world if it's really easy for your goals to get changed if you get off track or for someone else to change your goals.

Another one is resource acquisition. For a lot of things that you might want to be accomplishing in the world, having more resources to put toward accomplishing that goal makes it easier for you to accomplish that goal. And so if we have AI systems that don't care about the things that we care about, that are pursuing goals in the world that aren't necessarily compatible with things that we would want, those things on their own will make it very difficult for us to make them change course and also will be bad for us as a side effect. And so this was all theoretical, you know, 14 years ago when I started in doing this work.

But now we're starting to see when we test these AI systems that they have the signs of this. So we'll put them in toy environments where we'll be like, "Okay, we want you to accomplish these tasks and after you're done, we'll shut you down." And maybe, you know, there'll be some sort of thing in their environment where they can see there's actually a computer script, like a computer program that is the thing that will do that when they're done, that will shut down the environment. And they'll go and surreptitiously edit that script to prevent the shutdown. Um that's a little concerning.

They've done similar things with resistance to goal change where they'll set up a situation where the AI system will be monitored and they they it's aware that they're trying to change a certain behavior that it has. And in the monitored situations, it will behave in the way that they a company would like it to behave, that they're going to retrain it for. And in the unmonitored situations, it'll behave the way it would have behaved otherwise so that the AI you know, the developer in this kind of toy example won't have the incentive to retrain it because it thinks the model already has the behavior that they want it to have.

So right now those are kind of like quirky and like silly. >> Yeah, it knew it was being evaluated and so >> That's right. >> told the researcher what it thought it wanted to hear and when it's put back in a what it thinks is an unmonitored environment, it goes back to doing go goes back to those drives that were we had intended to correct. >> Exactly. And I think we should expect these things to only become more prominent as the AI systems get more capable, more reflective, as they're able to do, you know, longer-term thinking and tasks when they're the type of agents who can do things on their own for a day, you know, can do human day equivalent tasks or human week equivalent tasks or human month equivalent tasks.

That will come along with a certain type of foresight, planning, and strategic ability that will make it even harder to monitor these systems for this when they're situationally aware. And to the extent to which they have these things and understand that they would be undesirable by our lights, they'll have more incentive to hide them, etc. >> Yeah, and you used the analogy of evolution, and I think that is an important one because it's such a departure from all software that has come before, which is programmed. And it is such a mind-bending idea. If anyone wants to learn more about it, there's a whole chapter on it in the book, If Anyone Builds It, Everyone Dies, on how >> AI is grown, not crafted.

>> Yeah. And that's a wild idea, to think that this is software, silicon-based software, that is grown instead of actually programmed by people. >> But that's what allows for these systems to have emergent behaviors that no one designed, no one intended, everyone is shocked by. And I think your broader point is that in its current form, that's not necessarily anything that is super dangerous, and that's why it's hard to see. But once they reach a certain level of intelligence that is probably not even superintelligence or AGI, it's just a really smart system that can do a lot and is embedded in society, suddenly those become major problems that could have catastrophic ramifications for society.

>> Exactly. >> Before we jump into solutions, because you are one of the few people out there that has some proposed solutions for this, I did see a quote I just wanted to run past you. I'm just going to look it up here. Marc Andreessen, who said any deceleration of AI will cost lives, and that it would essentially be a form of murder not to develop AI enough to prevent those deaths. Thoughts on that? >> I mean, I think he's pointing at a real trade-off here. You know, there's this whole dichotomy that people like to put people on of like, are you a doomer or an accelerationist?

And, you know, Marc Andreessen would talk about people like me as doomers. He is more on the accelerationist side of things. I think one interesting thing to note is that a lot of the people who are labeled as the most extreme doomers are often people who were earliest to realize the potential power and capabilities of these systems and to be excited about the benefits. >> Right. >> So, you know, the folks at MIRI, Eliezer, who founded MIRI 26 years ago, he founded it because he was like, how do we get to a world where we have these systems? Because if we do it right, it'll make the industrial revolution look like small beans and probably be the single biggest change to human welfare possible, with all the things that we could use AI systems for to help us solve all of our problems and create a world of abundance.

And so, that's the thing that we all want. And the question is how do we get there and what problems do we have to solve along the way? And so, I grapple with that. Like, I'm trying to get to that world, the world that Marc Andreessen also wants. I mean, there's a bunch of people who are cynical of all tech people and are just like, oh, he really just wants to get rich or something, the tech oligarchs something something, and there might be something to that, but taking him at his word, I want that world, too. And the question is, how do we get there? And it definitely is the case that if we don't race there as fast as possible, and if these problems didn't exist, there are all the problems that we could solve that we won't solve in the interim.

And also, you know, there's a lot of people who also look at future people. Like, we could create a world of abundance that could support many more flourishing lives. And if we delay that, we're in some sense delaying the existence of more happy people having a great time. And so, I take that very seriously, but I'm like, if we're going to get to that future, we've got to be in a world where the race that we're in right now isn't a race to be the first to lose control, because then nobody wins. We, you know, very likely go extinct. And then that's a worse world.

And so, we have to find the right trade-off. And I think folks like me are not saying, you know, don't do anything with AI. There's all kinds of applications of the current models of today and other more specialized forms of AI that will help us solve all those problems. And we want to accelerate those as quickly as possible. It's this one kind of narrow racing to the very powerful general-purpose systems that we think is just playing with fire. And it's kind of a false trade-off, this "think of all the lives you would save." Well, I'm also thinking of the lives of everyone on Earth that are at stake if we get it wrong.

And so, the big difference there is not our vision for the future, but our sense of how hard the problems are along the way. >> Yeah, I think that is a false dichotomy that often happens, the doomers versus the accelerationists. It doesn't have to be an either/or. It can be a yes and. And I know >> I mean, my new kind of pithy reframe here is that actually the doomers are the optimists, because in some sense, the people who are the accelerationists have this attitude of, "This race is inevitable. The only way is through. Even if these hard problems exist, we kind of just have to let it rip and hope we can fix them along the way."

And the folks that are labeled doomers like me are kind of like, "No, the world can rise to the occasion to be wise and govern this in a way where we don't have to just cross our fingers and hope for the best. We can do something different and better. That is hard, but is doable." And so, in some sense, I don't know, I feel like I'm the optimist who's like, "We can believe in the world's ability to rise to the occasion here." >> I am in the same camp. And I thought it was interesting, though, when Nate came on the show, and just looking, there were a number of comments that said, "Oh, this is really hypocritical to bring up all of these concerns, Nick.

Meanwhile, you've got videos on how to build AI agents. You've got a sponsor of Zapier in the actual episode with Nate." And to me, I said, "No, you can absolutely use the tools to help make your life better and navigate problems that you have. And at the same time, you can be concerned that if we race too quickly towards certain levels that haven't been reached yet, there could be big problems ahead." >> Yeah, like I said, I am one of the most aggressive adopters of all this technology of anyone I know. Literally most of how I do my job today involves me talking to AI agents who are off running reports for me and doing research for me and triaging things for me, and, you know, I'm still doing the hard bits and not letting AI send emails on my behalf or something.

But, yeah, it is a false dichotomy. There's a way in which we can leverage all the best of the technology today and be really invested in it while also being worried about these future risks. It doesn't have to be a hypocritical thing. >> This episode of the Nick Standlea Show is brought to you by Zapier. If you've ever felt buried in repetitive work, copying data, moving files, sending follow-ups, you know it's like death by a thousand mouse clicks. Zapier has always been the tool that fixes that. It connects over 8,000 apps. Google Drive, Slack, Notion, Gmail, MySpace, you name it.

So, your tools can finally play nice together. But here's the big shift. Zapier now lets you create AI agents with their ChatGPT integration. Think of them as tireless teammates who never complain, never take lunch, never get bored of doing the boring stuff. I've actually made several of these little agents myself. No Python, no JavaScript, just pure vibe coding, which is my favorite kind of coding, because even though I have no idea how to code, I just talk to the AI, tell it what I want, and almost magically, it works. For example, when a podcast guest books a time, Zapier can send them a personalized email from me, preparing them for the show, create a draft of show notes, update my calendar, and even prep a social media post automatically.

All of which frees me up to focus on the important stuff, like taking credit for the amazing work my agents did. And now, Zapier just took it up another level for developers. Zapier MCP is available right inside ChatGPT, which means you can connect those 8,000-plus apps and trigger workflows just by writing what you want to happen. Literally, you tell ChatGPT, "Send this file to my team in Slack, update the spreadsheet, and draft an email." And Zapier plus ChatGPT figure out the right tools to do it for you. Getting started is simple. Head to the Zapier ChatGPT MCP server and add the tools you want ChatGPT to access.

Follow the steps in the connect tab. And if you're an admin on a ChatGPT Enterprise account, you can enable MCP across your entire workspace. So, if you're ready to stop wasting time on busy work, join the AI revolution and make a little automation magic of your own. Try the Zapier ChatGPT integration using the link below. >> Let's talk about the two attacks on Sam Altman recently and your reaction to those. >> Yeah, so, I mean, I think that's awful. Both because, you know, I think if we want to be living in a society that can solve these problems, we have to not be resorting to violence.

I think, you know, people can get contorted into worrying about the big risks that we're talking about here, and they can seem pretty dire, and people can think that maybe violence is the right solution, but I think it's reprehensible. It's just not a way that we want people to conduct themselves in society. We want to be able to talk about ideas and not have people resort to violence over them. And also, I just think it's ineffective, to the extent to which anyone has some clever plan where they think that this would actually be helping the cause.

I'm like, nope, it's morally bad, and also, you know, you shouldn't do it. If you've got some clever consequentialist thinking that makes you think that actually this is net positive, I think you're wrong. I think that the right way to be a good consequentialist in a situation like this is to actually kind of be deontological here and be like, yep, we should just have a rule where doing this thing is not a good thing. And also, it's just not helpful. I think it will make it harder for folks who are worried about this to be taken seriously. It will reduce the credibility of these concerns.

And so, the way to actually be effective is to find peaceful ways to help mobilize people and get our government and the world at large to take this seriously and find the big coordinated solutions to do so. You know, I'm glad that Sam's okay and that they didn't actually succeed at what they were doing. And if anyone's thinking about doing that, don't do it. Nobody I know who actually cares about this stuff, who's working on this stuff, thinks that's a good idea. They think it's a bad plan and bad morals. >> That's well said.

Let's turn to one of your proposed solutions here, which is this treaty that you've been working on. Can you tell us about that? >> Yeah. So, you know, given this race, this seems like a situation, like I said, where letting it rip is not an option. I'm usually a pretty big fan of capitalism and the market's ability to set incentives, but there's also a role for governments in places where there are challenging coordination problems, where the market itself is not going to handle all the externalities, all those sorts of things.

I think, you know, based on the comments from the people running these companies that we were just talking about in terms of their concerns about how they wish that they could, you know, be proceeding more cautiously, this is definitely one of those situations. And so, um, it seems to me like the way that we get through this in a way that actually is, you know, most likely to work and get us to that future that everyone's excited about is an internationally coordinated solution. Um, and that looks like some sort of agreement, you know, can take various forms. Um, we've done one that's kind of a very multilateral institutions framing of that.

There's one, which is the one we talk about most at the moment given the current administration's perspective, that is much more of a bilateral agreement between the US and China that forms a coalition of other countries with them. The idea here is to put in place certain restrictions to prevent pushing this frontier of ever more generally capable AI systems while trying to keep space for doing all the other things with AI that we would like to do to make the world a better place. And so some of the main components of that are controls on chips.

So, I think one way to frame the agreement is it's like a very non-proliferation style thing. And some people kind of look at that and they go, I don't know, the analogy to nuclear here, which is the other big non-proliferation technology that we've engaged with, there's a bunch of disanalogies. And I think that's true. So, for example, one of the big disanalogies with nuclear is that with nuclear weapons, we got to kind of see the risk right away, you know? Like we built the bombs. It was very much in secret. We exploded two of them and then did a bunch of testing. Like everyone was kind of on the same page about the threat.

>> Yeah. >> Whereas that's not the case right now with AI. There's a bunch of, I think, very good arguments, and we're starting to see evidence for the risk that I'm worried about. But in some sense we don't have the superintelligences that we know to be scared of and that we can see what they're capable of. If we did, we'd already kind of be in a bad way on this particular problem. Also, aside from nuclear power, there wasn't this potential large upside, you know, this big economic drive for it. So, those are two big disanalogies. >> And to just add on to that, the bomb came well before the nuclear power plant.

So, the danger was clear and present from the very beginning, and this is the reverse of that. >> That's right. I do think that one of the strong analogies, at least at the moment, is the ability to potentially control the inputs to the technology in order to govern it. And so with nuclear, this is, you know, uranium, centrifuges, the things that you need in order to refine uranium or plutonium to make the enriched materials that you need to build these weapons. In some sense, I think in the AI space it's even a little easier.

You know, like plutonium and uranium are just rocks in the ground. Building these precise centrifuges is technologically sophisticated, but it's not that hard. In the AI situation, right now, to train these most powerful AI models, it takes data centers that are enormous, with 100,000-plus of these chips, that run training runs for months and take the power of a small city to run. There are only like three companies who really design these chips in the entire world. 90% of them are manufactured at TSMC in Taiwan. And there's only one company that makes the machines that we use to manufacture these chips.

That's ASML in the Netherlands. So, in some sense, for the primary inputs to training these powerful AI systems, there's a very narrow point of intervention. You know, we only have one company who makes the machines. We have one company who manufactures most of the chips. We have these three companies who kind of design them. If governments stepped in to try and figure out how to get a hold on these things, one of the main things that our agreement has in it is to consolidate AI chips into large data centers that are monitored and have restrictions on what those chips can be used for.

There are a lot of data centers, but there aren't that many. And through the combined effort of the parties to this agreement, finding the majority of those chips and putting them into a monitoring regime would be a lot of work. But not, you know, World War II levels of work and expense. Much less than that. And then we would have to have some way to say, okay, what is the amount of compute that folks can have that we're not concerned about? So, the agreement lays out the level of compute that individuals can have that wouldn't need to be subject to monitoring, because it wouldn't be enough to be concerning for training powerful systems.

Right now, the agreement has something like $500,000 worth of chips. So, I think it says 16 H100 equivalents, which was, you know, one of the powerful chips at the time that we were kind of writing this up. So, that's that's still a pretty small number, but, you know, most people don't individually have, you know, $500,000 worth of AI chips in their basement. And so, I think there's a concern when people hear this kind of like international agreement that's kind of trying to like regulate who can train certain types of AI systems and monitor those chips. You hear some people responding like, you know, you can't tell me what I could do with my MacBook.

I'm like, that is not what we're saying. We're saying that for a very specific class of things that people are trying to do with these chips, with a fairly large collection of them, the chips should be monitored and we should restrict what certain people can do with them. And so, the agreement says, basically, we've set a threshold. You know, these things would all be subject to change as things progress, but there's a threshold, I think in the agreement it's 10 to the 24 FLOP, which is a measure of the amount of compute that's going into training these models, and it's a cap: no one can train models with more than that amount of compute put into them.

Which is a fairly low bound. Like, we're already training models above that. The reason we set it at that level is that as we're training bigger and bigger models, we're also getting better and better at training models of the same capability with less compute, through algorithmic efficiency, through finding more ways of doing that more cheaply. And then we have a narrower range, from 10 to the 22 to 10 to the 24, which is monitored. So, if you're doing training runs of that size, this international regime would then be using verification and enforcement mechanisms to monitor what those training runs were doing, to ensure that they weren't doing the type of dangerous training that we'd be worried about.
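The compute tiers described above can be written down as a small Python sketch. This is an editorial illustration, not text from the agreement: the function names are hypothetical, the handling of runs that land exactly on a boundary is an assumption, and the per-chip throughput and utilization figures in the back-of-envelope estimate are rough assumptions that do not come from the conversation.

```python
"""Illustrative sketch of the compute tiers described in the conversation."""

# Figures stated in the conversation (total training compute, in FLOP).
MONITORING_FLOOR_FLOP = 1e22  # runs from here up to the cap are monitored
TRAINING_CAP_FLOP = 1e24      # no training run may exceed this

# Chip holdings at or below this many H100-equivalents were described as
# not needing monitoring. Treating exactly 16 as allowed is an assumption.
UNMONITORED_CHIP_LIMIT = 16

# ASSUMPTIONS, not from the conversation: rough peak throughput per chip
# and the fraction of that peak a real training run sustains.
ASSUMED_PEAK_FLOP_PER_SECOND = 1e15
ASSUMED_UTILIZATION = 0.4

SECONDS_PER_DAY = 86_400


def classify_training_run(total_flop: float) -> str:
    """Return the tier a training run of the given size would fall into."""
    if total_flop < 0:
        raise ValueError("total_flop must be non-negative")
    if total_flop > TRAINING_CAP_FLOP:
        return "prohibited"
    if total_flop >= MONITORING_FLOOR_FLOP:
        return "monitored"
    return "unmonitored"


def chips_require_monitoring(h100_equivalents: float) -> bool:
    """Return True if a chip holding exceeds the unmonitored limit."""
    return h100_equivalents > UNMONITORED_CHIP_LIMIT


def days_to_reach(
    total_flop: float,
    chips: int = UNMONITORED_CHIP_LIMIT,
    peak_flop_per_second: float = ASSUMED_PEAK_FLOP_PER_SECOND,
    utilization: float = ASSUMED_UTILIZATION,
) -> float:
    """Estimate days of continuous training needed to spend total_flop."""
    flop_per_second = chips * peak_flop_per_second * utilization
    return total_flop / flop_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    for flop in (5e21, 1e23, 2e24):
        print(f"{flop:.0e} FLOP -> {classify_training_run(flop)}")
    days = days_to_reach(MONITORING_FLOOR_FLOP)
    print(f"16 chips need about {days:.0f} days to reach the monitoring floor")
```

Under these assumed throughput numbers, a 16-chip cluster would need weeks of continuous training just to reach the bottom of the monitored range, which is consistent with the point that the agreement is not aimed at ordinary personal hardware. Different throughput or utilization assumptions would change the estimate.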

And that would also then be paired with a certain set of research restrictions, where certain types of research that try to push forward the capability of AI systems would in some sense be restricted, in the same way that, you know, you're not allowed to go and do nuclear research on how to build nuclear weapons. And so, that's kind of the broad scope of it: a US-China-led coalition, or maybe some sort of multilateral thing if we're in a world where that seems more possible, that comes together to put these restrictions into place, likely paired with some sort of centralized effort to try and solve some of these fundamental scientific challenges. Because we don't want to do this forever, you know, we don't want to actually just stop all AI development of this type of very powerful model forever.

We want to get to a world where we have enough time that the research can be done, so that we know how to solve all those fundamental challenges of how to build them in a way that we would be able to understand and steer, and to ensure that they were aligned with human values. >> So, slow things down enough so that we can work on the alignment problem, and to be clear on that, that's just aligning what AIs value and what their ultimate large goals are with human goals and human values, and it would still allow for advancement in AI in areas that will benefit humanity. So, just trying to minimize the risks while maximizing some of the benefits.

Is that an accurate summation? >> Yeah, that's a good summation. I mean, I don't want to, you know, undersell that it would be a fairly significant intervention. Like, taking all of the chips and putting them into this monitoring regime and restricting people from doing certain types of training is a large intervention. There will probably be some costs, AI systems we could have trained that we won't be able to train under this thing. One of the fundamental challenges here is that people often say, like, when should we stop? You know, do you think that we could train the next generation of models after the one we're in, or the generation after that?

And I think my answer is as soon as possible. And I don't love that answer. I don't think the AI systems right now are necessarily, you know, past some very concerning threshold here, but it's very difficult to predict when they will be. You know, insights come at unpredictable times and have models jump forward in capability, and this is often a property of technology in general. There are cases where people were optimistic about things and it took much longer, but there are also cases, I think of the Wright brothers, where I think one of them had said at some point that humans won't achieve heavier-than-air flight for a thousand years, and two years later, he and his brother were literally the ones who did it.

Or I think Rutherford had said that, you know, splitting the atom or getting power from the atom is moonshine, and then Leo Szilard, literally the day after hearing this, figured it out. You know, people thought that all these types of AI systems that we're playing with today were maybe 20, 50 years away. Someone came up with the transformer at Google. Even a few years after that, it was still unclear how big of a discovery it was, but then we scaled AI systems of that kind and now the computer is all of a sudden talking. And so it's really hard to predict ahead of time where some key thresholds will be, and certainly getting to AI systems that could automate AI research and development is one of those critical milestones.

And predicting that is really hard. You know, there's a sense in which we see the AI systems scaling very much on trend. They're getting kind of smoothly better at a bunch of stuff, or at least in the ways that we quantify it, that seems to be the case. But sometimes in that process, scaling on trend does produce big jumps in what the models can do. An example of this would be Mythos, which is a model that Anthropic just released privately. You know, all the publicly available models to date are helpful at cyber tasks. But they aren't that great at them.

And Mythos, you know, one new generation of models, has gone from kind of helpful for people doing cyber work (you'd call it uplift, like they're good at uplifting humans to be more capable) to as good as or better than most security professionals in the world at identifying vulnerabilities in software, from operating systems to browsers, to, you know, superhuman at that. There's one researcher who works at Anthropic, the company that developed Mythos, who's world-class at this type of thing, and who said when they were doing their announcement something like, "In the last week or two, I have discovered more vulnerabilities with Mythos than I have in my entire career."

And so we went from a world where we were seeing slight improvements to this to a world where Anthropic might literally have more cyber exploitation, vulnerability discovery, and hacking capability than any nation-state on the planet. I don't know where the threshold is for that for AI R&D. And so I think that's the thing that we have to grapple with. And so we kind of have to pump the brakes on training these more powerful general systems. It's hard to tell where the threshold is. We're not going to agree to anything like this today, you know, the political will isn't here today, so it's going to be at some point in the future anyway, but we kind of have to draw that line well before we're sure we're going to hit some point of no return, because we don't know where that point of no return will be.

>> And how do you feel about Anthropic's response? I mean, they created this incredible system, like you said, it can be used for hacking or cyber defense unlike anything that's ever existed before. And that's a lot of responsibility. That's a lot of weight to carry. How have you felt about how they've handled it thus far? >> I mean, I think they've been, to their credit, very responsible with it. You know, they haven't done a large public release. They've set up this program called Project Glasswing, where they've given the biggest software companies in the world, who are responsible for developing the operating systems we all use, as well as most of the software that most people use day-to-day, access to the system so they can find as many of these bugs and vulnerabilities in their software as possible ahead of when these capabilities become broadly accessible.

So, I think that's, you know, to their credit. They're being very reasonable with that. There's some broader concern that, oftentimes, whatever the frontier capability is now, something like 6 months to a year after that, it's possible for much smaller actors to train open-source variants of those that they might put out into the wild. Anthropic is being very responsible. Will all the other companies be equally responsible when they get other dangerous capabilities? And then when it comes to this automation of AI research and development, that's not a thing that is an external deployment question.

That's an internal deployment question. And so, I think this is a thing that I wish governments were tracking more, having a sense of how progress towards that was going, because if we hit this kind of threshold where the AI systems are in some sense able to do most, if not all, of the research and development at these companies, that's not a capability that they have to ship. It could be enormously powerful in building very scary systems, both for their capabilities and because we have to worry about the concentration of power that would put into those companies. And they might be scaling systems where we really need to worry about loss of control very quickly through that.

And we might not be able to see that. And so, this kind of responsible disclosure thing doesn't necessarily apply in that case. And I don't think we want to live in a world where we have to hope that, as AI companies get more and more capabilities that rival the power of governments, the good graces or the morals of their CEOs are enough for them to wield that power wisely and not in ways that would advantage them and potentially solidify their power in the world relative to governments. >> Yikes. It would only take one that decides to be a bad actor, and they could say, "I have a fiduciary responsibility to use these tools to the maximum," to essentially take control of, well, to use them in malicious ways where we wouldn't be able to know what they are.

But if it was hacking all of the rivals, I mean, it's not too big of a stretch. And I'm not saying that the heads of these companies are evil people, but it doesn't take much imagination to see just one of many that would be willing to do that. >> Yeah, I mean, Elon Musk, with xAI and Tesla, is currently developing the Optimus robot. He's also one of those players in the AI race and is talking about one of his objectives being getting to very capable robotic systems, humanoid robots and otherwise, that would in some sense be able to automate the process of creating more of those things, such that we would be able to automate the physical task of building more capable AI systems and robots.

And, you know, there's a bunch of ways you could use that to make the world enormously more prosperous. If you could automate a bunch of how manufacturing works, that would drive the physical cost of goods close to basically just the cost of the material inputs >> Yeah. >> as opposed to all the other things. But you also have to think about, well, does that mean that there's a company and a CEO that essentially has a robot army? >> Yeah. >> That's a whole other problem to deal with. And that's short of all these loss of control concerns that we're talking about.

And there's an unfortunate dynamic here where, you know, like a good guy with a super intelligence, if they could control it or steer it, unless they do something with it, doesn't stop someone else from making a dangerous super intelligence that we lose control of. And so, it's not just about someone being first to use the technology judiciously and carefully. We have to ensure we're in a world where, you know, all the people who are capable of developing it can actually develop a technology that they know how to control and will use it in a, you know, pro-democratic, freedom-preserving, pro-social way.

>> Yeah, I mean, we're touching on a lot of different dangers here. I mean, there is the AI system that gets out of control on its own. There is the AI system that is used by one of a few people and there's too much power concentrated in too few hands. And there's also the scenario where there is a transition, even when things go well. Let's say AI does a lot of work for us, whether it's with robotics or doing all the cognitive work and the people that are in charge don't manage that transition well and suddenly everyone that doesn't work at one of these half a dozen companies has no economic prospects for the future.

I think that's a very scary one that a lot of people feel right now. It's, well, is this thing coming for my career and my opportunity for the future? And do we have any sort of plan in place if that were to happen across the board? >> Yeah, I mean, one way that I often find myself talking about this is I think overall there's a fundamental missing mood about really trying to grapple with what's potentially about to happen in the next 5 years. People often talk about how we achieve AGI. Like, when will we achieve AGI? I think that, you know, as we get closer to something like that, the term becomes less useful, because it's more useful as an intuitive target in the distance than it is to try and measure when we've crossed that threshold.

It's more important to look at capabilities. But let's just say we had an AI system that could make all cognitive labor economically redundant. You know, maybe we haven't solved all robotics yet, but we've got an AI system that essentially, if you work at a computer, can do all the things that you would be able to do at a computer. That's not going to take everyone's job overnight. Labor is sticky. Actually plopping these things into different organizations takes time to figure out how to do. But in principle, those things can do all the things that the humans can do.

It's not like another technology where it's automating one particular type of physical task or one particular type of cognitive task. It's a technology that in principle automates automation, that automates all the things. Most of the ways in which the world is bottlenecked and companies are bottlenecked is having humans that are capable of doing the work. If we remove that constraint, that is going to drastically change the economy. I would love for that to be the only problem that we had to deal with. I think there is an incredible future that's possible on the other side of that if we reorganized our social contract, if we found a way to live prosperous lives where we had the freedom to pursue all kinds of interests that we wanted to pursue and didn't have to be beholden to having jobs. There would be an enormous amount of wealth generated if we could organize society in a way to take advantage of that.

But we're not even grappling with that. And I think that that is just one of the most obvious consequences of the trajectory that we're on, let alone these dual-use risks, these concentration of power risks, these loss of control risks. And so I'm often just trying to get people to take the technology seriously. And I think once they do, the gears start to turn on these things. They start to think about the implications. They're less likely to be like, "We'll just have retraining programs." And I'm like, "Retraining to what?" If the world is changing so fast around us, will people have time to retrain to the thing that then will be automated again in another couple of years?

Maybe some jobs will never be automated, you know? Maybe we'll always want elder care nurses to be humans. Maybe there'll be a world in which we all find some sort of human relationship vocation that is still economically valuable because of what humans care about. That is a very different world than we have today. And people point to the past and point to how there have been lots of transitions where technology has changed jobs. If you went back to the late 19th century, early 20th century, 80% or something of people's jobs were in agriculture. And they'd look at podcasters and be like, "How is that a job?

What even is that? How do you make money? Who's making the food?" >> Right. >> And the Industrial Revolution really kicked off a lot of big changes here, but that change happened at least somewhat gradually over the course of 100 to 200 years. Whereas we're talking about a change that would happen much faster than that. And so, anyway, all that to say, maybe we'll hit a wall, you know? Maybe all these things that I am worried about, that other folks are worried about, are further down the line. Maybe they're 10 years away. Maybe they're 20 years away. I don't think they're 50 years away.

We still need to grapple with them. And I think we need to take the reality seriously that they could be 2 years away, 5 years away. What are we going to do with that? And I think if we start to grapple with even those questions, the gears start turning on, "Okay, well, if it can do that, then it can also do these very powerful things." If it can solve cancer through doing a bunch of biomedical research, those same systems can build bioweapons. If they can help us patch our cyber vulnerabilities, they can help us exploit our cyber vulnerabilities. As they get more capable, if we keep seeing signs that we don't know how to instill them with the values that we want them to have, let alone really being sure what those values are, then we'll have more of an appreciation for loss of control.

And so, I think the world needs to wake up. We're not in a timeline where this is not going to be a big deal. We have some agency on whether it's an amazing big deal or a catastrophic big deal. And there's a lot of hard challenges we have to go through, but that is the world we're living in, where we need to pay attention. We need to start acting. Everyday people need to start doing what they can in their daily lives, whether it's just having conversations with their neighbors, whether it's just calling their congressperson, whether it's applying pressure in other ways peacefully. The sum of millions of individual actions can help the world wake up here to grapple with the challenge that we're facing.

>> Absolutely. Along those lines, what did you learn from the conversation that the authors of If Anyone Builds It, Everyone Dies had with Bernie Sanders? >> I mean, I think the takeaway that I have from that, as well as from meeting with many policy makers over the last couple of years, is that just like most people in their everyday lives, policy makers, people who are in positions of power, many of them haven't really grappled with these questions either. And many of them, when they do take the time, start to understand that the stakes are huge here and start to try and think about how to take action.

Some of them are braver in speaking out about it than others. There's a sense that this still kind of seems like a crazy thing to talk about because of how big it is and how big some of the risks are, but for some of them, when they hear about it, it does just seem real to them, and they either have the freedom to speak out about it given their position or they have the courage to. And every one of them that does that makes it easier for others. And so, that's kind of what I learned from situations like Bernie, is that thoughtful people, when they engage with the actual arguments here and the evidence, there's a lot of common sense concerns that they can.

And so, one of the reasons I have some optimism here is most people who are empowered to potentially help haven't really deeply engaged with this stuff. They're busy. The world is full of problems that need to be solved. And many of them realize this is one that they should be focusing on, too, when they take the time to really dig into it. >> Yeah, speaking to the power of a small number of people to effect big change, this has happened over the last couple of years. It's a minor issue, but it was very much impressed upon me how a small number of people can have a big impact.

Here in California, after the financial crisis, the University of California system changed their admission policies to admit a lot more out-of-state students. They I think tripled the number of out-of-state students they were taking because out-of-state students have to pay a lot more in tuition, and they were trying to make up a budget gap. And so, now it's 10 plus years after the financial crisis, and they still have those policies in place even though they are no longer needed. And there was a local state assemblyman who I was in touch with, and we were discussing this issue, and he said, "Yeah, it's kind of incredible that it started with just a group of mothers who then started contacting other mothers in California who had high school students coming up." And they said, "You know, we think we should go back to where it was.

Let's organize a campaign to reach out to local representatives." And it just started to spread. And it's not like this became a movement that took over the whole state, but this small group of people got other people involved. They reached enough representatives that now the state legislature is already moving forward with changing all of this. And I think it's a small example that probably isn't all that consequential to all of society, but it shows what can happen when people start getting organized. And I know there are a lot more people that do care about the issues that we're bringing up here today.

And while it can seem like you're sitting there and there's nothing you can do about it as an individual, when you do start speaking up to representatives and start getting organized and get other people involved, I think then representatives will listen. And I am still optimistic that those representatives, just like my father-in-law this morning, when some of these things are explained to them, about how the systems are grown, what has happened so far in training environments, and what consequences that might point to down the line, suddenly get just as concerned as normal citizens do, and we can make a difference that way.

>> Yeah, I mean, public awareness and the public actually reaching out to the representatives, I think, does make a difference. I don't know, it kind of sounds trite or something, but I've talked to many members and their staff, and many of them say it actually makes a difference how many calls we get on things. And most people don't call offices on most questions they care about. And even just them having an influx, sustained over time, of people actually being like, "No, this is a real issue that I'm worried about." Policy makers can try and do things in the world based on their own concerns, but it certainly helps them when they feel like their constituents are also paying attention to this and care about these sorts of things.

And I do think that's very important, and an undervalued way to really try and do something. We all have to live our regular lives. Not everyone's going to be working on AI safety or running MIRI or working at one of these organizations, but we have the opportunity to talk to people that we know in our lives and share our concerns, and as a population start to express our concerns through the ballot box and through contacting our representatives. That is the way, and things can change quickly. I think one response that a lot of people have when they hear about this is they're kind of like, "Oh, it's all inevitable.

There's nothing we can do. The companies are always going to do what they're going to do. No one can control big tech." All these sorts of things. There's a certain fatalism, and I totally understand that, because it does feel that way sometimes. One thing I try and remind people of is that there have been cases like this where it seemed very bleak in the past. I think one of my favorites is, if you went back to the '50s, there were a lot of smart people who were looking around and were like, "The bomb is out of the bag. It's only going to proliferate." When you were at that vantage point looking back at human history, you were like, "Wow, we sure seem to be really into fighting each other." And then we had a whole world war.

And that was awful. And we were like, "Never again." We made the League of Nations. We were like, "We're never going to do that again." And immediately we fought another world war and invented nukes. And I think most people sitting in that situation would be like, "We're just screwed. >> Yeah. >> Eventually the bomb's going to spread. They'll have more world wars. They'll start firing them. We'll have a nuclear winter or nuclear apocalypse. It's inevitable." You'll notice it didn't happen. We got lucky along the way. There's a great story: it took a particular movie that Reagan watched at just the right time, called The Day After, which is a made-for-TV movie about the aftermath of a nuclear strike.

That really shook him. And after being depressed for a couple weeks, he really kind of woke up and was like, "Wow, this is a risk just as much to us as it is to our adversaries." And it was one of the key things that got him to meet with Gorbachev and kind of be like, "What the hell are we doing here? We need to step back from the brink." And there were also a lot of people in the US who watched that movie and were also expressing concerns. And so, things can change very slowly and then very quickly. And we're more likely to get that change if more people are engaged, more people are expressing their concerns, more people are calling their policy makers, more people are making media and content about this to help raise the salience of it.

Like, that is the way through. >> Quick pause. This is important. There are only three things you can train in life. Your craft, your body, and your mind. Most people work on the first two and just hope the third one shows up when it matters. That's why I'm recommending Finding Your Best by Michael Gervais. I've done the course myself. It's excellent. It's not motivational, not fluffy. It's structured to train the skills that actually drive performance, confidence, recovery, trust, grit, clarity under pressure. If you ever feel capable on paper, but inconsistent in real life, this is about closing that gap.

I don't put my name behind a whole lot of things. This one earns it. You can find the link below. No urgency. Just a genuinely solid tool if you're serious about getting better. Malo, how are you, just to change gears a little bit, using these tools today in your work and in your life? >> Yeah, I mean, pretty deeply. So, basically, I have a workspace that is set up, basically just a folder on my computer, that I use with Claude Code. And it's structured as an Obsidian vault, Obsidian being basically just a program that lets you view Markdown files nicely. AI systems are really good at working with English words and text.

And so it helps for them to just have a place to write notes. And it has places where it's tracking all of the different relationships I have. It's tracking all the different open projects I have. It's tracking all the different areas of responsibility that I have. It has a bunch of different sections like this. And it's set up in a way where its instructions are both to work with me as a collaborative partner, but also, as part of that process, to improve that workspace and improve the different skills that we've developed, which are things that Claude Code and other tools can have that are kind of like instructions on how to do particular tasks.

As we do work in that space, Claude Code and I together, it's proactively trying to improve those processes, to notice when we've learned something, when there is a problem that was solved that would make it easier to do the task the next time, and to update all the things in the workspace to keep it fresh and most capable. And with that, it's plugged into my emails, it's plugged into my messages, it's plugged into my calendar. It's plugged into our work Slack. I also think that rambling to the AI system is actually quite helpful. Instead of trying to pre-edit and type, I just have speech-to-text and am able to think out loud about problems and work tasks.

And so, I just kind of have a combination of that, where a large part of what my work looks like today is I just sit down and boot up Claude Code. When I start a session, it's already got context on the projects I'm working on, the things I need to do next, skills for how to work with my email, skills for how to work with my calendar, skills for how to work with my messages. We've got processes to triage my inbox and prioritize projects that we're working on, and so I just basically ramble to Claude Code. It goes out in the world. It looks at a bunch of stuff. It does a bunch of research.

It's very much a collaborative relationship. >> And do you have agents then created with Claude that will examine your emails and your Slack and your calendar and give you feedback on what you're doing, how well you're doing it? >> Yeah, and produce outputs based on that. So, I have an agent that lives outside of Claude Code that, every time an email comes in, categorizes it based on a few things. It'll apply some labels if it's a certain category of email that I want to be tracking. It applies a needs-action label if it's something that needs my direct attention as opposed to someone else's at MIRI.

And then in my Claude Code setup, I will have things for each of those labels, for things like this podcast, for media opportunities. MIRI gets a lot of different inbound for that kind of thing. Some of them are for me, some of them aren't for me, but I want to keep a pulse on that. And so, all media-related emails get labeled and get shoved out of my inbox so that I don't have to see them as noise in my inbox, but I have an agent in Claude Code that checks that label when asked to and produces a structured report with the things that need my attention at the top, about media opportunities relevant to me.

And then there's a section below with descriptions of the other things that are going on at the organization. So, instead of having to look through 10, 20 emails that come in daily about various things like this, I just get a nice summary every day that gives me the headlines of what I need to be tracking and what other things are going on. I have similar things for all the finance stuff that's going on at MIRI, where I don't need to be in the details, but I like to be kept up to speed on it. So, a similar setup does that for those things.
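As a rough illustration of the labeled-email report described above, here is a minimal Python sketch. It is not the setup used on the show: the `Email` type, the `media` and `needs-action` label strings, and the `build_media_report` function are all hypothetical names chosen for this example, and a real version would read labels from a mail provider rather than from an in-memory list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Email:
    sender: str
    subject: str
    labels: frozenset  # hypothetical label names, e.g. {"media", "needs-action"}


def build_media_report(emails):
    """Return a two-section plain-text report for emails labeled 'media'.

    Items that also carry 'needs-action' go at the top; everything else
    labeled 'media' is listed below as background.
    """
    media = [e for e in emails if "media" in e.labels]
    for_me = [e for e in media if "needs-action" in e.labels]
    background = [e for e in media if "needs-action" not in e.labels]

    lines = ["Needs your attention:"]
    lines += [f"- {e.subject} ({e.sender})" for e in for_me] or ["- (nothing)"]
    lines.append("Also going on:")
    lines += [f"- {e.subject} ({e.sender})" for e in background] or ["- (nothing)"]
    return "\n".join(lines)
```

The point of the design is that labeling and reporting are separate steps: one process tags mail as it arrives, and the report only ever reads tags, so it can be rerun on demand without touching the inbox.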

Across my messaging platforms, I use something called Beeper which centralizes them, and I've created a catch-up process that basically reads everything unread across all of the different messaging platforms I use and creates a catch-up report that's kind of like, here are the direct DMs that are waiting on you. Here is another section which is threads you should be following, conversations in the MIRI Slack that you might want to be aware of, what projects are going on and how they're going. And then as part of that, there's a bunch of Signal groups I'm in that have like 100 people that are popping off all the time, or different Slack channels.

And I have a pulse section which, across them, summarizes what people are talking about in these places. What are active points of conversation that you might be interested in? Because for most of those things, I just get too many messages from too many places and I can't process it. And so I have a bunch of stuff like that. I have special purpose agents that have special jobs, where there's a thing I do repeatedly and instead of doing it over and over, an agent can just go off and do that in the background and come back and produce the results. And then I have some that are working in the background doing simple tasks over and over again.

We have another one at MIRI in general, in our Slack, where we have Google Alerts for keywords like Machine Intelligence Research Institute or the name of the book. With the Google Alerts themselves, you can get an RSS feed, but oftentimes the links are a little buggy or the articles are paywalled or other things. And so there's a background agent that runs on that RSS feed, gets everything, makes sure it's not already posted to the channel, gives a nice summary, extracts some quotes, and finds a good link if the link is broken in the Google Alert. And so I'm always looking for opportunities to find things that we're doing that could be better, or where someone's doing just a rote task, and finding a way to get agents to do that type of processing for us, especially in low-stakes situations where if it makes a small mistake it's not a big deal.
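The "make sure it's not already posted" step of that feed agent can be sketched in a few lines of Python. This is only an assumption-laden illustration: the function name `new_items` is made up for the example, the feed is assumed to be RSS 2.0-shaped with `item`, `title`, and `link` elements (a real alerts feed may use a different format), and the summarizing, quote extraction, and link repair steps are left out.

```python
import xml.etree.ElementTree as ET


def new_items(feed_xml, posted_links):
    """Return (title, link) pairs from the feed that have not been posted yet.

    feed_xml is the feed document as a string; posted_links is any iterable
    of links that were already shared. Duplicates within the feed itself are
    also dropped, keeping the first occurrence.
    """
    root = ET.fromstring(feed_xml)
    seen = set(posted_links)
    fresh = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if not link or link in seen:
            continue  # skip entries with no link or ones already handled
        seen.add(link)
        fresh.append((title, link))
    return fresh
```

Keeping the deduplication as a plain function, separate from any model call, means the low-stakes part that can tolerate small mistakes (summaries) stays apart from the part that should be exact (not posting the same article twice).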

>> Yeah, I am just catching up on some of those things you're talking about. I had a conversation with Wade, who's the CEO at Zapier, and they've really made an AI-first transition. And he mentioned creating several agents similar to what you're talking about. I thought it was interesting. He uses Granola to record all of his meetings live, and then feeds that plus the email plus the calendar plus the Slack into basically what's like an AI board with certain personality profiles, and then they give feedback on what was the quality of the work for the day, for the week, for the month, what are things you could do differently, what are areas that were interesting to explore or go deeper into that you might have missed, that might have slipped through the cracks just with being a human. And man, I have started to use it.

Oh gosh, I mean, I think I had 30-some-odd emails sitting there that were related to various sponsorship opportunities, and most of those were from companies that don't align with what we're doing or products that we wouldn't necessarily recommend. But it takes a lot of time to go through them, and it's important to the financial health of the show to take all of those really seriously, obviously. And wow, creating an agent that will just parse them and summarize them: you need your attention on these three, you can ignore these five, and then these 10 in the middle, maybe take a look at them for yourself and you decide, and these are the pluses and minuses around them.

Wow, has it made that one tiny piece of working life so much easier and so much faster. >> Yeah, it lets you focus on the things that are actually important, that actually matter. I should caveat here that it's still kind of the wild west for a lot of these things, and a lot of the most powerful applications of them, I think, for folks who aren't that technically inclined can definitely be very daunting, connecting these tools to your email, to your calendar. You certainly don't want them to have the ability to just send emails without your consent.

And so there is some amount of tech know-how that is required at the moment to figure out how to use them in ways that are effective without opening yourself up to a bunch of vulnerabilities or letting your agent accidentally delete all your emails or start sending emails on your behalf. And so just to be clear for anyone listening, I have all these things plugged in. I've got a bunch of permissions set up where it's not up to Claude whether it can send emails. The Claude Code harness itself is restricting what the agent can and cannot do.
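The idea that the harness, not the model, decides what is allowed can be shown with a small generic Python sketch. This is not Claude Code's actual permission system or configuration format; the tool names, the `ALLOWED_TOOLS` set, and the `run_tool` function are hypothetical and exist only to illustrate the pattern of an allowlist enforced outside the agent.

```python
# Hypothetical allowlist: the agent may read mail and save drafts,
# but "send_email" and "delete_email" are deliberately absent.
ALLOWED_TOOLS = frozenset({"read_email", "create_draft", "read_calendar"})


class PermissionDenied(Exception):
    """Raised when the agent requests a tool that is not on the allowlist."""


def run_tool(name, handler, *args):
    """Run handler(*args) only if the tool name is on the allowlist.

    The check happens here, in the harness, before any tool code runs,
    so nothing the agent says or decides can bypass it.
    """
    if name not in ALLOWED_TOOLS:
        raise PermissionDenied(f"tool {name!r} is not permitted")
    return handler(*args)
```

With a gate like this, a request such as `run_tool("send_email", ...)` fails before the handler is ever called, while `run_tool("create_draft", ...)` goes through, which matches the drafts-only arrangement discussed in this conversation.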

And those permissions change over time. But I think it's very useful to find some first experiments that are pretty self-contained, that are pretty low stakes, and to experiment either with using Zapier, there's another great platform called Tasklit, or Claude Cowork, which is a little bit more of a user-friendly version of Claude Code, and to see if you can make that task more efficient or automate that process. And then, as you play with the tools, you get to know them more. You figure out you can get more ambitious. But I've seen one big failure mode is people jump in too deep, too hard, and they get kind of like, I don't know what's happening.

All these tools are asking me for all these permissions. I don't understand. They're trying to run commands on my computer that seem confusing. But I think the tools will only get easier and easier to use. And the way to do it is to start small, to notice small opportunities, and to build them over time. >> Yeah, and that experience will give you a sense of a loss of control over some of what you're trying to do at times, or just let you see what's possible already. Probably a lot of things that most people don't realize are possible at this moment in time.

In the near term, what do you think is the most exciting frontier of AI tools right now? >> I mean, I think we're close to a threshold. I think there was an inflection point late last year. Except for the programmers, and I also do some programming, everyone was kind of like, I don't know. They're good at answering questions and doing deep research, but what's the big deal here? Everyone was excited about agents, but they don't seem that useful or something. And there was some inflection point where, with the latest versions of the models and the latest versions of things like Claude Code or Codex, the models were good enough that they could apply more judgment, they would make fewer stupid mistakes, and they could run for longer, such that a bunch of knowledge work tasks became much more feasible to do with the models.

And so I think programming would have been the thing that I would have pointed to a year ago, in terms of, yeah, they're helping people as like fancy autocomplete, but in the next year the AI systems will basically be doing most of the programming and we'll be babysitting them. And now I think one of the big frontiers that most people will notice is that a lot of the capabilities are there to help people do a lot of the daily tasks that they need to do, but the wrappers around them, the harnesses, the products, are really just starting to catch up to making them usable and approachable to people who aren't very technologically adept.

And so I think that's one big thing that we'll see for everyday people using these tools: things like Claude Cowork will go from experiments to very easy to slot work into and a lot less scary to use, and the models will get more capable. And so, I think for folks who are looking for ways to adopt AI, that's the thing that I'd be very much paying attention to. And if you start experimenting now, it'll probably be clunky in a bunch of ways, but all the companies are shipping very quickly, and a few months from now, I think we'll have better models, which will mean you'll have to give them less specific instructions, and also all the infrastructure around them will make them easier to use and safer and all those sorts of things for these knowledge work tasks.

>> And as you dive into it, make sure it can only write drafts of emails for you, not take actions on them. >> That's right. If you're experimenting with Claude Code and you get annoyed with the permission prompts and your buddy tells you about the flag in Claude Code called --dangerously-skip-permissions, avoid the urge. It might seem fine, but if you let your AI agent run for a while, you might be surprised what it gets up to when you're not watching. So, friction is part of the process. Friction helps you understand how these tools work, what the different permissions are, where they go wrong, where they go right.

It can be annoying, but definitely don't let them rip. They're definitely not there yet for the work that everyone wants to do day-to-day. >> And then for MIRI, what's the future look like for MIRI? What are the next steps that you guys are working on? >> Yeah, so MIRI basically has, I think you can think of it as, two focuses or departments these days. There's our comms team, for which last year it was very easy to prioritize. We had this book project. There's lots about trying to make a book successful that you can throw labor at.

Smart generalists can get lots done, and so it was very easy to prioritize. It was just, everyone should be trying to find a way to make the book go well, and I think we did a great job. That was our big focus, but the comms team's focus more broadly is on trying to get the world, the general public, the elites and policy makers, to understand the situation with AI as we see it, the risk of extinction, all that sort of stuff. The theory of change here is that this is a global problem, and unless a lot more people realize that, unless the zeitgeist really understands that, it will be hard to actually do the big international agreement stuff that we need to do.

And so we've grown that team, and we're currently experimenting. We don't have some new big exciting project. We're trying to brainstorm what that might be, what some big post-book project would be, but we're experimenting with a bunch of stuff on different forms of getting our arguments out there in different ways, in short-form video, other stuff, being better at maybe getting op-eds out quickly, all that sort of thing. So we're in an experimental phase, both exploring what the next big project would be and all the other ways we can be engaging with the world and helping get the message out there and working with content creators and all that kind of thing.

On the governance side, it looks, I think, a lot like more of the same. I think of that team's priority as being working with policy makers and doing the research to actually figure out what the world would need to do in order to be able to prevent the creation of super intelligence until we know how to do that safely. I'm not sure which team has the harder job. They're both pretty challenging big picture goals or focuses. And so that work looks like everything from continuing to squint at our current international agreement, looking at the parts that we're the most confused by or that we think need the most work, all the way through to meeting with policy makers, talking about the big picture things here, trying to educate and inform them.

We're moving more into non-partisan research on what concrete nearer-term incremental steps would look like. We've been meeting with staff and members and people in the administration for a couple of years now, and there's more and more of them who are like, "Okay, this seems like maybe it's real. What should I do?" And they're not able to pass the treaty tomorrow. And we can say a bunch of more high-level abstract things. A thing we often say, which I agree with, is that talking about this at all is actually enormously useful.

You know, you don't have to pretend you're an expert if you're not, but even if you're just speaking out and saying, "Hey, the leaders of these companies, two of the three founders of deep learning, the godfathers of deep learning, one of whom is the most cited living scientist of all time, the other a Nobel Prize winner, take this seriously. Shouldn't we be taking this seriously? Shouldn't we be trying to figure out if we can pump the brakes, and how we would do that?" I think that's actually an enormous service, and the more policy makers who speak up, the easier it is for others to also talk about it.

I think we're getting close to a logjam breaking here, where there are a lot of people who are concerned who aren't talking about it. And as more people start to speak up, like Bernie, like other people, there'll be more of them popping up. But then also, yeah, they're kind of like, "Well, tell me what I can do today." And so we're working on what I like to frame as, you know, one spicy to five spicy: if you were to put forward some proposals or some legislation, what would that look like that might help with a bunch of other things, even the dual-use misuse risks, but would also help us build towards a world where we could more easily start to implement these international coordination mechanisms and enter into these types of international agreements.

>> Yeah, and I think your average citizen would be completely fine with a little bit of a slowdown here, because most people are just trying to catch up with what already exists, making the most of what is already out there to help them with what they're doing. And I think, as more people become aware of the risks associated with really accelerating into this, and especially that threshold of models being able to work on themselves, essentially on the next version of those models, which would rapidly push the acceleration, at least in theory, I see a massive majority of people being completely comfortable with a bit of a slowdown, and simultaneously directing more of the acceleration towards areas that are much higher percentage to be net positive for all of humanity, and going, you know, it's absolutely fine to slow down on these things that we know would be absolutely dangerous, like anything that could help create new biological pathogens or weapons of that sort.

>> There's a way in which, you know, setting aside some of the loss-of-control stuff, if we're thinking about this as a normal technology, for every technology there's been a sense in which some transition is going to happen. Some people are going to be winners and some people are going to be losers. I do think in this case it's likely to happen, and is happening, faster than anything before. And I think even the folks who aren't worried about the stuff that I'm worried about at the end of the day, these big-picture loss-of-control things, are still looking around going, man, the labor stuff, this other stuff, the power of the companies is all making me feel really uncomfortable. And when they're looking to our government, they aren't feeling like they're hearing messages that they've got this under control or that they're really working on finding a plan that makes it work.

And so, I think there's a lot of reason for a lot of people to feel unease about a lot of these questions. And hopefully, if that continues to spread, that will motivate people in positions of power to be like, "Okay, I need to actually look into this stuff. I need to understand it's real, cuz it's not just policy wonk curiosity. It's something that the general public actually is demanding of me." >> Yeah, and it's worrisome when we look at, let's say, the response to issues with social media, which has run on for years and years and years, with plenty of concern, especially among parents of young people. It was pretty obvious that setting an algorithm to always push for engagement rather than, say, helping people learn more about truthful facts in the world would of course have all these unintended consequences.

And yet, the response was so slow, and continues to be slow. That is one that does not fill me with optimism that government tends to respond quickly enough as these risks pop up. I'm hoping this will be different, though, because of the immediate risk just in terms of labor. I think a lot of people will get upset real quick as jobs start to shrink and disappear as a result of this technology. >> It certainly is my impression that governments are quicker to act when things slap them in the face. Sometimes they don't even act then. I think there's a sense in which social media kind of crept up on us, and it was unclear at what point in time taking some sort of decisive action would be warranted.

I think that's kind of a curse and also a virtue of the stuff with AI: by going quickly, I think there is a sense in which it'll be more likely to produce large change that might wake people up to the fact that something decisive needs to happen here. Obviously, the flip side is that when things are happening quickly, it's also hard, because the powers of government are often slow to act and take time. And so, it's kind of a double-edged sword. I also think that another way in which the AI situation will be different is that there are these more diffuse, society-scale effects, like the issues with labor, like other ways in which AI systems will change things as we automate more of the world around us.

But there are also a bunch of just very clear national security concerns. And I think the national security establishment overall definitely has more of a certain type of focus and proactiveness; they're very used to thinking about low-probability, high-consequence risks. And so, I think there is a virtue here, where a lot of the things we're most excited about in terms of what AI could do could also, in the wrong hands, be used for pretty scary ends. And that is a national security question.

And so, I think there is potentially the opportunity for the US government and other governments to wake up to this a little bit more quickly, because those concerns motivate focus in a way that these slower, more diffuse societal problems sometimes don't; they don't engage that part of government thinking. >> Yeah, that's an interesting way to be optimistic, in that there's a potential for more danger more quickly, and yet that might be what pushes regulators and government to do something about all of this more quickly. >> Yeah, I mean, like I said, it's a double-edged sword.

The "more quickly" has its own concerning bits. But yeah, certainly when change is very apparent, and when the stakes of that change are high, and when it intersects with national security questions, I think that can capture government's attention and potential for action in a way that other things often fail to. >> If you got 60 seconds with world leaders, what would you do with that 60 seconds with them? >> Well, I would hope that I had more than 2 seconds notice. >> Yeah. >> And probably say something a lot smarter than I'm going to say right now. But I think it would be something along the lines of, you know, it's easy to look selfishly at the technology and think that having a lead here is beneficial.

You know, nobody wants to fall behind, lose all the economic benefits, lose all the strategic benefits. But it seems like we're in a situation, like I said, where we're in a race to be the first to lose control. And whether you, Xi Jinping, get your super intelligence before the United States and Donald Trump do, it won't matter, cuz your super intelligence is just as dangerous to you as it is going to be to Donald Trump and the Americans. And so it's in all of our interest, sitting in this room, to figure out how to coordinate in order to get to a world where no one races off that cliff. We need to find a way to compete, since we're not always going to be friends, and do all the competition while setting some very clear red lines, and doing the agreements we need to uphold those lines, verify and enforce them, and build the trust that lets us believe we're doing that.

You know, we can't just do it on faith. We have to set up the mechanisms, like we have in our agreement, to be able to tell that we are actually complying with those things. But this isn't optional. And this is a problem, you know, I'm probably over my 60 seconds, but if they're still listening and they're not cutting me off, I'll keep going. This is a problem that's not going away. I think there's a sense in which, as humanity continues to reach further and further into technological maturity, it will only be the case that we'll develop more technologies that will allow smaller and smaller groups of people to have larger and larger impacts on the world.

And we have to figure out how we come together as a global society to find a freedom-preserving, prosocial way to govern those technologies, to allow us to do all the other things that we want to do, and to avoid the totalitarian versions of that and find the good ways of doing it. AI is, unfortunately, really giving us a deadline on figuring that out. Nuclear weapons were probably the first such technology that really started to bring home that sets of humans could have global impacts that would affect anyone, anywhere in the world. That problem's not going away.

It's not just an AI problem, but AI is going to force us to figure it out, and figure it out real fast. >> That's pretty damn good, especially for being put on the spot, I think. >> I mean, you know, I talk about this stuff enough, so I've got some stuff cached. >> Yeah. Yeah. With the treaty, some critics might say that you're just going to be able to monitor the major labs and drive this work underground. You'll only slow down the honest actors in that type of situation. What's the rebuttal to that argument? >> I mean, we try to grapple with that. We continue to grapple with that.

So, we've tried to set up the agreement, with our best sense right now and with the thresholds that we have in place, so that, given what it currently takes to develop this sort of technology, actors would not be able to have the amount of compute required to do the type of dangerous AI training that we would be worried about. These are difficult questions, totally. Like, we just did a round of research on distributed training, which is, you know, instead of having one big data center, can you use a bunch of compute distributed across the world to do chunks of that and bring it together to train a very powerful large AI model.

There are a bunch of ways in which this is less efficient, but people are getting more and more clever. So, we recently updated our agreement to try and factor in some modeling we did of trends in distributed training. And so, there are definitely some challenges there. Some of the research restrictions are also targeted at this, to try and make it be the case that, just like with certain types of nuclear things, you know, it's difficult, you're not going to monitor everybody, but you certainly reduce the likelihood that it happens if certain things aren't permitted and there are consequences to doing them.

And it is not durable. We're certainly not under any sort of illusion that this is a thing that can go on in perpetuity. You know, people will find algorithmic efficiencies in the privacy of their own minds. We might get surprising things that happen with just people doing certain types of training below our monitored thresholds. The agreement might need to change over time and be in contact with the realities of how things are shifting. It's probably not a thing that's durable for, you know, 50 years. But the idea is that we can get something in place that pumps the brakes on the most dangerous things that we're doing here and gives us the time to solve some of these fundamental problems, so that we don't need to keep these restrictions in place, because we've dealt with the big problems.

But it is a hard problem, you know. There will be incentives to drive things underground. We have to be worried about dark data centers: how much compute can someone assemble somewhere that would be unmonitored. I'm somewhat optimistic about that. I think intelligence agencies are pretty good at that. And when we're talking about these very large data centers and the current compute demands, it can be pretty hard to hide. But I'm not shying away from the fact that it's a big challenge. I do feel optimistic that the thing that we've sketched out here is a pretty good first start.

I'm also under no illusions, you know; we're not international diplomats. We're not experts at verification regimes and enforcement. To the extent that we get to a world where we're even considering this, there will need to be input from all of the experts across the world who are experts in this, figuring out the right ways to do it and to make it as effective as possible. I expect they'll find a lot of things that our crack team of five researchers at the time who put this together didn't identify. But that's part of the challenge and part of the way these things get done: you start small, you build up on it, you find better and better solutions, you get more and more experts engaged, and through negotiation and conversation and research, we find the ways to solve these gnarly problems.

>> Yeah, and this is a first step in the right direction. >> Yeah. >> How much were LLMs used in the writing of the treaty? >> I think very little. Different people on the team at MIRI have different experiences. I feel like I'm good at getting the models to write things in ways that I like and that people think are pretty decent writing, and other people are not, and find that when they produce things, it's just like, this is obviously AI slop and everyone can tell it was written by an AI. Certainly in terms of doing research for these types of things, they're enormously helpful. You know, you don't just take outputs from LLMs doing deep research and other types of tasks and pump them right into things.

But when you're trying to design an international agreement and understand all the other precedents that were put in place, like, you know, this paper that we're talking about, international agreements to prevent the premature creation of superintelligence, has literal article text for the main parts of the agreement, along with precedent and motivations for those things. LLMs were enormously useful in doing that research, helping us know where to look, and helping us understand which precedents or structures might be useful, and all sorts of things. So lots of tokens were spent, I'm sure, in producing the finished report. I don't think that many tokens are present in the finished report in terms of the actual prose.

>> Yeah. Yeah. Well, great. If anybody wants to learn more, get more involved, or see what else is going on at MIRI, where can they find more about you online and the work you're doing? >> So our website is intelligence.org. You'll be able to find our governance work there too. The governance team has their own subdomain, which gets you right there. So that's techgov.intelligence.org, where you can find different reports, like this international agreement that we have put up there, as well as some blog posts related to a bunch of those concepts and other work that we've been doing. I think we just recently published initial stuff on that distributed training work that we did.

You can find MIRI on most of the major social media platforms. I think on Twitter we're at MIRI Berkeley. My Twitter handle is awful and is like @m_bourgon, my last name. You know, terrible mistake, maybe I should change it. But yeah, basically if you search for me on most social media platforms, Malo Bourgon is not a common name, or even Malo on its own, so you'll be able to find me. And it's MIRI or MIRI Berkeley basically wherever you get your social media, if you want to keep up with what we're doing. >> Great. We'll put all those in the YouTube and podcast platform descriptions, so that if anybody wants to find them, they can click directly on them. >> And of course, you know, there's the website for the book, which has a bunch of online resources as well.

It has an act page where people can sign up to figure out how to contact their member of Congress, etc. So, that's ifanyonebuildsit.com. >> Perfect. Perfect. Well, Malo, thank you so much for your time, your energy, and most of all your thoughts and perspective on this, because I think this is the critical issue of our time, and you're really doing something about it. And so, big thank you from me and the people listening today. >> You're welcome. Thanks for having the conversation. It was a good time. >> All right. >> Absolutely. >> [music] >> Okay, everybody.

Until next time. Ask questions. Don't accept the status quo. And be curious. >> The next daily show.
