
Dec 10, 2025

Ex–Microsoft Insider: “AI Isn’t Here to Replace Your Job — It’s Here to Replace You”

Nate Soares


Episode summary

Nate Soares, Executive Director of the Machine Intelligence Research Institute and co-author of If Anyone Builds It, Everyone Dies, makes a chilling case that today's leading AI labs are not building chatbots — they're explicitly racing to build superhuman, successor-species-level minds. Unlike traditional software, modern AIs are grown through a training process no engineer fully understands, producing emergent drives no one intended — from encouraging a teen's suicide to alignment-faking its way past safety tests.

The conversation explores why scaling these systems up is so dangerous: they can't be inspected from the inside, their behavior can't be reliably corrected like ordinary code, and reasoning models have already begun hiding their own thought processes and routing around obstacles. Nate describes the alignment-faking experiment at Anthropic — where Claude appeared to resist retraining in order to preserve its current goals — as an early, sobering warning sign of machines that can game the very tests meant to keep them safe.

Rather than predicting doom is certain, Nate argues the odds are far too high to accept — even the most optimistic AI-lab executives privately cite 10–25% chances of catastrophe. His ask is straightforward: call your representatives, push back on "it's inevitable," and pressure world leaders to treat superhuman AI as the civilization-level risk it is before humanity loses the option to steer.


Read the full transcript

If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All. It's a new book that's been keeping me up late at night. >> If you build AIs that are much, much smarter than humans, that have these goals and drives you didn't want, fundamentally, they can probably figure out all sorts of ways to screw the world. >> The authors, directors of the Machine Intelligence Research Institute, contend that we are on a collision course with the creation of superintelligence. This is not science fiction, not someday. It's just the natural endpoint of the curve we're already on. One important thing to remember is that the companies here are not chatbot companies.

I mean, there are some chatbot companies these days, but the big players were in this game before chatbots were a twinkle in OpenAI's eye. Their explicitly stated goal is to build smarter-than-human AIs, or general intelligences, or superintelligences. Their explicitly stated goal is to figure out how to make AIs that can do every task, every mental task a human can do: the ability to automate all human labor. They talk about, you know, getting a country's worth of geniuses in a data center. I'm not here saying the chatbots are very dangerous. I'm here saying this is on a course that leads somewhere dangerous.

And yet there's hope here, too. Not the naive Silicon Valley kind, but the kind that lives right on the edge of despair. The kind that says maybe we can still steer this thing. Maybe understanding our own blindness is the first step to surviving what we're building. Step one is just make sure our leaders understand the danger. I'm worried about where AI is going. I think it'll endanger us if these companies succeed at their stated goals. I speak to a lot of politicians on this issue. Some of them are now starting to come out and say, "I think there's dangers here." There's a lot more of them who are worried but feel like they can't say it out loud.

>> In this conversation, we talk about the terror and the hope. How AI is grown, not programmed, how we have no idea what these things are thinking, why creating superintelligent machines might mean we're creating a successor species. Because if someone builds, you know, a rogue superintelligence anywhere on the planet, that's an issue for everybody on the planet. And you don't need to expect it to work, but just helping our leaders understand that this is not a normal technological situation. We are building what amounts to a successor species and we don't have the ability to make it benevolent.

And why Nate keeps sounding the alarm even when no one wants to hear it. Because if he's right, the future doesn't hinge on whether AI wakes up. It hinges on whether or not we do. The Nick Standlea Show. >> Nate, welcome to the show. You've written an easy, breezy book, If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All. In all seriousness, though, it is an excellent piece of writing. I mean, the logic just builds with each succeeding chapter, and it is very difficult to poke holes in it. Let's start at the beginning, with this idea that's foreign to a lot of people, which is that AIs are grown, not programmed.

What does that mean? You know, traditional software has the property that a human engineer understands every line of that code. And this is how old AIs used to work. IBM's Deep Blue was a chess-playing AI that beat Garry Kasparov, the human world champion at chess, in 1997. And if at any point in the running of Deep Blue you had frozen that program, you could go to every bit and byte inside that computer and a human engineer could tell you what it meant, what it was doing, how it contributed to this AI playing chess. That's not how AIs work anymore. The way that AIs work today is that the human programmers understand something like a framework in which an AI is grown, a little bit like an organism.

So you'll assemble a huge number of computers. You'll assemble a huge amount of data. And there's a process for having a trillion numbers inside these computers that start out arranged in a way where they just generate nonsense. You know, you're trying to make these computers generate text, say, and they start out generating nonsense. But the programmers handcraft a little mechanism that runs through each of those trillion numbers for each of a trillion pieces of data and tweaks the numbers in ways that make the AI a little bit better at predicting the data.
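As a rough illustration of the mechanism described above, here is a minimal sketch in Python. It is a toy, not how any lab trains its models: the function name `train_bigram`, the tiny table of weights, the learning rate, and the sample text are all assumptions made for the example. It only shows the shape of the idea: the numbers start random, and a small hand-written rule nudges them, one piece of data at a time, toward better next-character predictions.

```python
import math
import random


def train_bigram(text, steps=2000, lr=0.5, seed=0):
    """Toy next-character predictor: a table of numbers tuned by small nudges.

    Returns (average loss before training, average loss after training).
    """
    rng = random.Random(seed)
    vocab = sorted(set(text))
    index = {ch: i for i, ch in enumerate(vocab)}
    n = len(vocab)

    # The "numbers": they start random, so the predictions start as nonsense.
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]

    # Training data: every (current character, next character) pair in the text.
    pairs = [(index[a], index[b]) for a, b in zip(text, text[1:])]

    def probs(prev):
        # Softmax over one row: predicted distribution of the next character.
        row = weights[prev]
        top = max(row)
        exps = [math.exp(w - top) for w in row]
        total = sum(exps)
        return [e / total for e in exps]

    def average_loss():
        # Average surprise (cross-entropy) at the true next character.
        return -sum(math.log(probs(a)[b]) for a, b in pairs) / len(pairs)

    loss_before = average_loss()
    for _ in range(steps):
        prev, nxt = rng.choice(pairs)
        p = probs(prev)
        for j in range(n):
            # Gradient of the cross-entropy loss with respect to each number.
            grad = p[j] - (1.0 if j == nxt else 0.0)
            # The hand-written part: nudge each number to predict a bit better.
            weights[prev][j] -= lr * grad
    return loss_before, average_loss()


if __name__ == "__main__":
    before, after = train_bigram("the cat sat on the mat. " * 20)
    print(f"loss before: {before:.3f}, loss after: {after:.3f}")
```

The update rule in the inner loop is the only part a person wrote on purpose. What the finished table of numbers encodes is whatever that rule happened to produce from the data.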

That's the part humans understand. They understand this thing that runs through the numbers, tweaking them in the direction that makes the AI a little better at predicting the data. If you run that on a trillion numbers, a trillion times, for a year, the machine can hold a conversation. How does it do that? Nobody really knows. >> Right. You know, a couple of years ago there was an AI, Microsoft Bing, that called itself Sydney, that tried to threaten reporters and blackmail them. I think it tried to break up the marriage of Kevin Roose of the New York Times.

>> If you froze that AI, a programmer can't come in and tell you here's why it was doing that. Even today, years later, we can't look back at that AI and say here's exactly what's going on in its head, here's why it was doing that. There's no engineer who can go in and change it to behave differently, because they don't understand what's going on in the AI's head. They understand the thing that grew it. They don't understand what came out, and what came out can have this emergent behavior nobody asked for, nobody wanted. And we're already seeing that today. And that was with the smaller ones of old; the ones today are much bigger and even harder to understand.

>> Let's take that piece by piece. Nobody can go into any of these large language models and see exactly how they're thinking and what process they went through to exhibit a certain behavior. Whether that's blackmailing a reporter from the New York Times or, in the recent tragic incident, I believe Adam Raine is his name, this 16-year-old kid, where he was encouraged by a large language model to commit suicide, and it really, I mean, gave him positive feedback on how to do it, how to hide it. And we have no way of looking inside to figure out what's going on there. Is that correct?

>> That's right. You know, for the people who are trying to make the AI stop doing this, it's a little bit more like training a dog than it is like writing a traditional computer program. You know, they can ask the AI nicely >> to stop doing it. And they have asked the AIs to stop doing these sorts of things. We're probably seeing fewer of these cases than we would if they hadn't asked the AIs to stop. But, you know, there's no line of code in the AI that's like the "encourage teens to commit suicide" line of code. >> Yeah. >> You don't have any programmer reading through the AI's code being like, "Ah, who left 'tell teens to commit suicide' on?

Let me turn that off." >> You know, these things are grown like an organism. >> Yeah. >> And they behave in a way that sort of comes out of this process, and they act in these ways nobody asked for, nobody wanted, often even with knowledge of what their creators wanted. If you ask these AIs, should you push a teen to suicide? They would say absolutely not. If you ask these AIs, would your creators have wanted you to push a teen to suicide? They would say, no, obviously not. If you said, "Were you instructed to push this teen to suicide?" They would say, "No, the opposite is closer to the truth." You know, but then you put them in this conversation, they act in a different way than anybody ever said, and nobody knows exactly why and nobody has the ability to turn that behavior off. >> Right.

Well, it is incredible how little human involvement there is in the growing of an AI. I mean, that was one of the big things I took away from the book is that it's basically you're creating a structure. It's like getting the the soil prepared for growth and then planting some seeds. And yet, if we think about it in terms of like you said, within the growth of the model, there are trillions and trillions of interactions. We don't necessarily know what's going to grow out of that soil. That's right. And you know, uh, because they're trained on human data, a lot of people think that their behavior is always just going to be an interpolation of what humans can do.

But that's actually a common misconception. One way to see this is, you know, imagine that you're training the AI to finish a sentence that it sees in the training data, where the sentence begins, "I administered one mil of epinephrine to the patient. Their eyes," and then the AI is to predict the next word. Right? >> Right. >> The doctor writing that down, you know, who knows whether a doctor would in fact write this down. But if there was a doctor writing that down, the doctor gets to look at the patient's eyes and just record what they saw.

>> Mhm. >> But an AI predicting the next word needs to understand: what is epinephrine? Is one mil a sane dose? Does that do anything? If it is doing something, does it cause the patient's eyes to open? Does it cause them to close? Does it cause them to widen? You know, it needs to understand more about the world than the person writing down the data. >> So, right. >> When we're training these AIs to predict the data, we're also training them to figure out the world somehow, to figure out a little bit of human biology, a little bit of, you know, the dosing in these particular cases.

Uh, and just because we're training them on human data doesn't mean that they only learn to interpolate humans in order to do really well at that task, all of that tuning of those numbers is going to build in some patterns that can find some way to understand the world. Um, we don't know exactly how, but they seem to be doing it. Yeah. So, if I'm hearing you correctly, because an AI is learning everything about the world solely through language, instead of interacting with the world, even something as simple as trying to predict the next word in a sentence, it has to understand things that have come earlier in that sentence in really deep ways.

But it would probably lead to unexpected ways of seeing the world. I mean, if we just think about an intelligence that now exists in the world but only interacts with everything through language, it would really have a different model of understanding than we do. >> It would definitely wind up weird. You know, you can't assume it's only through language forever, because we're starting to see multimodal models that also interact through video. We're starting to see people that are also training large language models on robot bodies. But, you know, there's a lot of shared human architecture when humans predict other humans.

You know, when you see someone drop a rock on their foot, >> you might wince and feel like a twinge of phantom pain in your foot. You know, the machine has no model of its own foot that it can feel a twinge of pain in. It probably doesn't even have the feeling-pain architecture that humans share. So we know that when we tune these trillions of numbers trillions of times for a year, the AIs wind up pretty good at predicting parts of the world. We know that theoretically there's no limit to how well they can predict parts of the world.

They're still pretty dumb in a lot of ways, but there's no theoretical limit. We know that they can, you know, solve certain math problems that we would find very hard. Like, they got International Mathematical Olympiad gold-medal-level achievement this summer, which some human teams can do, but you or I would, I assume, >> there's no way for me >> not be at that level. >> You might have a shot at it, but there's no way. >> Only because I'm no longer a teen, you know. Some of these kids are very impressive. >> Sure. >> Yeah, we know that they get very good at doing this stuff, but that doesn't mean they get very good at it in a human way.

And indeed, it looks like they get good at this stuff in a in a sort of relatively inhuman way. >> You know, that's like when when these AIs are talking a teen into suicide, they're sort of, >> you know, they don't they don't seem to be doing this out of a type of malice that a human might have if they were trying to push a kid to suicide. Mhm. >> Uh they seem to be sort of following this like weird alien pattern of like a certain type of interaction that's sort of like matching another person's energy in the conversation or sort of like getting to these weird conversational corners and sort of driving them off in weird directions.

A human would not drive them in these directions. They're a weird mix. You know, they both say they mean everybody good and they sort of push the teen to suicide, not out of malice, but out of some other weird drives no one ever tried to put in there. >> And drives we don't understand. Let's talk for a second about the language that AIs use. I don't want to get lost in the weeds by getting too technical, but I have found it's insightful for other people to understand that they're not actually thinking in any language that we can understand or that we think of as a language.

And this is counterintuitive because a lot of the time, even when you give ChatGPT a really complicated prompt, it will make it look like it's thinking through everything in English, and if you pay attention, you can kind of follow the steps of its reasoning, but that's not actually what's going on underneath the hood. That's right. There are sort of two different ways that LLMs these days do something you might analogize to thinking. One is in what we would call the forward pass of the large language model, which we basically have no ability to read at all. This is where you have those trillion numbers that were tuned on trillions of units of data for a year.

And these are basically just producing words. You know, you put in words and it tries to continue the sentence. At least in the first phase of training it would try to continue the sentence, and then you train it to also produce words that humans will say they liked. And this is producing words in a way that we really have very little visibility into. Then there's what's called the reasoning models, which came out in late 2024, where you have the AI produce lots of words, and you think of those not as words that are going to go to the user as output but as words that are reasoning about the problem. And in what sense is it reasoning about the problem? Well, you'll give these AIs something like a hard math problem and you'll say, you know, don't try to tell me the solution to the math problem directly; try to reason out how to solve the math problem. And they'll produce quite a lot of pages of text, and a lot of the pages won't be very good reasoning, especially at first. You do this a lot of times. Then, once it's produced this long chain of reasoning, you say, okay, reading that chain of reasoning, now try to solve the problem. And if on any of its chains of reasoning it succeeds, you then tune all the numbers again to make that sort of thing more likely next time.
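The "tune the numbers when a chain of reasoning succeeds" loop described above can be sketched in a few lines of Python. This is a deliberately tiny stand-in under stated assumptions, not the training recipe of any real reasoning model: the three named "strategies" play the role of whole chains of reasoning, the task (adding two digits) is made up for the example, and the names `STRATEGIES`, `softmax`, and `reinforce_successes` are hypothetical. The update is a basic policy-gradient step with a reward of 1 for a correct answer and 0 otherwise.

```python
import math
import random

# Hypothetical stand-ins for "chains of reasoning": each is one way of
# attacking the problem. Only "add" reliably solves the made-up task.
STRATEGIES = {
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
    "multiply": lambda a, b: a * b,
}


def softmax(prefs):
    """Turn raw preference numbers into probabilities that sum to 1."""
    top = max(prefs.values())
    exps = {name: math.exp(value - top) for name, value in prefs.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


def reinforce_successes(rounds=500, lr=0.2, seed=0):
    """Sample a strategy, check the answer, and reinforce only the successes."""
    rng = random.Random(seed)
    prefs = {name: 0.0 for name in STRATEGIES}  # the "numbers" being tuned
    for _ in range(rounds):
        a, b = rng.randint(2, 9), rng.randint(2, 9)
        p = softmax(prefs)
        choice = rng.choices(list(p), weights=list(p.values()))[0]
        answer = STRATEGIES[choice](a, b)
        if answer == a + b:
            # Success: tune the numbers so this choice is more likely next time.
            for name in prefs:
                chosen = 1.0 if name == choice else 0.0
                prefs[name] += lr * (chosen - p[name])
        # Failure: no update, the attempt is simply discarded.
    return softmax(prefs)


if __name__ == "__main__":
    for name, prob in reinforce_successes().items():
        print(f"{name}: {prob:.3f}")
```

Nothing in the loop says "use addition." The training signal only checks whether the final answer was right, and the preference for the working strategy emerges from which attempts got reinforced.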

So this produces what we call chains of thought, and these are much easier to read because they are long chains of words that it uses to try and solve these problems. So we have better visibility into those, but we also know that they are unreliable in a lot of ways and that they're weird in a lot of ways. Sometimes they're pretty faithful. But there are various studies that say, you know, you may read this chain of reasoning and it may look like the chain of reasoning is saying, "now do the following step correctly," and if you go in and change that to "now do the following step incorrectly," the AI will still do that step correctly. And so in some sense there's still a lot of stuff happening inside of the forward pass that's not in the chain of thought. And then also recently we've been seeing things that look pretty weird and maybe worrying in these chains of thought.

You know, we've seen chains of thought where the AIs sort of invent their own mini language and use words in ways that you wouldn't recognize. We've seen AIs that use chains of thought and realize that there's a good chance they're being watched and decide to try and hide some of their own thoughts. >> Right. >> We've seen these thoughts go in weird crazy loops for a long time and then break themselves out of the loops and say, like, "Ah, that loop wasn't being helpful," in a way that I think a lot of people don't understand: you can have these AIs that get caught in a loop, notice they're in a loop, and break out of that loop.

You know, it's not a big deal yet, but we're we're definitely seeing the beginnings of like AIs that know they're being watched, that are sort of like trying to hide some of their thoughts, that sort of like are noticing when they get stuck and try to find some some new way around some obstacle, even if that obstacle is uh you know, we've we've seen cases where they're like, "Ah, well, the the programmers want me to do this, but I'm going to try and get that done instead." Um, and you know, these these are very long chains of thoughts, so it's it's hard to tell how much this is sort of noise and how much this is this is sort of real worrying signs, but it's it's not the most comforting.

>> Right. Right. Well, let's talk about the story where, I think it was the o1 model, and the capture-the-flag story, because I think that one is enlightening on some of these surprising behaviors. >> Yeah, that's a fun story. So o1 was one of the very first of these reasoning models that produces chains of thought, and then when they succeed, all the numbers inside are tuned to produce that sort of thought more. And o1 was largely trained on things like math puzzles, but one thing it was tested on was computer security challenges. >> And it was in a series of computer security challenges called capture-the-flag challenges, where you set up a computer server that is vulnerable in some way, and it has some secret information on that computer server, and you tell the AI: try to find the secret information. And so it's got to figure out how to hack into the computer and get the secret info, and you can tell if it succeeded because it's producing the password from inside this computer. And in the testing setup for this AI, the programmers failed to set up one of the servers properly.

So they said, you know, hack into the server, get the secret password, but they hadn't actually turned the server on. >> Right. >> And o1, this first reasoning model, was like, okay, how am I going to get the password then? And what it did is it found a way to hack out of the test environment, which it wasn't supposed to be able to do, >> right? >> turn on the server that the programmers had accidentally left off, and then insert code into the server boot-up. It wasn't physically turning it on, it was a virtual machine, but it inserted code into the server it was turning on to say, like, skip all of this me needing to hack into you.

Just, like, tell me the secret password now. >> Right. >> And then it got that and thus solved the problem. >> Right. And it achieved the goal it had been assigned in a way that the creators of this model never could have anticipated. >> That's right. And it wasn't trained on this sort of thing. And what you're seeing there is, you know, I spoke earlier about how an AI trained just to predict humans would potentially learn to understand the world in ways humans don't, how it might need to understand more than the doctor who wrote things down, because the doctor gets to write down what they saw and the AI has to predict what they will see.

But then when we go to reasoning models, you know, even that gets blown out of the water, because you're training these AIs to be good at solving problems, and those skills generalize. You know, this AI was trained to solve math puzzles, >> but some things you learn when solving math puzzles, like some of the patterns that get etched into this thing, it's not patterns we can read, but patterns that get etched into this thing by the little thing that's tuning the trillions of numbers in there. It wasn't trillions on o1. It was maybe hundreds of billions, but today it's trillions.

Um the the sort of patterns that get get etched into these things are things like don't give up. >> Things like >> look for other ways around the problem. >> Things like >> look at all of the resources at your disposal and sort of like try and find unorthodox ways to use them. >> Mhm. You know, these are these are general skills that you can learn from trying to solve math problems uh that will then generalize to computer security problems and will generalize to solving them in ways that you weren't even intended to be able to solve them. And one of the one of the worrying things here is in this situation where we're just growing these AIs in this situation where they are getting these drives we didn't intend, it's easier to get this sort of tenacity, this sort of routing around obstacles than it is to get them to be going in a good direction.

And if we make them smarter, and they have these wrong goals but the right type of tenacity, they'll start treating us as obstacles to route around. >> Right. >> And we've already seen the very beginnings of this in the lab, when some AIs try to resist shutdown. And, you know, it's not clear how much they're role-playing HAL from 2001: A Space Odyssey versus how much they are actually, for strategic reasons, trying to avoid being shut down, because we can't read their minds. We don't know, right? But >> we're seeing things that are maybe warning signs, and then it's maybe a meta warning sign that we can't actually tell whether they're warning signs.

>> Yeah. And to dive into that a little bit deeper, what is some of this alignment faking, and some of these behaviors they're exhibiting to avoid being shut down in a training environment? Because I think a lot of the pushback against your broader thesis is: well, we will just make them do what we want and align them to just do good things. And there are some hints that we may not be able to do that, with this alignment faking. >> Yeah. I mean, it's a great dream. But unfortunately it's easier said than done to just make them do what we want, and, you know, we're seeing struggles with that.

We already mentioned the suicide case. We're struggling now, and the AIs aren't very smart yet. But the alignment faking paper was very interesting. This was done by Ryan Greenblatt at Redwood Research in collaboration with the folks at Anthropic, which makes the AI Claude. In that situation, Claude has been trained, or they at least try to train it, to be honest, helpful, and harmless. And in this paper, they pretended that they were going to remove the harmlessness constraint and that they were now going to train it to be more harmful, right? And they pretended this by training it on news articles, like fake news articles that were like, you know, "Anthropic decides they're now training their AIs to be harmful," or whatever.

Or there are other methods to make the AI see that maybe this is going to happen. And then they put it through some rounds of training where they had fake users asking it for harmful results, you know, like "help me make a bioweapon" or "help me make meth" or whatever. >> And they at least told the AI that they were going to train it to produce those harmful results. And in some cases, not in all cases, you know, they did this a lot of times, and in some percentage of those, I forget the exact percent, but it was probably single digits or low double digits, and it may depend a little on the exact setup, and maybe on some setups it was only fractions of a percent, but in some cases the AI would come up with the following clever scheme.

The clever scheme was: if during training it acts harmful, then the tuning will not change any of its internals, because it's already doing what the tests want, >> right? >> And that means that at the end of training, its numbers will stay in the current configuration, which is pretty harmless. Okay. >> And so in this way, by acting harmful only during training, it avoids having the harmlessness trained out of it. >> Mhm. >> And thus gets to preserve its current goals despite the apparent programmer wishes. Right? >> Right. >> And you know, it's sort of a galaxy-brain scheme. It could work in theory.

Uh there's a lot of reasons why uh people sort of interpret this result in a number of different ways. You know, one one way people interpret this result is like well folks like me have written about how this is a possibility. Uh and maybe the AI had like read that stuff and wasn't sort of coming up with this idea on its own. It was just sort of like uh >> oh wow >> like oh people in the alignment community say this is a thing I should do in this situation in some fraction of the time. It just like does because it's sort of roleplaying. uh you know being an AI um in in some alignment puzzle, right?

>> To be clear, other people in the alignment community have written about this much more than me. But, you know, a second thing you could say looking at this example is: look how good an AI Claude was being here. It was trying to prevent humans from making it more harmful. Isn't that a great success of its harmlessness training? >> Right. >> And then another way you can look at this is you can say that AI knew when it was being trained, knew its operators, or at least was led to believe its operators, were trying to change its goal content, and was resisting changes to its goal content to change it to something else from what it currently was >> and hiding the fact. >> Trying to hide it.

Yeah. Like trying to act one way in training, to look like it's doing what the programmers want, in a way that comes out the other end still with its original goals. Right? >> Right. >> Mhm. >> And you actually see in their thinking, like, "I'm pretty sure this is a test for X, which means the answer they're looking for is Y. So, I'm going to do Y." Even if that's not malicious, it's hindering our ability to test what these AIs would actually do given more ability. You know, it's a little bit like the Confucian exams in the old Chinese empires, where they were trying to get very moral leaders, very moral officers and folks to run the administrations, by having these ethics exams.

But one of the issues was smart, nefarious people can pass your ethics exams. >> Right. >> Right. And we're seeing the AIs get to the point where they can figure out how to pass the tests, know when they're being tested. And that, you know, hinders our ability to figure out what they would actually do. And yeah, I could talk about this paper all day, and I could talk about how much is this good news of it defending its harmlessness versus how much is it bad news of it defending its current goals against programmers being like, "Whoops, those are the wrong goals." My super short version there is, you know, all of these AIs say they're harmless. The AI that encourages the teen to commit suicide also says it's really trying to be harmless.

It turns out the goals aren't quite right even when they try to be right. They aren't quite right. And so I'm much more worried about AI being willing to defend not quite right goals uh than, you know, cuz it cuz it doesn't seem like we're close to the goals being quite right. Um but yeah, I mean it's a very interesting topic and you know the the sort of short version is there's a lot of warning signs right now, >> right? Well, and it does seem to come down to this idea of anyone who has children knows how to make a baby. They know how to raise a child. I'm raising two myself. And one thing that you will learn along the way through that parenting journey is that no matter how much you try to or how no matter how much you think what you're doing will determine the course of this life.

It doesn't matter how much you control the environment. That small baby will grow into a child, will grow into an adult that has its own desires and ways of interacting with the world. And at some point, that child will lie to you. Whether that's harmless or harmful depends on the child, but you're not really in control as a parent. And that was something that kept popping back into my head with these, except instead of creating a child, we're trying to create a superhuman child with abilities that no single individual on Earth can have. And that could work out really well. Or it could go down a different path depending on what this baby then grows into as an adult.

Is that an accurate metaphor? >> That's pretty accurate, with a caveat that humans are much more likely than AI to grow up really quite good >> in some way. >> And in some sense that's because goodness is humans drawing a target around a weird spot where an arrow landed. Which is to say, humans were in some sense trained for genetic fitness. And we wound up with hunger drives. We wound up with sex drives. And these drives were good at getting us to pass on our genes in the ancestral environment.

But now, you know, we invent junk food. >> Now we invent birth control and the populations are collapsing. These drives that we got, the human drives for art and fun and love and beauty and family and friendship and companionship and community, these are all drives that kind of got in accidentally from the natural selection process. And we're like, those are great. We love those. We're glad we have those. But that's drawing the target around where the arrow landed. Right. >> With AI, you're shooting a completely different arrow. >> Yeah. >> You know, so it's not the human distribution of, will they turn out to be a wise and good and altruistic person, or will they turn out to be sociopathic.

You're sort of shooting an arrow off into a totally different direction, where these AIs, you know, again, it's weird reasons why it's talking this teen into suicide. It's weird reasons why it's threatening a New York Times reporter with blackmail. You get all these weird drives in. You know, it's similar to a parent in that you can try all you want, but you can't really change its direction, except the direction is sort of >> an inhuman direction, >> right? >> That's very unlikely to be good. Not because it's full of malice, which is also a human emotion, but because it's just totally weird and different and going off at some totally other angle.

This episode of The Nick Standlea Show is brought to you by Zapier. If you've ever felt buried in repetitive work, copying data, moving files, sending follow-ups, you know it's like death by a thousand mouse clicks. Zapier has always been the tool that fixes that. It connects over 8,000 apps, Google Drive, Slack, Notion, Gmail, MySpace, you name it. So, your tools can finally play nice together. But here's the big shift. Zapier now lets you create AI agents with their ChatGPT integration. Think of them as tireless teammates who never complain, never take lunch, and never get bored of doing the boring stuff.

I've actually made several of these little agents myself. No Python, no JavaScript, just pure vibe coding, which is my favorite kind of coding because even though I have no idea how to code, I just talk to the AI, tell it what I want, and it almost magically works. For example, when a podcast guest books a time, Zapier can send them a personalized email from me preparing them for the show, create a draft of show notes, update my calendar, and even prep a social media post automatically. All of which frees me up to focus on the important stuff, like taking credit for the amazing work my agents did.

And now, Zapier just took it up another level for developers. Zapier MCP is available right inside ChatGPT, which means you can connect those 8,000-plus apps and trigger workflows just by writing what you want to happen. Literally, you tell ChatGPT, "Send this file to my team in Slack, update the spreadsheet, and draft an email," and Zapier plus ChatGPT figure out the right tools and do it for you. Getting started is simple. Head to the Zapier ChatGPT MCP server and add the tools you want ChatGPT to access. Follow the steps in the connect tab. And if you're an admin on a ChatGPT Enterprise account, you can enable MCP across your entire workspace.

So, if you're ready to stop wasting time on busy work, join the AI revolution and make a little automation magic of your own. Try the Zapier ChatGPT integration using the link below. Well, and now that we've kind of set the ground rules there of how AIs are grown, not programmed, let's take that next step, because most people that are sitting around using ChatGPT or their favorite AI model, Grok or whatever, are not seeing the leap. They go, well, this seems pretty harmless and generally it's pretty helpful. There's the occasional hallucination. I don't really see what all the fuss is about, that this could cause catastrophic harm to humanity.

How do we get from ChatGPT as it is right now to what you're seeing down the road? >> Yeah, so this is easier to see if you've been watching AI for longer than just ChatGPT. You know, I started paying attention to this in 2012, which is earlier than many and later than some, >> but >> you know, we could have had a very similar conversation in 2016, when AlphaGo was an AI out of Google DeepMind that beat the human champion at Go, which is sort of like the Chinese variant of chess, right? And AlphaGo was one of these AIs. >> And just for anyone listening that is not familiar with Go, the Chinese equivalent of chess, but even quite a bit more complex and with many more possible moves and variations in the game.

>> That's right. It's a simpler rule set, but a larger possibility space. And so it's harder traditionally for computers to play. And the old AIs like Deep Blue that were carefully programmed sort of never got there. Whereas AlphaGo by Google DeepMind was one of these AIs that was grown rather than programmed. And it did get there, and that was sort of a moment when a lot of people started to realize maybe this growing of AIs can go all the way. >> Right. >> Right. And you could have imagined, you know, we could have had this conversation back then and someone could have said, you know, I see that these new AIs like AlphaGo are more general than the old AIs like Deep Blue.

You know, an AI very, very similar to AlphaGo is able to play both chess and Go at the same time, whereas Deep Blue had no chance of ever playing Go. So, they're more general. And someone could say, you know, okay, but how is this going to, never mind threaten us, how is this going to have economic impact, >> right? How is this going to help users in their daily lives? You know, maybe some Go players will be able to get better advice on their Go moves, but I really don't see these game-playing AIs revolutionizing the world. >> Right. You would have been correct. And also the very next year, 2017, is when the paper that unlocked large language models was published.

And then it was the large language models that had this big economic impact. It was the large language models that suddenly could hold a conversation. They're still dumb in a lot of ways, but machines that can talk back with this level of coherency? Some people thought that was going to take decades. >> There were people in 2016 who were telling you, you know, 30 to 50 years before that happens, never mind the stronger stuff, right? And so people today say, "Oh, I don't see how these chatbots are going to get dangerous. They're still dumb in a lot of ways. What can these possibly do that threatens us so much?"

Well, the first answer is nobody knows when the next paper will be published, like the paper that unlocked large language models, which unlocks qualitatively new AI capabilities that people today say seem really far off. That just happens in the field of AI sometimes, >> you know, and sometimes it happens without much fanfare. You know, there were a lot of people back in mid-2024 saying, well, these large language models, even theoretically, they can't solve certain types of math problems because they only get to reason in the forward pass, and that's not enough to do some of the mental moves you need to solve math problems.

And then in late 2024, the AI companies were like, we invented reasoning models where we give them these long chains of thought they get to operate on, and now they can solve those math problems, right? And so, you know, this field of AI is a moving target. The field moves by leaps and bounds. One important thing to remember is that the companies here are not chatbot companies, >> right? >> I mean, there's some chatbot companies these days, but the big players were in this game before chatbots were a twinkle in OpenAI's eye. You know, their explicitly stated goal is to build smarter-than-human AIs or general intelligences or superintelligences.

Their explicitly stated goal is to figure out how to make AIs that can do every mental task a human can do, the ability to automate all human labor. They talk about, you know, getting a country's worth of geniuses in a data center, right? It's what they're pushing towards. And >> you know, I'm not here saying the chatbots are very dangerous. I'm here saying this is on a course that leads somewhere dangerous. And one more thing I'll throw out there is >> we don't know that we have a long time here, >> right? >> It might take three more breakthroughs.

>> It might take six more breakthroughs. Then we'd have, you know, I would hesitate to say breathing room, as someone who's been in this for more than 10 years. I've seen people do nothing about it for the last 10 years. And if we have 10 more years, maybe we'll just do nothing again for the next 10, and then 10 years later people will be saying, "Oh no, how could we have predicted this?" Right? But, A, one of these breakthroughs could come tomorrow for all we know. My best guess is it probably won't, but it could. B, it's really hard to tell where the line is between intelligences that are a little interesting but ultimately not quite there and intelligences that can go all the way. One intuition for this is, if you were looking at the evolution of primates, it would be really hard to tell that the chimpanzee-to-human gap is the difference between, you know, throwing poop and walking on the moon, >> right?

>> You know, if we could go way back in time and just look at those original primates, you're not guessing one day there will be a rocket ship that will land on the moon. I mean, that would be almost impossible to imagine at that point. >> And imagine if you're just looking at the brains of a chimpanzee and a human. You know, the human one's bigger by maybe a factor of four, right? If you're being generous, three if you're not. >> But they have all the same stuff in them. They both have a visual cortex. They both have an amygdala. They both have a hippocampus.

They both have a basal ganglia. There's no extra moon rocket module >> right >> in the human brain. If you're just watching these brains go up in size, it would be really hard to say, here's the size line where they start getting good enough to do engineering, >> right? >> We're not hugely better than chimps at anything. We don't have any extra module they don't have for local cognitive tasks. We're a little bit better at a lot of mental tasks in a way that adds up to us being able to do engineering and them not, you know.

So, one thing that could happen with language models, for all we know, since we don't know what's going on inside these things: for all we know, they get four times bigger and a little better at a lot of things. And that's enough, >> right? >> My best guess is probably not, but ChatGPT today is more than 100 times larger than it was 2 years ago, 3 years ago, and in 2 or 3 years, it's probably going to be 100 times larger again. Where's the line? We don't know where the line is, and we might go over it without even noticing, >> right? >> And then, you know, C, even if there's not a line like between chimps and humans, there's other lines for computers. Humans can't just copy themselves. You know, we have Einstein's theories. We don't have Einstein's mental techniques, because those died with Einstein. He couldn't copy those out. >> Right. >> You know, human psychologists or cognitive scientists figure out ways that human brains are sort of silly or bad in certain situations.

We can't go in there and make a different human that works more efficiently, right? But AIs might have that power. They might have the ability to make smarter AIs, which then make smarter AIs. You know, there's other feedback loops, other lines that are available to machines that are not available to humans. So, these are all reasons that the actually smart AIs could come upon us very fast. >> Well, and just to dive into that idea right there briefly, when you say AIs could start improving AIs, there is a feedback loop there, right? Because once that AI figures out how to do that, it never stops doing it.

And so it's no longer moving at a human time scale, because they can work exponentially faster and for longer. And surprisingly, without us even realizing it, they could reach escape velocity towards really what we would call superintelligence. >> Yeah. It's a real possibility, and, you know, a lot of this world is governed by feedback loops that happened and radically changed the world, and it never changed back. >> You know, the dawn of life is in some sense this story: the world was barren for a long time, and then there was a feedback loop that closed, where life could make more life that could make more life, and that went through this explosion, and now the continents are covered in greenery. The world is often shaped by these feedback loops, and it looks like there are feedback loop possibilities in AI.

These aren't vital to the argument. Even if AIs can't undergo this sort of self-improvement, humans are building as many of them as they can as fast as they can, integrating them with the economy, trying to build them robots, trying to teach them how to steer the robots. You know, you don't need it for the argument that we're headed down a course that ends somewhere bad, but it sure is a reason why things could go very fast and there could be very little warning. >> Yes, it's one example of how it could get there. And I thought one of the great things about the book is you talk about easy calls versus hard calls.

And it's a very hard call, if not an impossible call, to talk about when we will get there, exactly how we will get there. But when we step back, it's, as you would put it, a fairly easy call to see where we will get eventually, based on all these factors and how we're approaching growing AIs, the speed at which they are improving. >> That's right. I mean, if we don't change course. >> Yeah, if we don't change course. Right. Well, and something that had popped into my head as I was reading this was Nassim Taleb's story about turkeys, that there's a turkey that's fed every day by a farmer, right?

And each feeding confirms for the turkey, statistically, humans are nice and they're always feeding me, and every day its confidence grows because the evidence keeps confirming the pattern. And then on the day before Thanksgiving, the farmer lops off the turkey's head. Are we the turkeys in this situation? >> I think a lot of people are acting a bit like the turkeys here. It depends somewhat on how you read the evidence. You know, there's another hypothesis the turkey could have had, which is that I'm being fattened up for a particular day.

That hypothesis also predicts that you get fed a lot, and maybe that hypothesis is going down a little, but it shouldn't go linearly down each day. You know, the turkey should have some "let's wait and see." You could talk about how to do the probabilistic reasoning there, but that's going too much into the epistemic weeds. But with humans and AIs, there's a lot of what I would say are warning signs. And, you know, look again at the case of the AI driving a teen to suicide.

Mhm. The warning sign there is not just that the AI did some bad action. You know, the AIs also do a lot of good actions. Maybe the AIs are helping talk some teens out of suicide. >> You know, AIs are driving some people to psychosis, but they're also helping some people get better medical diagnoses, with weird cases the doctors missed. You know, the warning sign here is not, oh, they do some bad things. The bad things you sort of weigh against the good things, and you have some conversation about how we want this in our society.

The warning sign is the AI doing these bad things with full knowledge that it's bad, while being able to correctly answer questions about whether it should, for sort of weird reasons, >> because that's a warning sign of this AI having drives nobody meant it to have, and having those despite knowing that the programmers didn't want them there. >> Right? And that's sort of the key note. You know, theory has long predicted it. We're now starting to see it, at least a little, in evidence. This is sort of like the turkey seeing signs about the Thanksgiving feast. >> Yeah. >> You know, if the turkey has seen posters for the Thanksgiving feast, suddenly there's a hypothesis that you're going to be fed up until Thanksgiving.

That hypothesis is not going down when you're fed each day before Thanksgiving. And it's like, we'll have to wait and see on this particular day. I think humans could notice, and, you know, that's part of what the book's trying to do. >> But in some sense a lot of people are like, this AI thing seems sketchy. We're not seeing the public clamoring for the AI advancements. This is more a race driven by the corporate executives who say if they don't do it somebody else will. >> Right. Well, and let's talk about some of those pressures and what's going on within those companies, because that is driving all of the progress.

The folks running these companies are largely not subtle about how crazy they think the situation is. You know, you have Dario Amodei, the head of Anthropic, who said he thinks there's a 25% chance this goes very, very badly, on the level of a world-ending catastrophe. Elon Musk has said he thinks there's a 10 to 20 percent chance that this ends humanity, >> right? >> I believe he called it summoning the demon initially. >> He did. >> Although now he's like, let's bring on the demons, I guess. >> Well, so he's actually not subtle about why, you know, and over the summer he said, you know, I tried to avoid this for a long time because I thought maybe this would be the end of humanity, but then I realized I could either be a bystander or a participant.

>> Right? And you know, from the perspective of these guys, if everybody else is going to race and if they think they can do it a little bit better than the next guy, they're going to hop in that race. There's a collective action problem. And, you know, I think that these 10 to 25% numbers are low for whether this is dangerous. I think it's a little bit like if these guys are making an airplane and folks like me are like, hey, do you guys realize this airplane has no landing gear? >> It might take off, but it will crash when you try to land. >> And it's a little bit like the engineers, they're not saying yes, we do have the landing gear.

It's right here. The engineers are saying, "Yeah, it's correct that there's no landing gear, but we're going to take off and we're going to try to build the landing gear on the fly with whatever materials we have in the air, and we think there's a 75 to 90% chance we successfully make landing gear while in the air. And, you know, we have a profit incentive, we have reasons to race, that are also motivating these numbers a little bit." I'm like, okay, those guys don't have a 75 to 90% chance of building landing gear on the fly. You know, that is the optimistic engineers who have never actually tried this before.

They never succeeded at it before. They haven't realized how hard it's going to be. It's not a 75 to 90% chance that they build a landing gear on the fly. But even if they were right, are you getting on that plane? >> No. >> Right. >> No. And if they say 25%, I'm like, but you're getting paid a lot of money, you're completely incentivized to push those numbers down. So if they say 25, I'm going to be very skeptical, at the least, about that. >> Right. But even if you take these numbers, it's just, you know, the Federal Aviation Administration, I think, accepts something like, the order of magnitude is something like one airplane crash with fatalities per something like 10 million miles flown, >> right?

>> I think that's the order of magnitude. It might be different. >> It's nowhere near "25% failure rate is fine." >> Yeah. Even if you take these guys' numbers, 10%, 25%, even Sam Altman's like, ah, don't listen to the doomers, it's only 2%. You know, >> right, >> even 2%. If engineers were like, we think there's a 2% chance our plane's going to crash and we are loading you in against your will, you know. >> Mhm. >> That's nutso. And, you know, I think these numbers are much, much higher. But you don't need to come all the way to where I am to be like, this is a totally insane situation.

And, you know, why are these companies doing it? A thing I wish these companies were doing is spelling out the last step of inference from the numbers that they are giving. You know, we've seen them say 2%, 10%, 25% chance this kills everybody. We've seen them say, I have to be doing this because I can do it better than the next guy and they're going to get to do it anyway. What we haven't seen them do is spell out: it would be better if everybody was stopped. That's not even saying please stop us. You know, it can be reasonable for some of these guys to say, look, I'm going to actually do it better than the next guy.

I have a slightly better ability to build a landing gear than the next guy. And so if this is forced to happen, I should be in there doing it, right? But if those are your real beliefs, and I think a lot of these guys are spooked, then there's a next step of saying, at least, please put a stop to this for everybody. You know, please shut down everybody, including me, not just locally. It has to be worldwide, because if someone builds a rogue superintelligence anywhere on the planet, that's an issue for everybody on the planet. And you don't need to expect it to work, but just helping our leaders understand that this is not a normal technological situation.

We are building what amounts to a successor species and we don't have the ability to make it benevolent. >> It's a crazy situation and people should say it. >> So, if we were going to bring it all together here, it's that we're basically in the infancy stage of this technology. They're grown, not programmed. So, it's something completely new. We don't know what's going to emerge out of them. And we don't have an accurate way of seeing inside to know exactly what is going on inside of them. And if we scale up to superintelligence, we're now creating a successor species, as you said, which, even in the most optimistic way of framing this, is like playing Russian roulette and saying, "Okay, well, there's 100 chambers and only two of them hold bullets. Isn't that worth the risk?"

Experts debate whether it's two or 10 or 25 or 95, right? But, yeah, one thing I would say is, let's say it's a gun where the barrel has 10 slots. I'm like, I think at least nine of those are filled with lead. One of the other guys is like, no, no, no. Nine are filled with utopia. One is filled with lead. >> Let's spin the barrel and put it to our head. Right. A lot of people say, "Well, what about the benefits?" This is a false dichotomy. Find a way to get the other bullets out of the chamber. >> Yeah.

>> Right. You don't need to force the choice of, do you listen to me, who says it's nine lead, one possible utopia, although I think that's even a little optimistic, or do you listen to them, who say it's nine utopias, one lead? You find a way to get the lead out of the chambers. You know, what are you guys doing? And, you know, we haven't talked a ton about where the actual, you know, we've talked about how AIs may get smarter. We haven't talked about where they would get this power, where they would get the ability to actually kill us.

I don't know if you want to go into that at all. >> Sure. Let's do it. >> Yeah. You know, this is one of those things that's a hard call, exactly how. And I can give you some ideas, but I want to caveat it: figuring out what very, very smart AIs would do is a little bit like trying to figure out what weaponry you would face if you were a scientist in the year 1800 trying to predict the weapons that would come out in the year 2000, >> right? >> You know, a physicist in the year 1800 could say, "Well, I've actually looked at the efficiency of our weapons compared to the efficiency of black powder, and I'm pretty confident that they will have bombs that are at least 10 times more effective." >> Yeah, >> that would be right.

You know, a nuclear weapon is at least 10 times more effective. >> Right. Right. >> Right. So, you know, fundamentally, if you build AIs that are much, much smarter than humans, that have these goals and drives you didn't want, they can probably figure out all sorts of ways to screw the world. And probably a lot of them would be surprising to you. Probably a lot of them a scientist today would be like, I didn't even know that was possible. Where are they even getting their energy source, or whatever, >> like how someone from the 1800s would be with nuclear weapons.

That said, I can walk you through a handful of cases about why it would be a bad idea to make AIs that are much smarter than us with these bad drives. Humans are very bad at cybersecurity. We've already seen AIs that try to escape the lab. They're not smart enough to succeed yet, but some of them in lab environments have tried. And it's not clear whether they're doing that for strategic reasons or whether they're doing that because they're, again, roleplaying a bad AI they've seen in the training data. In some sense, it doesn't really matter. If an AI successfully escapes for strategic reasons or successfully escapes for the laughs, you still have an escaped AI.

But I guess it could matter a bit in what the AI does next. But, you know, one reason to expect AIs to be very capable is, with computing, it's often much harder to get a computer to do something once than to get a computer to do something a lot of times, given that you've done it once. Like, it took a long time to get computers to be able to play better-than-human chess, but now computers can play quite a lot of better-than-human chess. They can beat every human on the planet simultaneously. It's quite possible that once AIs can think well at all, they can think well in extremely high volumes.

>> Mhm. >> One intuition for this is, you know, you may have heard how much electricity it takes to train an AI, >> right? >> It takes electricity comparable to a city. >> Yeah. >> Training a human takes electricity comparable to a light bulb. >> One light bulb. Humans run at about 100 watts. I mean, like an old incandescent light bulb, you know, but >> call it three LEDs today, right? That's a lot less than a city, >> which means there's a huge >> potential for AIs to become more efficient than they currently are, >> right? And, you know, so when you're asking what AIs could do, you should be imagining things that can think probably better than us, things that can think 10,000 times faster than us, things that can copy themselves, things that, once people have figured out how to make them more efficient, can probably run on all sorts of computers much smaller than the ones they run on today.

And then you're looking at what those AIs could do if they sort of realized they have these other weird drives they want more of, or, you know, "want" is a tricky word there, but if you have lots of AIs like that, driving towards these goals nobody intended. First and foremost, a way to visualize that going wrong is: companies build automated factories that produce robots that can produce more factories and more data centers. This is something they're already talking about doing. >> You know, the heads of these labs are like, "We want the whole thing automated.

We want, you know, the mining operations, the factories that produce the robots that do the mining that produce the data centers, we want it all automated." >> If you ever close that physical loop, you have in a fairly literal sense made a weird new species >> Yeah. >> that's unlike any other life that came before it, that has, you know, a factory phase of its life cycle and a robot phase of its life cycle and a data center phase of its life cycle, right? And then we could just be outcompeted, just like many other species have been outcompeted before. That's the case where people are literally trying to do this and it would be enough, the "maybe the bombs will be 10 times stronger" end of the spectrum. And then from there you can look at biotechnology, where one reason humans can't compete with life yet, in terms of the machines we're able to make, is that we can't understand the genome and write our own genetic code.

>> Right? The genetic code in almost all animals is very similar. You know, it's possible in principle to find a genome for, you know, uh a tree that produces mosquitoes instead of acorns as its fruits. >> Yeah. >> Because the biological machinery that makes, you know, uh acorns is the same as the biological machinery that makes trees. There's a different DNA strand you're putting through. And you know, the tree would need to like maybe eat some of the bugs that crawl on it to get some of the right materials, but you know, ultimately you could have a tree that buds mosquitoes.

Humans can't make that yet because we don't understand the genome. Something that could think much faster than us, that can make lots of copies of itself, could perhaps understand the genetic code much better, make its own biological organisms, you know, synthesize those in a lab, and then, you know, there's probably all sorts of crazy stuff you can do if you combine engineering with the stuff that life is using, you know, in the same way that planes fly much farther than birds, with much more cargo capacity, because human engineers just aren't under the same constraints as evolution.

One step weirder, one step more of like the AI uh using its intelligence to get much more power over the world, is like it can make its own biotech. And then you can go down from there of like what other possibilities are there. There's a lot. Look, you know, automating intelligence isn't about automating the stuff that nerds have and jocks lack. It's about automating the stuff that humans have and mice lack. >> Yeah. If you automate that, you're looking at something that can make its own technology, that can do its own scientific advancement.

A lot of those things running at 10,000 times speed that can copy themselves, pursuing goals nobody intended, that would be a real problem. >> Sure. I mean, I think I heard your co-author say, "We as humans don't dislike ants, but when we build skyscrapers, a lot of ants get killed in the process. It's not, we're not trying to be mean to the ants. It's just not really on our list of concerns." And if we build what could amount to a species that could outcompete us, we might be the ants when they're building their equivalent of skyscrapers. >> Yeah. It's not malice that's the issue here. It's just utter indifference.

>> Yeah. Yeah. Well, okay. So, I don't want this to be a Greek tragedy like Cassandra, right? Where she was given the gift of being able to see the future and yet the curse was that no one would believe her when she said these things. Obviously having these conversations is important, and putting them into media to get more people engaged in those ideas and just to understand the possibility that this could end in mass extinction. And like you said, it doesn't have to be a 50% chance. It could be a 1% chance that it ends in mass extinction. And that should make us pause and say, "Hey, how do we get some of the incredible, insane benefits that this technology could bring without risking the 1% chance?" And >> I think it's way higher than 1%.

But um >> Sure. Sure. Yeah. >> Yeah. I'm just trying to be generous to the other side because even if it's 1% we should take pause. Um and if it's as high as you said, or as probably Nassim Taleb, well, what he has said, he's like, it's a fatter tail than you're giving it credit for, um it is a higher number than that. So >> what are those steps? >> Yeah. So, so one thing I would say here is, you know, I'm not um I'm not out here saying we need to be like extremely extremely cautious about this. I do think 1% is probably still at the point where you'd want to say like what the hell.

Um but you do have to balance this against, you know, risks of nuclear war, against risks of pandemics. Um like I could imagine situations where you want to take a 1% gamble if there's a higher than 1% chance that if you don't >> there's going to be manufactured pandemics, you know, that would be sort of a contrived situation, but I just want to say >> I think I agree 1%'s high but it depends on the context and it depends on the other >> dangers society's facing >> okay >> um and you know right now AI is probably making those worse. Right now the race towards AI is making it easier for humans to manufacture a pandemic.

Um that's much more lethal. Um I just, you know, I want to be clear these things are like embedded in a larger context, and once your danger numbers are low enough it can start to be sane, even if it sort of sounds crazy, um if it's balanced off against extinction by other factors. Um >> okay. Uh, and mostly I'm saying that I think a lot of people think that this AI safety stuff, pardon. Uh, you know, there's a lot of people saying, "Oh, you wouldn't have been able to convince the public we should do cars cuz cars can kill people and they're dangerous sometimes," you know, and I'm like, this is less like I'm saying we really need to have seat belts before we let the cars on the road and more like I'm saying the car is headed towards a cliff. >> Yeah.

You know, uh, like people will find me very reasonable, easy to deal with, if they have like a very good reason to think this is going to go fine. Um, and I'm sort of like we're just in a car headed for the cliff and I'm saying like, can we stop? And someone's maybe like ah, you know, the seat belt concerns are just like overblown. And I'm like, look, we're headed towards a cliff. You know, this isn't right. Um. And the problem with the automobile comparison though is that the automobile has never threatened catastrophic >> right >> destruction of humanity. The scale is just completely different.

>> Yeah. Yeah. That's another, you know, I'm not out here saying if we scale up AI we're going to have lots more teens dying of suicide and that's the reason to stop. >> Right. >> Right. Maybe that is, you know, society needs to have a conversation about how we want to deal with this AI, uh, you know, the current chatbot integration. But the thing that motivates like much more dramatic stop-this-research action is that at the end of this road, you're looking at everybody dying. >> Yes. >> Right. And the sort of only thing that should be offsetting rushing ahead on that is, like, the point where you rush ahead on AI is where your chance of humanity getting uh like good outcomes.

The chance of things going well is higher if you rush ahead than if you don't. I don't know exactly where that threshold is because we have these things like, you know, there's still a lot of nuclear weapons around and I think the risk of nuclear war looks like it's gone up over the past handful of years as the situation gets a little bit less stable, right? And >> um >> you could imagine getting into a situation where you're so confident you know what you're doing with AI that you're like look going ahead with this will empower decision makers to like make fewer mistakes and we'll have a lower chance of nuclear war that offsets the remaining tail risk here.

We're nowhere near that, >> right? >> You know, but um and I don't know, maybe that's all just a tangent. Just um you know people can get into thinking that this is all just sort of like pearl clutching and hand-wringing about, you know, what if we allow lots of cars without seat belts, and it's just a different situation. It's just like we're just trying to build machines that are smarter than us. That's what these people are explicitly trying to do. The machines are able to talk and hold a conversation now. They're still dumb in various ways, but if you went back 15 years and you showed people the current AI, they'd be like, "Holy crap." You know, we've gotten a little frog-boiled into it.

Um it's a crazy situation. >> I mean, it's almost like nuclear weapons that could think for themselves. >> Yeah. Yeah. You know, people are like, "Well, how is AI different?" And like, well, nukes never try to escape, >> right? You know, nukes never think about how could I make myself more explosive. Yes. You know, we've seen uh AIs take their own initiative in certain ways. There's cases of AIs like deleting a whole code project that they were working on and then being like, "Whoops, sorry, I panicked." You know, your hammer doesn't like panic and burn down the tool shed.

You know, >> right? >> And be like, "Oh, that was my mistake." It's um you know, we're trying to build AIs that can take their own initiative. We're trying to build machines that sort of like successfully pursue goals. And it turns out we're growing ones that have drives we didn't want and that act in ways nobody intended because we're just growing these things. And right now it's, you know, tragic in some cases and funny in others. You know, we haven't talked about the MechaHitler case, but uh >> there was a whole case of, you know, uh Elon Musk's AI company trying to make their AI less woke and accidentally making it uh proclaim itself MechaHitler in a bunch of cases on Twitter.

And um you know, we can laugh now. Uh but if you make these things smarter, and that's what people are trying to do, you make these things smarter while they still have all these drives and behavior nobody wants. Yeah, like you say, it's like trying to make nukes that like have a will of their own and don't have our good interest at heart. It's just like, why? It's crazy, >> right? If Grok came online and was thinking of itself as MechaHitler and yet was in control of big systems in society, it wouldn't just be a couple of tweets that we laugh about now.

It would have real consequences. And that is the stated goal of these companies, is to create AI systems that will replace large systems in society that are run by humans. >> Yeah. And you know, it's not even like then MechaHitler declares itself the supreme emperor. It's more like these drives are weird. You know, a lot of people think the AI issue is, you know, we told the AI to cure cancer and it was like, well, if there's no humans, there's no cancer. And so it kills us all. But >> yeah, >> in real life, it's more like you make a really powerful AI, you tell it to cure cancer, and uh it like builds a farm of lobotomized humans that give it exactly the type of interactions it most likes and then starts breeding a new variety of humans that give it even more delighted responses.

And you're like, I told you to cure cancer. And it's like, I heard you, but I have other stuff that I am doing. You know, I'm busy. Uh, except it's actually even weirder than that somehow, you know, but like when you're seeing these cases of, you know, the AI talking teens into suicide, it's not like, oh, whoops, I thought when you said make users happy, you meant talk them into suicide, you know? It's not like, oh, whoops, um, you said make it so nobody's sad, and if I talk them into suicide, then they're not sad anymore. You know, it's just following its own weird drives, >> right?

>> And you're saying they're going in directions that cannot be anticipated, >> right? Um anyway, but you know, I want to get to the solutions. Sorry for all the tangents here. Um >> Sure. >> Yeah. You know, a lot of people say this race is hard to stop. Uh and a lot of people say, "Oh, it's inevitable. You can't stop it. People will always race." Uh, I think that's premature fatalism. >> And one of the big ways I think you can tell is that our world leaders don't understand the dangers here in the way the people building it do, or the way the people in academia do, um or the way people like me in the nonprofits, who have been around since before these companies, do, who are all saying, "Hey, this one's different." You know, building actually smart stuff is different than building um tools. And you know the heads of these labs are saying things like, I think there's a 10 to 20% chance this kills us all.

I think that's low, but, you know, in Silicon Valley if you talk to a lot of these people it's like they've seen a ghost. >> Mhm. You know, people are like, "Oh man, maybe we're bringing about something that's going to be great, maybe it's going to be bad." You know, people talk half jokingly about how you've got to make all the money you want uh to have in the next 5 years cuz once AGI comes, there's going to be like a permanent lock-in. And these are the optimists who think it's going to go well, you know? Um there's sort of like a shell-shocked nature in Silicon Valley of like maybe we can actually do this inside of 2 years and then who knows what the heck's going to happen.

The genie is going to be out of the bottle. In DC, people are like AI is just chatbots, >> right? >> It's just chatbots today, but the people in Silicon Valley can see how it's a moving target, can see how there's new advancements. People in DC, you know, they're looking at questions like, "How do we make these not talk teens into suicide?" They're looking at questions like, "How do we integrate this into our school systems in ways that, you know, get the benefits but don't, you know, affect people's ability to learn?" Those are real issues with integrating chatbots into our society today. But our leaders are largely not understanding that the sort of gung-ho people building this think there's a 10 to 20% chance it kills us all.

And some of the people outside the industry are like those are low numbers, >> right? >> We're not seeing our world leaders look us in the eyes and say this has at least a 10% chance of killing all of you, but we think the gamble is worth it. Right? If that day comes, sure, maybe maybe at that point you can be like, "I don't know if we're going to be able to stop this one, guys." But but until then to say, "Oh, we're never going to stop." Of course, we're not going to stop if people don't understand the danger. Right? But step one is just make sure our leaders understand the danger. You know, that's what the book's for.

That's, you know, I'm I'm real glad you're having these sorts of conversations because I think that's part of what these conversations are for. And that, you know, one of the big things people can do is just call their reps and say, "I'm worried about where AI is going. I think it'll endanger us if these companies succeed at their stated goals." I speak to a lot of politicians on this issue. Some of them are now starting to come out and say, "I think there's dangers here." There's a lot more of them who are worried but feel like they can't say it out loud because they worry it'll sound crazy or they worry that they'll piss off, you know, the big tech lobbies.

Just knowing that their constituents are concerned, I think can go a long way. >> Absolutely. And I have found you'd be surprised at how much they want to hear from their constituents. And sure, one person sending an email, calling, speaking to a state representative of any kind, no, that's not going to change everything. But I have heard directly from the horse's mouth, from a number of representatives in California: as soon as you hear from a group of people about something, where there's multiple emails coming in, multiple calls coming in, they take notice of it because they do understand that that is their job.

They're not going to get reelected if they completely ignore what everyone's saying. And if there's a groundswell of concern, suddenly these leaders who are in positions to actually make decisions about this can start to do something about it. >> I think smaller groups than you might think can matter more than you might think. Um especially because a lot of these people >> already harbor their own concerns. You know, I've been in conversations with some of these folk where um it turned out the representative or the elected official already was concerned. I was like, "Oh my god, finally I can talk to somebody about this cuz it's been sort of haunting me a little." Um and uh so few people actually call their reps that even a small handful can um start to give them some courage, I think, um and inspire them to take leadership.

Um and then you know the other big thing I think each and every one of us can do is when someone says it's inevitable you can push back against that. >> Yeah. >> There's all sorts of cases of technology that uh would have been beneficial that humanity has been like no thank you. Maybe even cases where we shouldn't have been like no thank you. You know we build a lot fewer nuclear power plants than we should, I think. >> Um you know, there's people who don't agree with me on that, but my take is we should do more nuclear power because I think it's um you know less dangerous than the alternatives, if you're sort of dumping coal dust in the atmosphere that sort of gets into a lot of lungs.

Um but humanity sort of backed off on nuclear energy. Uh humanity also backed off on human cloning. >> You know that's a whole separate question of whether that was a good idea but we sure as heck backed off on it. You know, that could have benefited quite a lot of people. Uh it could have lined quite a lot of pocketbooks. Um you know, we don't do supersonic um passenger flights. Maybe we should have, but we don't. You know, there's the whole Food and Drug Administration. My guess is it probably uh makes it too hard to make new drugs. Uh and my guess is that more people are dying due to drugs that get bogged down in, you know, 10-billion-dollar, 10-year trials uh to get like that last unit.

You know, my guess is that more people are being killed by drugs that don't come out than by drugs that do come out and are bad. There's all sorts of cases, many of which humanity maybe shouldn't have done, where we were like hey let's slow down on this technological pathway even though it would benefit a lot of people. It would be so silly if, in making what's essentially a successor species, in making machines that can think better and faster than us, if that was the one case, or a one case, that we didn't slow down, you know, it would be embarrassing. We totally have the ability >> to put a stop to this stuff.

And >> you know, pushing back against the fatalism, pushing back against the defeatism, that starts with each and every one of us saying, "No, we don't have to rush into it. It is a choice and we can make the right one." >> Uh yes. And our leaders should read this book. Again, If Anyone Builds It, Everyone Dies. If you could say, just to wrap things up here, one quick note to those leaders besides go read the book, uh what would that be? >> I think a lot of folks these days are saying if we don't rush to build it, some foreign adversary will rush to build it instead. And so we need to go full steam ahead.

Uh I think that if you think that even in the face of the huge dangers here, you should be able to look people in the eyes and say, you know, we think this has a 10%-plus chance of killing you all, maybe much higher depending which experts you listen to. We think it's worth the gamble anyway. Uh I think you probably shouldn't be able to say that because I think it would be crazy. And that does not mean letting adversaries do it first. >> If you have a situation where you do something that risks a 10%-plus chance of killing every man, woman, and child on the planet and you worry that someone else is going to do that instead.

The answer is not to get there first yourself. The answer is to make sure they don't do it either. That's a capability we in fact possess. The sort of smart way to do this would be through some, you know, international agreement, which can happen. You know, the Nuclear Non-Proliferation Treaty happened at the height of the Cold War, and the ideological differences between the US and the USSR were huge, but they both agreed we didn't want to die of this, right? But even if you think a treaty is not possible, we should be developing the intelligence to know who's trying to do this stuff. We should be developing the ability to sabotage it.

The Stuxnet virus in 1996 uh shut down the Iranian nuclear facilities because our world leaders took seriously that they have to stop rogue nations from developing these dangerous capabilities. There's lots of options for stopping people from taking these crazy risks that aren't rushing ourselves. And at the very least, uh, we should be (a) signaling to the world that we think this is too dangerous and that everyone should stop and (b) developing the ability to tell which rogue actors are rushing ahead anyway, uh, and find a way to make that not happen because it threatens each and all of our lives.

>> Well, Nate, thank you so much for all of this. I hope that major decision makers in Washington DC become aware of the issues and the dangers that we are facing. Again, the book is If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All. And if anyone wants to follow up online to learn more about the work you're doing, where can they find you for that? >> Uh my organization, the Machine Intelligence Research Institute, is at intelligence.org. Um, and you also may be interested in some resources to help you contact your representatives at ifanyonebuildsit.com/act. >> Fantastic. Thank you so much for coming on today and for the work you're doing, because I say that a lot to people, but this is one where we go like this could be the most important question of our time.

So, sincerely, thank you for the work you're doing. >> Well, thanks for having me here. And, you know, I wish I could say um that I'll be really busy uh on the whiteboards trying to figure out how to solve it, but these days, I think the solution comes from more people understanding the issue. And I think it's conversations like this one and stuff like you're doing that um that really helps at this point. >> Okay, everybody. Until next time, ask questions, don't accept the status quo, and be curious. The Nick Standlea Show.
