Artificial General Intelligence and Existential Risk

The purpose of this post is to discuss existential risk, and why artificial intelligence is a relatively important aspect of existential risk to consider. There are other essays about the dangers of artificial intelligence that I will link to throughout and at the end of this post. This essay takes a different approach, one that perhaps will appeal to someone who has not seriously considered artificial general intelligence as an issue requiring civilization’s attention. At the very least, I’d like to signal that it should be more socially acceptable to discuss this problem.

First is a section on how I approached thinking about existential risk. My train of thought is a follow-up to Efficient Advocacy. Also worth reading: Electoral Reform Fantasies.


Political fights, especially the culture war battles President Trump seems so fond of, are loud, obnoxious, and tend to overshadow more impactful policy debates. For example, abortion is a common, highly discussed political issue, but there have been almost no major policy changes since the Supreme Court’s decision over 40 years ago. The number of abortions in the US has declined since the 1980s, but the decline seems uncorrelated with any political movements or electoral victories. If different political outcomes don’t produce meaningful differences, and if political effort, labor, and capital are limited, these debates distract from areas where policy could impact far more people. Trump seems especially good at finding meaningless conflicts to divide people, like NFL players’ actions during the national anthem or tweets about LaVar Ball’s son being arrested in China.

Theorizing about how to combat this problem, I started making a list of what might be impactful-but-popular (or at least not unpopular) policies that would make up an idealized congressional agenda: nominal GDP futures markets, ending federal prohibition of marijuana, upgrading Social Security Numbers to be more secure, reforming bail. However, there is a big difference between “not unpopular”, “popular”, and “prioritized”. I’m pretty sure nominal GDP futures markets would have a positive effect on Federal Reserve policy, and I can’t think of any political opposition to them, but almost no one is talking about them. Marijuana legalization is pretty popular across most voters, but it’s not a priority, especially for this Congress. So what do you focus on? Educating more people about nominal GDP futures markets so they know such a solution exists? Convincing more people to prioritize marijuana legalization?

The nagging problem is that effective altruist groups like GiveWell have taken a research-based approach to identifying the best ways to use our money and time to improve the world. For example, the cost of distributing anti-mosquito bed nets is extremely low, resulting in an average life saved from malaria at a cost in the thousands of dollars. The result is that we now know our actions have a significant opportunity cost; if a few thousand dollars worth of work or donations doesn’t obviously have as good an impact as literally saving someone’s life, we need a really good argument as to why we should do that activity as opposed to contributing to GiveWell’s top charities.
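The cost-effectiveness logic here is simple enough to sketch in a few lines. The numbers below are illustrative placeholders, not GiveWell’s actual estimates, but they show how a per-net cost and a nets-per-life ratio combine into the “thousands of dollars per life saved” figure:

```python
# Illustrative cost-per-life-saved arithmetic for bed net distribution.
# Both inputs are hypothetical placeholders, NOT GiveWell's published estimates.
cost_per_net = 5.00          # dollars to buy and distribute one net
nets_per_life_saved = 900    # nets distributed per statistical life saved

cost_per_life = cost_per_net * nets_per_life_saved
print(f"${cost_per_life:,.0f} per life saved")  # prints "$4,500 per life saved"
```

Any alternative use of money or labor then has to clear this bar: spending $4,500 on something else implicitly trades away one statistical life under these assumptions.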

One way to make a case for spending money on things besides GiveWell’s top charities is to take a long-term outlook, trying to effect a large change that would impact a large number of people in the future. For example, improving institutions in various developing countries would help those populations become richer. Another approach would be to improve the global economy, which would both allow for more investment in technology and push investment into developing countries looking for returns. Certainly long-term approaches are riskier than direct-impact charities that improve outcomes as soon as possible, but long-term approaches can’t be abandoned either.

Existential Risk

So what about the extreme long term? What about existential risk? This blog’s philosophy takes consequentialism as a founding principle, and if you’re interested in the preceding questions of what policies are the most helpful, and where we should focus our efforts, you’ve already accepted that we should be concerned about the effects of our actions. The worst possible event, from a utilitarian perspective, would be the extinction of the human race, as it would not just kill all the humans alive today (making it worse than a catastrophe that only kills half of them), but also erase all of humanity’s potential descendants, possibly trillions of beings. If we have any concern for the outcomes of our civilization, we must investigate sources of existential risk. Another way to state this is: assume it’s the year 2300, and humans no longer exist in the universe. What is the most likely cause of our destruction?

Wikipedia actually has a very good article on Global Catastrophic Risk, which is a broad category encompassing things that could seriously harm humanity on a global scale. Existential risks are a strict subset of those events, which could end humanity’s existence permanently. Wikipedia splits them up into natural and anthropogenic. First, let’s review the non-anthropogenic risks (natural climate change, megatsunamis, asteroid impacts, cosmic events, volcanism, extraterrestrial invasion, global pandemic) and see whether they qualify as existential.

Natural climate change and megatsunamis do not appear to be existential in nature. A megatsunami would be terrible for everyone living around the affected ocean, but humans on the other side of the earth would appear to be fine. Humans can also live in a variety of climates, so natural climate change would likely be slow enough for some humans to adapt, even if such an event causes increased geopolitical tensions.

Previous asteroid impacts have had devastating effects on Earth, notably the Cretaceous–Paleogene extinction event some 66 million years ago. This is a clear existential risk, but an asteroid must be quite large to threaten humanity, and such impacts are rare. Larger asteroids can also be more easily identified from further away, giving humanity more time to respond (push it off path, blow it up, etc.). The chances here are thus pretty low.

Other cosmic events are also low probability. Gamma-ray bursts are pretty devastating, but they’d have to occur close by (within a few hundred light-years) as well as be aimed directly at Earth. Neither is likely within the next million years.

Volcanism is also something that has the potential to be pretty bad, perhaps existential level (see Toba Catastrophe Theory), but it is also pretty rare.

An alien invasion could easily destroy all of humanity. Any species with both the capability to travel across interstellar space and military ambitions would be extremely technologically superior to us. However, we don’t see any evidence of a galactic alien civilization (see Fermi Paradox 1 & 2 and The Great Filter). Additionally, solving this problem seems somewhat intractable; on a cosmic timescale, an alien civilization that arose before our own would likely have preceded us by millennia, meaning the technology gap between us and them would be hopelessly and permanently large.

A global pandemic seems pretty bad, certainly much more likely than anything else we’ve covered in the short term. This is also exacerbated by human actions creating a more interconnected globe. However, it is counterbalanced by the fact that no previous pandemic has ever been 100% lethal, and that modern medicine is much better than it was during the Black Plague. This is a big risk, but it may not be existential. Definitely on our shortlist of things-to-worry-about though.

Let’s talk about anthropogenic risks next: nuclear war, conventional war, anthropogenic climate change, agricultural crises, mineral exhaustion, artificial intelligence, nanotechnology, biotechnology.

A common worry is nuclear war. A massive nuclear exchange seems somewhat unlikely today, even if a regional disagreement in the Korean peninsula goes poorly in the worst possible way. It’s not common knowledge, but the “nuclear winter” scenario is still somewhat controversial, and I remain unconvinced that it poses a serious existential threat, although clearly a nuclear exchange would kill millions. Conventional war is also out as it seems strictly less dangerous than a nuclear war.

For similar reasons to nuclear winter, I’m not quite worried about global warming on purely existential terms. Global warming may be very expensive, it may cause widespread weather, climate, and ecological problems, but I don’t believe humanity will be entirely wiped out. I am open to corrections on this.

Agricultural crises and mineral exhaustion seem pretty catastrophic-but-not-existential as well. These would result in economic crises, but economic crises by definition require humans to exist; a catastrophe that reduced the human population would also reduce demand on food and mineral supplies, so these crises limit themselves before reaching extinction.

The remaining issues are largely technological in nature: artificial intelligence, biotechnology, nanotechnology, or technical experiments going wrong (like if the first nuclear test set the atmosphere on fire). These all seem fairly concerning.

Technological Existential Risk

Concern arises because technological progress means the likelihood that we will have these technologies grows over time, and, once they exist, we would expect their cost to decrease. Additionally, unlike other topics listed here, these could wipe out humanity permanently. For example, a bioengineered virus could be far more deadly than anything that would naturally occur, possibly resulting in a zero survival rate. The cost of DNA technology has steadily dropped, so over time we might expect the number of organizations or people who have the knowledge and funding to engineer deadly pathogens to increase. The more people who have this ability, the more likely that someone makes a mistake and releases a virus that kills everyone. An additional issue is that military research teams are quite likely researching bioweapons, such as engineered pathogens, right now. The incentives driving research into dangerous weapons like these are unlikely to change even as DNA engineering improves, meaning the risk from this threat should grow over time.

Nanotechnology also has the potential to end all life on the planet, especially under a so-called “grey goo” scenario, where nanobots transform all the matter on Earth. This has a lot of similarities to an engineered pathogen, except the odds of any human developing immunity no longer matter, and all non-human life, indeed all matter on Earth, is also forfeit, not just the humans. Like biotechnology threats, we don’t have this technology yet, but it is an active area of research. We would also expect this risk to grow over time.

Artificial General Intelligence

Finally, artificial general intelligence contains some similar issues to the others: as technology advances, we have a higher chance of creating it; the more people who can create it, the more dangerous it is; once it is created, it could be deadly.

This post isn’t a thesis on why AI is or isn’t going to kill all humans. We made an assumption that we were looking exclusively at existential risk in the near future of humanity. Given that assumption, our question is: why would AI be more likely to end humanity than anything else? Nonetheless, there are lingering questions as to whether AI is an actual “real” threat to humanity, or just an unrealistic sci-fi trope. I will outline three basic objections to AI being dangerous with three basic counterarguments.

The first objection is that AI itself will not be dangerous because it will be too stupid. Related points are that AI is too hard to create, or that we can just unplug it if it has values that differ from ours. The counterargument is that experts disagree on exactly when we can create human-level AI, but most agree that it’s plausible in the next hundred or couple hundred years (AI Timelines). It’s also true that we’ve seen AI’s ability to solve more general and more complex problems improve over time: AlphaZero learned to play both Go and chess better than any human without changes to its base code; YouTube uses algorithms to determine what content to recommend and what content to remove ads from, scanning through thousands of hours of video content every minute; Google’s Pixel phone creates software-based portrait photos via machine learning rather than needing multiple lenses. We should expect this trend to continue, just as with other technologies.

However, the difference between other technological global risks and AI is that machine learning optimization algorithms could eventually be applied to machine learning itself. This is the concept of an “intelligence explosion”, where an AI uses its intelligence to design and create successively better versions of itself. Thus, it’s not just that an organization might make a dangerous technological breakthrough, like an engineered virus, but that once the breakthrough occurs, the AI would rapidly become uncontrollable and vastly more intelligent than us. The analogy here: a mouse isn’t just less smart than a human; it literally doesn’t comprehend that its environment can be so manipulated by humans that entire species depend on the actions of humans (i.e. conservation, rules about overhunting) for their own survival.

Another objection is that if an AI is actually as intelligent as we fear it could be, it wouldn’t make “stupid” mistakes like destroying all of humanity or consuming the planet’s resources, because that wouldn’t count as “intelligent”. The counterpoint is the Orthogonality Thesis. This simply states that an AI can have any goal. Intelligence and goals are orthogonal and independent. Moreover, an AI’s goal does not have to explicitly target humans as bad (e.g. “kill all the humans”) to cause us harm. For example, a goal to calculate all the digits of pi or solve the Riemann Hypothesis might require as much computing power as possible. As part of achieving this goal, a superintelligence would determine that it must manufacture computing equipment and maximize energy to its computation equipment. Humans use energy and are made of matter, so as a way to achieve its goal, it would likely exterminate humanity, and convert all matter it could into computation equipment. Due to its superintelligence, it would accomplish this.

A final objection is that, despite experts believing human-level AI will happen in the next 100 years if not sooner, there is nothing to be done about it today, or that it is a waste of time to work on this problem now. This is also known as the “worrying about overpopulation on Mars” objection, comparing the worry about AI to something that is several scientific advancements away. Scott Alexander has an entire blog post on this subject, which I recommend checking out. The basic summary is that AI advancement and AI alignment research are somewhat independent, and we really need to learn how to properly align AI values before we get human-level AI.

We have a lot of theoretical philosophy that we need to figure out how to impart to a computer: how humans actually make decisions, how to value different moral tradeoffs. This could be extraordinarily complicated, as an extremely smart optimization algorithm could misinterpret almost everything we say if it did not already share our values for human life, health, and general brain state. Computer scientists set out to teach computers how to understand natural human language some 60 years ago, and we still haven’t quite nailed it. If imparting philosophical truths is similarly difficult, there is plenty of work to be done today.

Artificial intelligence could advance from human level to greater-than-human very quickly: the best human Go player lost to an AI (AlphaGo) in 2016, and a year later, AlphaGo lost to a new version, AlphaGo Zero, 100 games to none. It would thus not be surprising if a general intelligence achieved superhuman status a year after achieving human-comparable status, or sooner. There’s no fire alarm for artificial general intelligence. We need to be working on these problems as soon as possible.

I’d argue, then, that of all the scenarios listed here, a misaligned AI is the most likely to actually destroy all of humanity, as a result of the Orthogonality Thesis. I also think that, unlike many of the other scenarios listed here, human-level AI will exist sometime soon compared to the timescale of asteroids and volcanism (see AI Timelines; estimates are highly variable, anywhere from 10 to 200 years). There is also a wealth of work to be done surrounding AI value alignment. Correctly aligning future AI with goals compatible with human values is thus one of the most important challenges facing our civilization within the next hundred years or so, and probably the most important existential threat we face.

The good news is that there are some places doing this work, notably the Machine Intelligence Research Institute, OpenAI, and the Future of Humanity Institute. The bad news is that despite the importance of this issue, there is very little in the way of conversations, money, or advocacy. Total spending on AI safety research is hard to calculate, as some research is likely done by private software companies, but it is optimistically on the order of tens of millions of dollars a year. By comparison, the U.S. Transportation Security Administration, which failed to find 95% of test weapons in a recent audit, costs $7.5 billion a year.

Further Reading

I have focused this essay on trying to convey the mindset of thinking about existential risk generally and why AI is specifically worrying in this context. I’ve also tried to keep it short. The following are further resources on the specifics of why Artificial General Intelligence is worth worrying about in a broader context, arranged by length. If you felt my piece did not go in depth enough on whether AI itself is worth being concerned about, I would urge you to read one of the more in depth essays here which focus on that question directly.


Leave a comment on the official reddit thread. 

The Age of Em


I recently had the opportunity to see George Mason Professor Robin Hanson talk about his book, The Age of Em. I also was able to work my way into having a long conversation with him after his presentation.

For those who don’t know, it’s perhaps the strangest book you’ve ever heard of. Hanson looks to project forward in time when the technology exists to easily upload human brains into computer simulations. These “emulated” brains will have certain characteristics from residing in computer hardware: they can make copies of themselves, save versions of themselves for later, or delete versions of themselves. They will even be able to run faster or slower than normal human brains depending on what hardware they are running on. Hanson spends the book working through the implications of this new society. And there are a lot of fascinating insights.

Hanson discusses the pure physics of this world, as suddenly speed-of-light delays in communication mean a lot; if an em is running at a million times human speed, then a bad ping of 50 ms today is equivalent to about 14 hours of subjective waiting. This leads to ems clustering in very close physical locations, concentrating them in large cities. Their economy also grows much faster than ours due to the rapid speed at which their brains are thinking, although they may be physically restrained by how quickly the physical manufacturing of their hardware can occur. The economy also quickly moves to subsistence wages, as even the most productive members of society can have their brains copied as many times as needed to fill all roles. Elon Musk is no longer a one-of-a-kind genius, and in fact anyone who cannot compete with an Elon Musk version in their job would likely be cast aside. For a more detailed summary and examples of bizarre ideas, I recommend Part III of Scott Alexander’s post on the book.
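The subjective-time arithmetic is worth checking. The million-times speed-up is Hanson’s figure; the 50 ms ping is an illustrative value for a bad network round trip today:

```python
# Subjective time experienced by an em running at 1,000,000x human speed
# while waiting out a 50 ms network round trip.
SPEEDUP = 1_000_000      # em brain speed relative to a biological human (Hanson's figure)
PING_SECONDS = 0.050     # a bad ping today: 50 milliseconds

subjective_seconds = PING_SECONDS * SPEEDUP   # 50,000 s of subjective waiting
subjective_hours = subjective_seconds / 3600

print(f"{subjective_hours:.1f} subjective hours")  # prints "13.9 subjective hours"
```

At that speed-up, even light-speed latency across a single city becomes economically significant, which is why Hanson predicts ems packing into dense physical clusters.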


In that blog post, Scott goes on to discuss in Part IV the problem of value drift. Hanson does a good job pointing out that past human societies would not have approved of what we now consider acceptable. In some areas, the change in values is stunning. Merely 10 years ago, many had reservations about gay marriage. Merely 50 years ago, many Americans had serious reservations about interracial marriage. On the scale of humans’ existence as a species, the amount of time we have accepted that people have the right to worship their own religion is minuscule. The section of human history where subsistence existence was not the only option is likewise small. Professor Hanson told our group that by far the most common reaction to his painting of the future was rejection.

I even asked him specifically about it: Hanson had stated several times that it was not his job or intention to make us like or hate this future, only to know about it. I pointed out that many AI researchers were very concerned about the safety of artificial intelligence and what it might do if it hits an intelligence explosion. To me, there seems to be little difference between the AI intelligence explosion and the Em economy explosion. Both would be human creations, making decisions and changing their values rapidly, at a pace that leaves most “normal” traditional physical humans behind. If many of the smartest people studying AI think that we should do a lot of work to make sure AI values line up with our own, shouldn’t we do the same thing with Ems? Hanson’s answer was basically that if we want to control the value systems of our descendants thousands of mental years in the future, well good luck with that.

Scott in Part IV of his review demonstrates the problem with just allowing this value drift to happen. Hanson calls the era we live in the “dream time”, since it’s evolutionarily unusual for any species to be wealthy enough to have any values beyond “survive and reproduce”. For most of human history, there wasn’t much ability to build cities or share knowledge because too many resources were focused on survival. Today, we have become so productive and intelligent that humans have elevated Earth’s carrying capacity high above the number of people we have. We don’t have to spend all our resources on survival, and so we can come up with interesting philosophical ideas about morality and what the meaning of life is. We’ve also harnessed this evolutionary competitiveness to fuel our market economy, where the determiner of what survives isn’t nature, but human desires. Unfortunately, when you switch to the Age of Em, suddenly the most productive part of the economy is plunged back into a Malthusian trap, with all resources going to keep the Ems alive. Fulfilling human wants may be what drives the economy, but if there are other pressures on Ems, they will be willing to sacrifice any values they have to keep themselves alive and competitive. If the economy gives up on fulfilling human demand, I wouldn’t call that a drift in values; I’d call that an absence of values.

If we live in the dream time, then we live in a unique situation where only we can comprehend and formulate higher morality and philosophical purpose. I think we should take advantage of that if we can.


Hanson’s observations, given his assumption that the Age of Em will happen, are excellent, considering he is predicting far into the future. It’s likely things won’t work out exactly this way; perhaps a single company will hold a patent on brain scanning for a decade before the market really liberalizes, which could seriously delay the rapid economic growth Hanson sees. He acknowledges this, and keeps his book more of a prediction of what will happen if we don’t oppose this change. I’m not sure how far Hanson believes that regulation and intellectual property will be unable to thwart the Age of Em, but he seems more confident it will not be stopped than that it will be. This may be an economist’s mistake, where regulation is sort of assumed away as the realm of political science. It’s not unprecedented for weird, inefficient institutions to last far into the future. Intellectual property in the digital age is really weird, all things considered. Software patents especially seem like a way to patent pure logic. But there are others: banking being done with paper checks, daylight saving time, the existence of pennies, and, of course, Arby’s. There are also plenty of examples of new technologies that have evolved much faster than regulation, like supplements, e-commerce, and ride-sharing. It remains to be seen which brain emulations will be.

There is also the possibility that emulated brains won’t be the next big shift in human society. Hanson argues that this shift will rival the agricultural revolution and the industrial revolution. This makes a lot of sense if brain emulation is indeed the next big change. Eliezer Yudkowsky (and Scott) think this is incorrect and that artificial intelligence will beat it. This seems like a real possibility. Scott points out that we often come up with technological equivalents of human biology far before actually emulating biology. This is mostly because biology has accidentally figured things out via evolution and thus is often needlessly complicated. For example, aircraft usually fly via fixed-wing aerodynamics, not by flapping. It seems likely that we will reach human-level problem solving via software rather than via brain scanning. Even if we don’t, it seems likely that software could quickly optimize a simulation built from a preliminary brain scan that was too rough to yield a proper emulation on its own; such software-assisted reconstruction could experiment with neuron simulation and create a brain emulation better designed and more specialized than any faithful human brain emulation.

It also seems possible that other things could happen first that change human history, like very expensive climate change, a crippling pandemic (antibiotic resistance), genetic and epigenetic engineering, and of course some technological revolution we haven’t even imagined (the unknown). Certainly if we assume continued economic growth, either brain emulation, artificial intelligence, or genetic engineering seem like likely candidates to transform humanity. Hanson thinks AI research is really overrated (he used to be an AI researcher) and isn’t progressing very fast. But he was an AI researcher about 25 years ago, and we’ve seen some pretty impressive improvements in machine learning and natural language processing since then. We’ve also seen some improvement in brain emulation technology, to be fair. Genetic engineering was hailed as the next revolution in the 1990s, but has floundered ever since. In the last year, though, the use of CRISPR in genome engineering has dramatically increased the feasibility of actually picking and choosing specific genes. Any of these could drastically change human society. Perhaps any genetic improvements would be overshadowed by brain emulation or AI. I guess it depends on the importance of the physical world vs the digital one.

Of course, not all changes could be from improved technology. There’s a significant risk of a global multi-drug resistant pandemic. Our overuse of antibiotics, the difficulty in making everyone stop overusing them, and our highly integrated world means we’ve created an excellent scenario for a superbug to appear and spread. Anything resembling the 1918 Spanish Flu Epidemic could be devastating to the world population and to economic growth. Climate change poses a similar risk to both life and the economy. If either of these were to happen, it could significantly deter the Age of Em from occurring or at least delay it, along with a lot of the progress of our civilization. And that’s not even mentioning additional freak natural disasters like coronal mass ejections.

Overall, predictions are very difficult and if I had to bet, I’d bet that the next big change in human civilization won’t be emulated brains. A good competitor is definitely artificial superintelligence, but when you add in genetic engineering, natural disasters, drug resistant bacterial epidemics, and so on, you have to take the field over brain emulations.

Nonetheless, this book really does make you think about the world in a different way with a perspective both more global and more forward looking. It even makes you question what it means to be human. The ins and outs of the 2016 election really fade away (despite my continued interest and blogging). Political squabbling doesn’t compare to the historical trends of human civilization and the dawn of transhumanism.

Comment on reddit.