Artificial General Intelligence and Existential Risk

The purpose of this post is to discuss existential risk, and why artificial intelligence is a particularly important aspect of existential risk to consider. There are other essays about the dangers of artificial intelligence that I will link to throughout and at the end of this post. This essay takes a different approach, one that may appeal to someone who has not seriously considered artificial general intelligence as an issue requiring civilization’s attention. At the very least, I’d like to signal that it should be more socially acceptable to discuss this problem.

First comes a section on how I approached thinking about existential risk. My train of thought is a follow-up to Efficient Advocacy. Also worth reading: Electoral Reform Fantasies.

Background

Political fights, especially the culture war battles that President Trump seems so fond of, are loud, obnoxious, and tend to overshadow more impactful policy debates. For example, abortion is a common, highly discussed political issue, yet there have been almost no major policy changes since the Supreme Court’s decision over 40 years ago. The number of abortions in the US has declined since the 1980s, but that decline seems uncorrelated with any political movements or electoral victories. If different political outcomes don’t produce meaningful differences, and if political effort, labor, and capital are limited, these debates distract from areas where we could impact more people. Trump seems especially good at finding meaningless conflicts to divide people, like NFL players’ actions during the national anthem or tweeting about LaVar Ball’s son being arrested in China.

Theorizing about how to combat this problem, I started making a list of impactful-but-popular (or at least not unpopular) policies that would make up an idealized congressional agenda: nominal GDP futures markets, ending federal prohibition of marijuana, upgrading Social Security Numbers to be more secure, reforming bail. However, there is a big difference between “not unpopular”, “popular”, and “prioritized”. I’m fairly confident nominal GDP futures markets would have a positive effect on Federal Reserve policy, and I can’t think of any political opposition to them, but almost no one is talking about the idea. Marijuana legalization is popular across most voters, but it’s not a priority, especially for this Congress. So what do you focus on? Educating more people about nominal GDP futures markets so they know such a solution exists? Convincing more people to prioritize marijuana legalization?

The nagging problem is that effective altruist groups like GiveWell have taken a research-based approach to identifying the best ways to use our money and time to improve the world. For example, the cost of distributing anti-mosquito bed nets is extremely low, so the average cost of a life saved from malaria is in the thousands of dollars. The result is that we now know our actions have a significant opportunity cost; if a few thousand dollars’ worth of work or donations doesn’t obviously have as good an impact as literally saving someone’s life, we need a really good argument as to why we should do that activity instead of contributing to GiveWell’s top charities.
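
To make this opportunity-cost framing concrete, here is a minimal sketch of the comparison. The cost-per-life figure is a placeholder assumption for illustration, not GiveWell’s actual estimate.

```python
# Toy opportunity-cost comparison. The cost-per-life figure is a placeholder
# assumption, not GiveWell's actual estimate.
COST_PER_LIFE_SAVED_USD = 4000  # assumed cost to save one life via bed nets

def lives_saved_equivalent(donation_usd: float) -> float:
    """How many statistical lives a donation could save at the assumed rate."""
    return donation_usd / COST_PER_LIFE_SAVED_USD

if __name__ == "__main__":
    for amount in (1_000, 10_000, 100_000):
        print(f"${amount:>7,} buys roughly {lives_saved_equivalent(amount):5.1f} lives saved")
```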

One way to make a case for spending money on things other than GiveWell’s top charities is to take a long-term outlook: trying to effect a large change that would impact a large number of people in the future. For example, improving institutions in various developing countries would help those populations become richer. Another approach would be to improve the global economy, which would both allow for more investment in technology and push investment into developing countries looking for returns. Long-term approaches are certainly riskier than direct-impact charities that improve outcomes as soon as possible, but they can’t be abandoned either.

Existential Risk

So what about the extreme long term? What about existential risk? This blog’s philosophy takes consequentialism as a founding principle, and if you’re interested in the preceding questions of which policies are the most helpful and where we should focus our efforts, you’ve already accepted that we should be concerned with the effects of our actions. The worst possible event, from a utilitarian perspective, would be the extinction of the human race: it would not just kill all the humans alive today (making it worse than a catastrophe that kills only half of them), but would also end the potential lives of all of humanity’s descendants, possibly trillions of beings. If we have any concern for the outcomes of our civilization, we must investigate sources of existential risk. Another way to state this is: assume it’s the year 2300, and humans no longer exist in the universe. What was the most likely cause of our destruction?
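
As a rough way to see why the loss of all future potential dominates the arithmetic, here is a toy comparison. Both population figures are placeholder assumptions, not demographic estimates.

```python
# Toy comparison of a catastrophe that kills half of humanity vs. extinction.
# Both population figures below are placeholder assumptions for illustration.
CURRENT_POPULATION = 8e9         # rough number of people alive today
POTENTIAL_FUTURE_PEOPLE = 1e12   # assumed potential future descendants ("trillions")

def lives_lost(kill_fraction: float, ends_future: bool) -> float:
    """Total lives lost; future people count only if humanity ends entirely."""
    lost = kill_fraction * CURRENT_POPULATION
    if ends_future:
        lost += POTENTIAL_FUTURE_PEOPLE
    return lost

half_catastrophe = lives_lost(0.5, ends_future=False)
extinction = lives_lost(1.0, ends_future=True)
print(f"Half of humanity killed: {half_catastrophe:.1e} lives lost")
print(f"Extinction:              {extinction:.1e} lives lost")
print(f"Extinction is ~{extinction / half_catastrophe:.0f}x worse under this accounting")
```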

Wikipedia actually has a very good article on Global Catastrophic Risk, a broad category encompassing events that could seriously harm humanity on a global scale. Existential risks are a strict subset of those: events that could end humanity’s existence permanently. Wikipedia splits them into natural and anthropogenic. First, let’s review the non-anthropogenic risks (natural climate change, megatsunamis, asteroid impacts, cosmic events, volcanism, extraterrestrial invasion, global pandemic) and see whether they qualify as existential.

Natural climate change and megatsunamis do not appear to be existential in nature. A megatsunami would be terrible for everyone living around the affected ocean, but humans on the other side of the Earth would be fine. Humans can also live in a wide variety of climates, and natural climate change would likely be slow enough for some populations to adapt, even if it increased geopolitical tensions.

Previous asteroid impacts have been devastating, notably the Cretaceous–Paleogene extinction event some 66 million years ago. This is a clear existential risk, but it takes a very large asteroid hitting Earth, which is rare. Larger asteroids are also easier to spot from further away, giving humanity more time to do something (push it off course, blow it up, etc.). The odds here are thus pretty low.

Other cosmic events are also low probability. Gamma-ray bursts are devastating, but one would have to occur close by (within a few hundred light-years) and be aimed directly at Earth. Neither of these is likely within the next million years.

Volcanism also has the potential to be pretty bad, perhaps at an existential level (see the Toba catastrophe theory), but eruptions on that scale are also quite rare.

An alien invasion could easily destroy all of humanity. Any species with both the capability to travel across interstellar space and military ambitions would be extremely technologically superior to us. However, we don’t see any evidence of a galactic alien civilization (see Fermi Paradox 1 & 2 and The Great Filter). Additionally, solving this problem seems somewhat intractable: on a cosmic timescale, an alien civilization that arose before our own would likely have preceded us by millennia, meaning the technology gap between us and them would be hopelessly and permanently large.

A global pandemic seems pretty bad, and certainly much more likely in the short term than anything else we’ve covered. The risk is exacerbated by human actions creating a more interconnected globe. However, it is counterbalanced by the facts that no previous pandemic has ever been 100% lethal and that modern medicine is much better than it was during the Black Plague. This is a big risk, but it may not be existential. Definitely on our shortlist of things to worry about, though.

Let’s talk about anthropogenic risks next: nuclear war, conventional war, anthropogenic climate change, agricultural crises, mineral exhaustion, artificial intelligence, nanotechnology, biotechnology.

A common worry is nuclear war. A massive nuclear exchange seems somewhat unlikely today, even if a regional disagreement on the Korean Peninsula goes poorly in the worst possible way. It’s not common knowledge, but the “nuclear winter” scenario remains scientifically controversial, and I am unconvinced that it poses a serious existential threat, although a nuclear exchange would clearly kill millions. Conventional war is also out, as it seems strictly less dangerous than nuclear war.

For similar reasons to nuclear winter, I’m not particularly worried about global warming in purely existential terms. Global warming may be very expensive and may cause widespread weather, climate, and ecological problems, but I don’t believe it will wipe out humanity entirely. I am open to corrections on this.

Agricultural crises and mineral exhaustion also seem catastrophic but not existential. They would cause economic crises, but economic crises by definition require humans to exist; if such a crisis reduced the population, the remaining resources would stretch further and the crisis would ease rather than compound.

The remaining issues are largely technological in nature: artificial intelligence, biotechnology, nanotechnology, or technical experiments gone wrong (like the fear that the first nuclear test might set the atmosphere on fire). These all seem fairly concerning.

Technological Existential Risk

Concern arises because technological progress means the likelihood that we will have these technologies grows over time, and, once they exist, we would expect their cost to decrease. Additionally, unlike most of the other risks listed here, these could wipe out humanity permanently. For example, a bioengineered virus could be far more deadly than anything occurring naturally, possibly leaving no survivors at all. The cost of DNA technology has steadily dropped, so over time we should expect the number of organizations or people with the knowledge and funding to engineer deadly pathogens to increase. The more people who have this ability, the more likely it is that someone makes a mistake and releases a virus that kills everyone. An additional issue is that military research teams are quite likely researching bioweapons such as engineered pathogens right now. The incentives driving research into weapons like these are unlikely to change even as DNA engineering improves, so the risk should grow over time.
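
A toy calculation shows how quickly “more capable actors” translates into “someone eventually slips up”. The per-actor accident probability here is an arbitrary assumption for illustration, not an estimate from the biosecurity literature.

```python
# Toy model of "more capable actors -> higher chance someone slips up".
# The per-actor accident probability is an arbitrary assumption for illustration.
P_ACCIDENT_PER_ACTOR_PER_YEAR = 0.001  # assumed 0.1% chance per actor per year

def p_at_least_one_accident(num_actors: int, years: int = 1) -> float:
    """Probability of at least one catastrophic release, assuming independent actors."""
    p_no_accident = (1 - P_ACCIDENT_PER_ACTOR_PER_YEAR) ** (num_actors * years)
    return 1 - p_no_accident

for n in (1, 10, 100, 1000):
    print(f"{n:>5} capable actors -> "
          f"{p_at_least_one_accident(n, years=10):.1%} chance of a release over a decade")
```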

Nanotechnology also has the potential to end all life on the planet, especially under a so-called “grey goo” scenario, in which self-replicating nanobots transform all the matter on Earth. This has a lot of similarities to an engineered pathogen, except that the odds of any human developing immunity no longer matter, and all non-human life (indeed, all matter on Earth) is also forfeit, not just the humans. As with biotechnology threats, we don’t have this technology yet, but it is an active area of research. We would also expect this risk to grow over time.

Artificial General Intelligence

Finally, artificial general intelligence raises some of the same issues as the other technological risks: as technology advances, we have a higher chance of creating it; the more people who can create it, the more dangerous it is; and once it is created, it could be deadly.

This post isn’t a thesis on why AI is or isn’t going to kill all humans. We assumed we were looking exclusively at existential risk to humanity in the near future; given that assumption, the question is why AI is more likely to end humanity than anything else. Nonetheless, there are lingering questions as to whether AI is an actual, “real” threat to humanity or just an unrealistic sci-fi trope. I will outline three basic objections to AI being dangerous, along with three basic counterarguments.

The first objection is that AI itself will not be dangerous because it will be too stupid. Related points are that AI is too hard to create, or that we can just unplug it if its values differ from ours. The counterargument is that experts disagree on exactly when we can create human-level AI, but most agree that it’s plausible in the next hundred or couple hundred years (AI Timelines). It’s also true that we’ve seen AI’s ability to solve more general and more complex problems improve over time: AlphaZero learned to play both Go and chess better than any human without changes to its base code; YouTube uses algorithms to decide what content to recommend and what content to remove ads from, scanning through thousands of hours of video every minute; and Google’s Pixel phone creates software-based portrait photos via machine learning rather than needing multiple lenses. We should expect this trend to continue, just as with other technologies.

However, the difference between AI and other technological global risks is that machine learning optimization algorithms could eventually be applied to machine learning itself. This is the concept of an “intelligence explosion”, where an AI uses its intelligence to design and create successively better versions of itself. Thus, it’s not just that an organization might make a dangerous technological breakthrough, like an engineered virus; once the breakthrough occurs, the AI could rapidly become uncontrollable and vastly more intelligent than us. The analogy is that a mouse isn’t just less smart than a human; it literally doesn’t comprehend that its environment can be so thoroughly manipulated by humans that entire species depend on human actions (e.g. conservation programs, rules against overhunting) for their survival.
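
The dynamics behind this worry can be sketched with a deliberately cartoonish model: compare a system whose capability improves at a fixed rate from outside effort with one whose per-step improvement scales with its own current capability. The growth rates and the “human-level” baseline below are arbitrary assumptions chosen only to show the qualitative difference.

```python
# Cartoon model of an "intelligence explosion": externally-driven progress vs.
# improvement that feeds back on itself. All rates and baselines are arbitrary
# assumptions chosen only to illustrate the qualitative difference.

def external_progress(capability: float, steps: int, gain: float = 0.05) -> float:
    """Capability improves by a fixed 5% per step from outside (human) effort."""
    for _ in range(steps):
        capability *= 1 + gain
    return capability

def recursive_improvement(capability: float, steps: int, scale: float = 0.05) -> float:
    """Each step's improvement is proportional to current capability, so gains compound."""
    for _ in range(steps):
        capability *= 1 + scale * capability
    return capability

baseline = 1.0  # "human-level", by assumption
for steps in (10, 20, 30):
    print(f"after {steps:>2} steps: external x{external_progress(baseline, steps):.1f}, "
          f"recursive x{recursive_improvement(baseline, steps):.3g}")
```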

Another objection is that if an AI is actually as intelligent as we fear, it wouldn’t make “stupid” mistakes like destroying all of humanity or consuming the planet’s resources, because that wouldn’t count as “intelligent”. The counterpoint is the Orthogonality Thesis, which states that an AI can have any goal: intelligence and goals are orthogonal, independent dimensions. Moreover, an AI’s goal does not have to explicitly target humans (e.g. “kill all the humans”) to cause us harm. For example, a goal to calculate digits of pi or to prove the Riemann Hypothesis might call for as much computing power as possible. In pursuit of that goal, a superintelligence would determine that it must manufacture more computing equipment and maximize the energy supplied to it. Humans use energy and are made of matter, so as a step toward its goal it would likely exterminate humanity and convert whatever matter it could into computing equipment. Being superintelligent, it would succeed.
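
Here is a minimal sketch of that point, with made-up plans and scores: the objective below never mentions humans at all, yet the most harmful plan wins simply because harm is not part of the score.

```python
# Toy illustration of the Orthogonality Thesis: a competent optimizer pursuing a
# goal that never mentions humans. The plans and numbers are made up for illustration.

# Each candidate plan: (description, compute gained, harm caused as a side effect)
PLANS = [
    ("use only spare datacenter capacity",           1,      0),
    ("buy up the global chip supply",                50,     2),
    ("convert all available matter into processors", 10_000, 10),
]

def objective(plan: tuple) -> int:
    """The goal is solely 'maximize compute'; side-effect harm is never consulted."""
    _, compute, _ = plan
    return compute

best = max(PLANS, key=objective)
print("Chosen plan:", best[0])
print("Side-effect harm (invisible to the objective):", best[2])
```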

A final objection is that even if experts believe human-level AI will happen in the next 100 years, if not sooner, there is nothing to be done about it today, or it is a waste of time to work on the problem now. This is also known as the “worrying about overpopulation on Mars” objection, comparing the worry about AI to something that is several scientific advancements away. Scott Alexander has an entire blog post on this subject, which I recommend checking out. The basic summary is that AI advancement and AI alignment research are somewhat independent, and we really need to learn how to properly align AI values before we get human-level AI.

We have a lot of theoretical philosophy that we need to figure out how to impart to a computer: things like how humans actually make decisions, or how to weigh different moral tradeoffs. This could be extraordinarily complicated, as an extremely capable optimization algorithm could misinterpret almost everything we say if it did not already share our values for human life, health, and general brain state. Computer scientists set out to teach computers to understand natural human language some 60 years ago, and we still haven’t quite nailed it. If imparting philosophical truths is similarly difficult, there is plenty of work to be done today.

Artificial intelligence could also advance from human level to well beyond human very quickly: the best human Go player lost to an AI (AlphaGo) in 2016, and a year later AlphaGo lost to a new version, AlphaGo Zero, 100 games to none. It would thus not be surprising if a general intelligence achieved superhuman status within a year of achieving human-comparable status, or sooner. There’s no fire alarm for artificial general intelligence. We need to be working on these problems as soon as possible.

I’d argue, then, that of all the scenarios listed here, a misaligned AI is the most likely to actually destroy all of humanity, as a result of the Orthogonality Thesis. I also think that, unlike many of the other scenarios, human-level AI will exist relatively soon compared to the timescales of asteroids and volcanism (see AI Timelines; estimates are highly variable, anywhere from 10 to 200 years). There is also a wealth of work to be done on AI value alignment. Correctly aligning future AI with goals compatible with human values is thus one of the most important challenges facing our civilization over the next hundred years or so, and probably the most important existential threat we face.

The good news is that there are some places doing this work, notably the Machine Intelligence Research Institute, OpenAI, and the Future of Humanity Institute. The bad news is that despite the importance of this issue, there is very little in the way of conversation, money, or advocacy. Total spending on AI safety research is hard to calculate, as some of it is likely done inside private software companies, but it is optimistically on the order of tens of millions of dollars a year. By comparison, the U.S. Transportation Security Administration, which failed to find 95% of test weapons in a recent audit, costs $7.5 billion a year.

Further Reading

I have focused this essay on conveying the mindset of thinking about existential risk generally and why AI is specifically worrying in that context. I’ve also tried to keep it short. The following are further resources on why artificial general intelligence is worth worrying about in a broader context, arranged by length. If you feel this piece did not go deep enough on whether AI itself is worth being concerned about, I would urge you to read one of the more in-depth essays here, which focus on that question directly.



Leave a comment on the official reddit thread.