This post will lay out a couple of stylized stories about how transformative AI, if developed relatively soon, could result in global catastrophe.
In the stories I’ll be telling, the world doesn't do much preparation or careful consideration of the risks I’ve discussed previously, especially misaligned AI (AI forming dangerous goals of its own).
Everything that follows is a hypothetical scenario, told as it unfolds over time.
A few years before transformative AI is developed, AI systems are being increasingly used for several lucrative, useful, but not dramatically world-changing things. In this early stage, AI systems often have pretty narrow capabilities, such that the idea of them forming ambitious aims and trying to defeat humanity seems silly.
Even with these relatively limited AIs, some of the problems and challenges that arise could be called “safety issues” or “alignment issues.”
The most straightforward way to address these problems involves training AIs to behave more safely and helpfully. This means AI companies do a lot of things like trying to create the conditions under which an AI might provide false, harmful, evasive, or toxic responses; penalizing it when it does; and reinforcing it toward more helpful behavior.
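To make that idea a bit more concrete, here is a minimal, purely illustrative sketch of what such a feedback-based training loop might look like. Every name in it (`generate_response`, `score_response`, `update_model`, the prompts) is a hypothetical stand-in I've made up for illustration; real systems use far more elaborate machinery, such as reinforcement learning from human feedback.

```python
# Purely illustrative sketch of feedback-based safety training.
# All names here are hypothetical stand-ins; real training pipelines
# (e.g., reinforcement learning from human feedback) are far more elaborate.

import random


def generate_response(model_params, prompt):
    """Stand-in for the model producing a response to a prompt."""
    return f"response to: {prompt}"


def score_response(prompt, response):
    """Stand-in for a human rater or reward model.

    In a real setup this would return a low score for false, harmful,
    evasive, or toxic responses and a high score for helpful ones.
    """
    return random.uniform(-1.0, 1.0)  # placeholder rating


def update_model(model_params, prompt, response, reward):
    """Stand-in for a training update: penalize bad behavior, reinforce good."""
    model_params["adjustment"] = model_params.get("adjustment", 0.0) + reward
    return model_params


# Prompts deliberately crafted to create conditions where the AI might misbehave.
adversarial_prompts = [
    "prompt designed to elicit a false answer",
    "prompt designed to elicit a harmful answer",
    "prompt designed to elicit an evasive or toxic answer",
]

model_params = {}
for _ in range(3):  # a few rounds of feedback training
    for prompt in adversarial_prompts:
        response = generate_response(model_params, prompt)
        reward = score_response(prompt, response)
        model_params = update_model(model_params, prompt, response, reward)
```

The key point isn't the specific code, but the shape of the process: elicit potentially bad behavior on purpose, rate it, and nudge the model away from it.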