This post will lay out a couple of stylized stories about how transformative AI, if developed relatively soon, could result in global catastrophe.
In the stories I’ll be telling, the world doesn't do much preparation or careful consideration of the risks I’ve discussed previously, especially misaligned AI (AI forming dangerous goals of its own).
Everything that follows is a hypothetical scenario, told as it unfolds over time.
A few years before transformative AI is developed, AI systems are being increasingly used for several lucrative, useful, but not dramatically world-changing things. In this early stage, AI systems often have pretty narrow capabilities, such that the idea of them forming ambitious aims and trying to defeat humanity seems silly.
Even with these relatively limited AIs, some of the problems and challenges that arise could be called “safety issues” or “alignment issues.”
The most straightforward way to address these problems involves training AIs to behave more safely and helpfully. This means AI companies do a lot of things like trying to create the conditions under which an AI might provide false, harmful, evasive, or toxic responses; penalizing it when it does; and reinforcing it toward more helpful behavior.
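To make that idea a bit more concrete, here is a minimal, purely illustrative sketch of what such a feedback-based training loop might look like. Every name in it (`generate_response`, `score_response`, `update_model`, the prompts) is a hypothetical stand-in I've made up for illustration; real systems use far more elaborate machinery, such as reinforcement learning from human feedback.

```python
# Purely illustrative sketch of feedback-based safety training.
# All names here are hypothetical stand-ins; real training pipelines
# (e.g., reinforcement learning from human feedback) are far more elaborate.

import random


def generate_response(model_params, prompt):
    """Stand-in for the model producing a response to a prompt."""
    return f"response to: {prompt}"


def score_response(prompt, response):
    """Stand-in for a human rater or reward model.

    In a real setup this would return a low score for false, harmful,
    evasive, or toxic responses and a high score for helpful ones.
    """
    return random.uniform(-1.0, 1.0)  # placeholder rating


def update_model(model_params, prompt, response, reward):
    """Stand-in for a training update: penalize bad behavior, reinforce good."""
    model_params["adjustment"] = model_params.get("adjustment", 0.0) + reward
    return model_params


# Prompts deliberately crafted to create conditions where the AI might misbehave.
adversarial_prompts = [
    "prompt designed to elicit a false answer",
    "prompt designed to elicit a harmful answer",
    "prompt designed to elicit an evasive or toxic answer",
]

model_params = {}
for _ in range(3):  # a few rounds of feedback training
    for prompt in adversarial_prompts:
        response = generate_response(model_params, prompt)
        reward = score_response(prompt, response)
        model_params = update_model(model_params, prompt, response, reward)
```

The key point isn't the specific code, but the shape of the process: elicit potentially bad behavior on purpose, rate it, and nudge the model away from it.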