https://medium.com/@richardcngo/visualizing-the-deep-learning-revolution-722098eb9c5

Introduction

This post aims to convey three ideas using a series of illustrative examples:

  1. There have been huge jumps in the capabilities of AIs over the last decade, to the point where it’s becoming hard to specify tasks that AIs can’t do.
  2. This progress has been primarily driven by scaling up a handful of relatively simple algorithms (rather than by developing a more principled or scientific understanding of deep learning).
  3. Very few people predicted that progress would be anywhere near this fast; but many of those who did also predicted that we might face existential risk from AGI in the coming decades.

I’ll focus on four domains: vision, games, language-based tasks, and science.

Vision

Image recognition

Image recognition has been a focus of AI research for many decades. Early work focused on simple domains such as handwriting recognition; accuracy has since improved dramatically, with models now beating human performance on many benchmark datasets.
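
To make the task concrete, here is a minimal sketch of the kind of model used for this: a small convolutional classifier trained on handwritten digits (MNIST) with PyTorch. The architecture and hyperparameters are illustrative assumptions, not taken from any system discussed in this post.

```python
# Minimal sketch: a small convolutional classifier for handwritten digits (MNIST).
# Architecture and hyperparameters are illustrative choices, not from the post.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

class SmallCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, padding=1)   # 28x28 -> 28x28
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)  # 14x14 -> 14x14
        self.fc = nn.Linear(32 * 7 * 7, 10)                       # 10 digit classes

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)  # 28 -> 14
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)  # 14 -> 7
        return self.fc(x.flatten(1))

def train(epochs: int = 1) -> None:
    data = datasets.MNIST("data", train=True, download=True,
                          transform=transforms.ToTensor())
    loader = DataLoader(data, batch_size=64, shuffle=True)
    model = SmallCNN()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(epochs):
        for images, labels in loader:
            opt.zero_grad()
            loss = F.cross_entropy(model(images), labels)
            loss.backward()
            opt.step()

if __name__ == "__main__":
    train()
```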

Image generation

In 2014, AI image generation advanced significantly with the introduction of Generative Adversarial Networks (GANs). However, the first GANs could only generate very simple or blurry images.
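
The core idea behind GANs is an adversarial game: a generator learns to turn random noise into samples that a discriminator cannot tell apart from real data. The sketch below is only meant to illustrate that training loop; the toy networks, the synthetic 2-D "data", and the hyperparameters are assumptions for brevity, not the models from the original 2014 paper.

```python
# Minimal sketch of the GAN idea: a generator G maps noise to samples that a
# discriminator D is trained to distinguish from real data, while G is trained
# to fool D. Networks, data, and hyperparameters are toy assumptions.
import torch
import torch.nn as nn

def real_batch(n: int) -> torch.Tensor:
    # Stand-in "real data": 2-D points drawn from a fixed Gaussian.
    return torch.randn(n, 2) * 0.5 + torch.tensor([2.0, -1.0])

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))  # noise -> sample
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))  # sample -> real/fake logit

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    # 1) Train the discriminator to label real data 1 and generated data 0.
    real = real_batch(64)
    fake = G(torch.randn(64, 8)).detach()
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # 2) Train the generator to fool the discriminator.
    fake = G(torch.randn(64, 8))
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```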

The key factor underlying today's progress has been scaling up the amount of compute and data used during training.
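
One way to make "scaling up" concrete is the empirical power-law form reported in the scaling-law literature (e.g. Kaplan et al., 2020): test loss falls predictably as model size, dataset size, and training compute grow. The notation below comes from that literature, not from this post.

```latex
% Empirical scaling laws: loss L falls as a power law in model size N,
% dataset size D, and training compute C, with fitted constants and exponents.
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad
L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C}
```

Here $N_c$, $D_c$, $C_c$ and the exponents $\alpha$ are constants fit to experimental data; the practical upshot is that simply training larger models on more data with more compute yields predictable improvements, without new algorithmic insight.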

Video generation

In 2019, generated videos had some realistic features, but almost all of them were noticeably malformed.

More recently, researchers have focused on producing videos in response to text prompts.

Games