Adversarial Machine Learning ≠ Generative Adversarial Networks

GANs → A class of models trained as a non-cooperative game: a generator and a discriminator compete with each other, and this adversarial pressure makes both networks stronger.

Adversarial ML → A set of techniques that try to fool trained models by feeding them deliberately deceptive inputs.

How does it work?

The attack perturbs the input so as to fool the model by increasing its loss: instead of gradient descent on the parameters, it performs gradient ascent on the input, stepping in the direction of the sign of the loss gradient with respect to the input. Doing this in a single step is called the Fast Gradient Sign Method (FGSM): x_adv = x + ε · sign(∇ₓ L(θ, x, y)).
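The step above can be sketched on a toy model. This is a minimal illustration, not a real attack pipeline: it assumes a hand-written logistic-regression classifier (weights `w`, bias `b`), for which the gradient of the cross-entropy loss with respect to the input is `(p - y) * w`.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fgsm(x, y, w, b, eps):
    """One FGSM step against a toy logistic-regression model.

    The gradient of the cross-entropy loss w.r.t. the INPUT x is
    (p - y) * w, so we nudge x by eps in the sign of that gradient,
    which increases the loss (gradient ascent on the input).
    """
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]

# Toy model that classifies x confidently before the attack
w, b = [2.0, -1.0], 0.0
x, y = [1.0, 0.5], 1                  # true class is 1
x_adv = fgsm(x, y, w, b, eps=0.5)
p_adv = sigmoid(sum(wi * xi for wi, xi in zip(w, x_adv)) + b)
```

After the step, the model's confidence in the true class drops (here from about 0.82 down to 0.5), even though each input coordinate moved by only ±ε.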

Untargeted vs Targeted Attacks

Untargeted → The objective is to fool the model into giving any wrong answer.

Targeted → The objective is to fool the model into giving a specific wrong answer chosen by the attacker.
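The two variants differ only in which gradient is used and in which direction the input moves. A generic sketch, assuming the caller supplies the loss gradient with respect to the input (e.g. from the FGSM computation):

```python
def _sign(g):
    # sign of a scalar gradient component: -1, 0, or +1
    return (g > 0) - (g < 0)

def untargeted_step(x, grad_true_label, eps):
    """Ascend the loss on the TRUE label: any misclassification counts."""
    return [xi + eps * _sign(gi) for xi, gi in zip(x, grad_true_label)]

def targeted_step(x, grad_target_label, eps):
    """Descend the loss on a chosen TARGET label: steer the model there."""
    return [xi - eps * _sign(gi) for xi, gi in zip(x, grad_target_label)]
```

Note the flipped sign: the untargeted attack moves *up* the loss surface for the true label, while the targeted attack moves *down* the loss surface for the attacker's chosen label.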

White box vs Black box Attacks

White box → We have access to the model's parameters, and therefore to its gradients. FGSM attacks are an example.

Black box → We don't have access to the model's internals; we can only query it and observe its outputs.

How to defend against these attacks?

Adversarial Training

We include adversarial examples in the training data, so the model learns to classify them correctly. This is a heuristic defense with no formal guarantee of robustness.
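The training loop can be sketched abstractly. This is a sketch under loose assumptions: `model_step(x, y)` is a hypothetical callback that performs one parameter update on example `(x, y)`, and `attack(x, y, eps)` is any attack (e.g. FGSM) crafted against the current model.

```python
def adversarial_training_epoch(model_step, attack, data, eps):
    """One epoch of adversarial training (sketch).

    For every example we do two updates: one on the clean input and
    one on an adversarially perturbed input, both with the TRUE label,
    so the model learns to resist the perturbation.
    """
    for x, y in data:
        model_step(x, y)                   # clean example
        model_step(attack(x, y, eps), y)   # adversarial example
```

In practice the adversarial examples must be regenerated each epoch (against the current parameters), otherwise the model only learns to resist stale perturbations.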