Generative AI allows people to produce piles upon piles of images and words very quickly. It would be nice if there were some way to reliably distinguish AI-generated content from human-generated content. It would help people avoid endlessly arguing with bots online, or believing what a fake image purports to show.
One common proposal is that big companies should incorporate watermarks into the outputs of their AIs. For instance, this could involve taking an image and subtly changing many pixels in a way that’s undetectable to the eye but detectable to a computer program. Or it could involve swapping words for synonyms in a predictable way so that the meaning is unchanged, but a program could readily determine the text was generated by an AI.
Unfortunately, watermarking schemes are unlikely to work. So far most have proven easy to remove, and it’s likely that future schemes will have similar problems.
A useful watermark for AI images would need two properties: it would have to survive the ordinary edits an image goes through, such as cropping, resizing, and compression, and it would have to survive deliberate attempts to remove it.
One simple technique is to manipulate the least perceptible bits of an image, for instance the lowest-order bit of each pixel value. A mark like that is invisible to the eye, but it is also fragile: routine operations such as compression or resizing scramble exactly the bits that carry it.
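To make that concrete, here is a minimal sketch of least-significant-bit embedding, assuming NumPy and Pillow are available; the payload, file names, and format choices are illustrative stand-ins, not any vendor's actual scheme.

```python
# Toy least-significant-bit (LSB) watermark: hide payload bits in the lowest
# bit of each pixel channel, where the change is invisible to the eye.
import numpy as np
from PIL import Image

def embed_lsb(in_path: str, payload: bytes, out_path: str) -> None:
    pixels = np.array(Image.open(in_path).convert("RGB"), dtype=np.uint8)
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    flat = pixels.reshape(-1)
    if bits.size > flat.size:
        raise ValueError("payload too large for this image")
    flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits   # overwrite the lowest bit
    Image.fromarray(flat.reshape(pixels.shape)).save(out_path, format="PNG")

def extract_lsb(in_path: str, num_bytes: int) -> bytes:
    flat = np.array(Image.open(in_path).convert("RGB"), dtype=np.uint8).reshape(-1)
    return np.packbits(flat[: num_bytes * 8] & 1).tobytes()
```

A single lossy re-save, something an ordinary user might do without any intention of stripping the mark, is enough to destroy it:

```python
embed_lsb("photo.png", b"made-by-model-X", "marked.png")
Image.open("marked.png").save("laundered.jpg", quality=90)   # routine JPEG re-save

print(extract_lsb("marked.png", 15))      # b'made-by-model-X'
print(extract_lsb("laundered.jpg", 15))   # garbled bytes: the mark did not survive
```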
There are more sophisticated watermarking proposals that are robust to a wider variety of common edits. But watermarks for AI content face a tougher challenge: they must also survive an adversary who knows about the watermark and wants to eliminate it.
Coming at the problem from the opposite direction, some companies are working on ways to prove that an image came from a camera (“content authenticity”). Rather than marking AI-generated images, they add metadata to camera-generated images and use cryptographic signatures to prove the metadata is genuine.
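The core of that approach can be sketched with an ordinary digital signature. The manifest layout below is made up for illustration, and the pyca/cryptography library is assumed; real provenance systems define their own, richer formats.

```python
# Sketch of camera-side content authenticity: sign a hash of the image bytes
# plus capture metadata with a key held by the device, so anyone with the
# matching public key can later check that neither has been altered.
import hashlib, json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

device_key = Ed25519PrivateKey.generate()   # in practice, kept in the camera's secure hardware
public_key = device_key.public_key()

def _manifest(image_bytes: bytes, metadata: dict) -> bytes:
    return json.dumps({"sha256": hashlib.sha256(image_bytes).hexdigest(),
                       "meta": metadata}, sort_keys=True).encode()

def sign_capture(image_bytes: bytes, metadata: dict) -> bytes:
    return device_key.sign(_manifest(image_bytes, metadata))

def verify_capture(image_bytes: bytes, metadata: dict, signature: bytes) -> bool:
    try:
        public_key.verify(signature, _manifest(image_bytes, metadata))
        return True
    except InvalidSignature:
        return False
```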
The watermarking problem is even harder for text-based generative AI. Similar techniques can be devised. For instance, an AI could boost the probability of certain words, giving itself a subtle textual style that would go unnoticed most of the time, but could be recognized by a program with access to the list of words.
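Schemes along these lines can be caricatured in a few lines of code. The secret key, the word-splitting rule, and the fifty-fifty split below are all stand-ins chosen for illustration, not any published system.

```python
# Toy version of word-boosting: a secret key pseudorandomly marks roughly half
# the vocabulary as "favored". A generator that nudges favored words up leaves
# a statistical trace; the detector measures drift above the 50% baseline.
import hashlib

SECRET_KEY = b"not-a-real-key"   # placeholder

def is_favored(word: str) -> bool:
    digest = hashlib.sha256(SECRET_KEY + word.lower().encode()).digest()
    return digest[0] % 2 == 0

def favored_fraction(text: str) -> float:
    words = [w.strip(".,!?") for w in text.split()]
    words = [w for w in words if w]
    return sum(is_favored(w) for w in words) / max(len(words), 1)

# Human-written text should land near 0.5; a generator that consistently
# prefers favored words pushes this fraction up, detectably so for longer text.
```

Published proposals typically operate on the model's token probabilities and apply a statistical test to the count of favored tokens, but the basic point is the same: detection depends on knowing the secret list.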
Any watermark based on word choice is likely to be defeated by some amount of rewording. That rewording could even be performed by another AI, perhaps one less sophisticated than the model that generated the original text, and one that is not subject to a watermarking requirement.
There’s also the problem of whether the tools to detect watermarked text are publicly available or kept secret. Making detection tools publicly available gives an advantage to those who want to remove the watermark, because they can repeatedly edit their text or image until the detection tool gives the all clear. But keeping the tools secret means most people have no way to check content themselves, which defeats much of the purpose.
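In code, the advantage a public detector hands to the remover is just a loop. Both functions below are hypothetical stand-ins for whatever detection tool and editing step are available.

```python
# Oracle attack: if the detector is public, keep making small edits until it
# reports the content as clean. detect_watermark() and perturb() are
# hypothetical placeholders, not real tools.
def strip_watermark(content, detect_watermark, perturb, max_rounds=1000):
    for round_number in range(max_rounds):
        if not detect_watermark(content):
            return content          # the detector gives the all clear
        content = perturb(content, strength=round_number / max_rounds)
    return None                     # give up and fall back to heavier editing
```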
Lastly, if AI watermarking is meant to prevent disinformation campaigns sponsored by states, it’s important to keep in mind that those states can readily develop their own modern generative AI, free of any watermarking requirement, and probably will in the near future.