One of the most spectacular advances in modern science has been the rise of machine vision. In just a few years, a new generation of machine learning techniques has changed the way computers see.
Because machine vision systems are so new, little is known about adversarial images. Nobody understands how best to create them, how they fool machine vision systems, or how to protect against this kind of attack.
But a problem is emerging. Machine vision researchers have begun to notice some worrying shortcomings of their new charges. It turns out machine vision algorithms have an Achilles’ heel that allows them to be tricked by images modified in ways that would be trivial for a human to spot.
Today, that starts to change thanks to the work of Kurakin and co, who have begun to study adversarial images systematically for the first time. Their work shows just how vulnerable machine vision systems are to this kind of attack. The team starts with a standard database for machine vision research, known as ImageNet. This is a database of images classified according to what they show. A standard test is to train a machine vision algorithm on part of this database and then test how well it classifies another part of the database.
The performance in these tests is measured by counting how often the algorithm has the correct classification in its top 5 answers or even its top 1 answer (its so-called top 5 accuracy or top 1 accuracy) or how often it does not have the correct answer in its top 5 or top 1 (its top 5 error rate or top 1 error rate). One of the best machine vision systems is Google’s Inception v3 algorithm, which has a top 5 error rate of 3.46 percent. Humans doing the same test have a top 5 error rate of about 5 percent, so Inception v3 really does have superhuman abilities.
Kurakin and co-created a database of adversarial images by modifying 50,000 pictures from ImageNet in three different ways. Their methods exploit the idea that neural networks process information to match an image with a particular classification. The amount of information this requires, called the cross-entropy, is a measure of how hard the matching task is.
Continue reading here.