Markov Chain Based Efficient Defense Against Adversarial Examples in Computer Vision

Abstract

Adversarial examples are inputs to machine learning models that cause erroneous outputs; they are usually generated from normal inputs through subtle modifications and appear unchanged to human observers. They pose a serious threat to machine learning applications, especially in security-critical domains. Unfortunately, despite growing attention and discussion, there is neither a definitive explanation of their causes nor a universally effective defense. Based on the statistical properties of Markov chains, this paper proposes an effective defense method that exploits the differences in the probability distributions of adjacent pixels between normal images and adversarial examples. Specifically, the concept of an overall probability value (OPV) is defined to estimate the modification applied to an input, which can be used to preliminarily determine whether the input is an adversarial example. Furthermore, by calculating the OPV of an input and modifying its pixel values to destroy potential adversarial characteristics, the proposed method can efficiently purify adversarial examples. A series of experiments demonstrates the effectiveness of the defense: against various attacks, it achieves accuracy above 92% on MNIST and above 70% on ImageNet.
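To make the core idea concrete, the sketch below shows one plausible way an adjacent-pixel Markov-chain statistic could be computed. The abstract does not give the authors' exact OPV formula, so the transition-matrix estimator, the horizontal-neighbor choice, the log-probability averaging, and all function names here are illustrative assumptions rather than the paper's method.

```python
# Illustrative sketch only: the paper defines an "overall probability value" (OPV)
# from a Markov-chain model of adjacent pixels; the estimator, neighborhood, and
# scoring below are assumptions, not the authors' exact formulation.
import numpy as np

def fit_transition_matrix(images, levels=256):
    """Estimate P(next_pixel | current_pixel) for horizontally adjacent pixels
    from a stack of clean grayscale images with integer values in [0, levels)."""
    counts = np.zeros((levels, levels), dtype=np.float64)
    for img in images:
        left = img[:, :-1].ravel()
        right = img[:, 1:].ravel()
        np.add.at(counts, (left, right), 1.0)   # accumulate transition counts
    counts += 1e-6                              # smoothing avoids log(0) later
    return counts / counts.sum(axis=1, keepdims=True)

def opv(img, trans):
    """Average log-probability of an image's adjacent-pixel transitions:
    heavily perturbed images tend to contain rare transitions and score lower."""
    left = img[:, :-1].ravel()
    right = img[:, 1:].ravel()
    return np.log(trans[left, right]).mean()

# Toy usage: a smooth image scores higher than a noise-perturbed copy of it.
rng = np.random.default_rng(0)
clean = [np.clip(np.cumsum(rng.integers(-2, 3, (28, 28)), axis=1), 0, 255).astype(np.int64)
         for _ in range(200)]
trans = fit_transition_matrix(clean)
test = clean[0]
noisy = np.clip(test + rng.integers(-40, 41, test.shape), 0, 255).astype(np.int64)
print(opv(test, trans), opv(noisy, trans))      # perturbed image yields a lower OPV
```

In this reading, an input whose OPV falls below a threshold learned from clean data would be flagged as a likely adversarial example; the paper's purification step then modifies pixel values to raise the statistic back toward the clean distribution, though the precise detection threshold and purification rule are not specified in the abstract.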

DOI
10.1109/ACCESS.2018.2889409
Year