On the Strengths of Pure Evolutionary Algorithms in Generating Adversarial Examples


Deep learning (DL) models are known to be highly accurate, yet vulnerable to adversarial examples. While earlier research focused on generating adversarial examples using white-box strategies, later research shifted to black-box strategies, as models are often not accessible to external attackers. Prior studies showed that black-box approaches based on approximate gradient descent algorithms combined with meta-heuristic search (i.e., the BMI-FGSM algorithm) outperform previously proposed white- and black-box strategies. In this paper, we propose a novel black-box approach based purely on differential evolution (DE), i.e., without using any gradient approximation method. In particular, we propose two variants of a customized DE with tailored variation operators: (1) a single-objective variant (Pixel-SOO) that generates attacks that fool DL models, and (2) a multi-objective variant (Pixel-MOO) that also minimizes the number of changes in the generated attacks. Our preliminary study on five canonical image classification models shows that Pixel-SOO and Pixel-MOO are more effective than the state-of-the-art BMI-FGSM in generating adversarial attacks. Furthermore, Pixel-SOO is faster than Pixel-MOO, while the latter produces subtler attacks than its single-objective counterpart.
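To make the idea of a gradient-free, purely evolutionary attack concrete, the following is a minimal sketch of a single-objective black-box attack using textbook DE (the rand/1/bin scheme). It is not the Pixel-SOO algorithm: the customized variation operators are not described here, so standard DE mutation and crossover are used in their place. The names toy_model and de_attack, the linear softmax classifier standing in for a DL model, and all parameter values (eps, pop_size, generations, F, CR) are assumptions made for illustration only. The attack queries the model solely through its output probabilities and tries to lower the confidence of the original label.

```python
import numpy as np

def toy_model(weights):
    """Hypothetical stand-in for a DL classifier: image -> class probabilities."""
    def predict_proba(image):
        logits = weights @ image.ravel()
        exp = np.exp(logits - logits.max())  # numerically stable softmax
        return exp / exp.sum()
    return predict_proba

def de_attack(predict_proba, image, true_label, eps=0.1, pop_size=20,
              generations=50, F=0.5, CR=0.9, seed=0):
    """Textbook DE/rand/1/bin search for a bounded perturbation.

    Fitness (to minimize) is the model's confidence in the original label.
    Only model outputs are used; no gradient is computed or approximated.
    """
    rng = np.random.default_rng(seed)
    dim = image.size

    def apply(delta):
        # Keep the perturbed image inside the valid pixel range [0, 1].
        return np.clip(image.ravel() + delta, 0.0, 1.0).reshape(image.shape)

    def fitness(delta):
        return predict_proba(apply(delta))[true_label]

    # Each individual is a perturbation vector with entries in [-eps, eps].
    pop = rng.uniform(-eps, eps, size=(pop_size, dim))
    pop[0] = 0.0  # seed with "no change" so the result is never worse than the input
    scores = np.array([fitness(ind) for ind in pop])

    for _ in range(generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            # Mutation: base vector plus scaled difference, clipped to the budget.
            mutant = np.clip(a + F * (b - c), -eps, eps)
            # Binomial crossover; force at least one gene from the mutant.
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            # Greedy selection: keep the trial only if it is no worse.
            trial_score = fitness(trial)
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score

    best = int(np.argmin(scores))
    return apply(pop[best]), float(scores[best])

# Usage on a random 8x8 "image" and a random 3-class linear model.
rng = np.random.default_rng(42)
predict = toy_model(rng.normal(size=(3, 64)))
image = rng.random((8, 8))
label = int(np.argmax(predict(image)))
original_conf = float(predict(image)[label])
adversarial, attacked_conf = de_attack(predict, image, label)
fooled = int(np.argmax(predict(adversarial))) != label
```

The attack counts as successful when fooled is True, i.e., the predicted label changes. A multi-objective variant in the spirit of Pixel-MOO would additionally minimize the number of modified pixels, which calls for a Pareto-based selection step rather than the scalar comparison used in this sketch.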

The 16th Intl. Workshop on Search-Based and Fuzz Testing (SBFT)