Multi-objective differential evolution in the generation of adversarial examples

Abstract

Adversarial examples remain a critical concern for the robustness of deep learning models, exposing their vulnerability to subtle input manipulations. While earlier research focused on generating such examples using white-box strategies, later research shifted to gradient-based black-box strategies, as a model's internals are often not accessible to external attackers. This paper extends our prior work by exploring gradient-free search-based algorithms for adversarial example generation, with particular emphasis on differential evolution (DE). Building on the classic DE operators, we propose five variants of gradient-free algorithms: a single-objective approach (DE), two multi-objective variations (NSGA-II-DE and MOEA/D-DE), and two many-objective strategies (NSGA-III-DE and AGE-MOEA-DE). Our study on five canonical image classification models shows that whilst the DE variant remains the fastest approach, NSGA-II-DE consistently produces more minimal adversarial attacks (i.e., with fewer image perturbations). Moreover, we found that applying a post-process minimization to our adversarial images would further reduce the number of changes and the overall delta variation (image noise).
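
For readers unfamiliar with the classic DE operators mentioned in the abstract, the following is a minimal sketch of the textbook DE/rand/1/bin scheme (differential mutation, binomial crossover, greedy selection). It is not the paper's implementation: the function name `differential_evolution`, its parameters, and the default values for `f` and `cr` are illustrative assumptions. In an adversarial setting, `fitness` might be, for example, the target model's confidence in the true class for a perturbed image, but that choice is also an assumption here.

```python
import random


def differential_evolution(fitness, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=100, seed=0):
    """Minimize `fitness` over box `bounds` with textbook DE/rand/1/bin.

    Hypothetical helper for illustration only; not the paper's code.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the box constraints.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: pick three distinct individuals, none equal to i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                if j == j_rand or rng.random() < cr:
                    # Binomial crossover takes the mutant gene, clipped to bounds.
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    v = min(max(v, lo), hi)
                else:
                    v = pop[i][j]
                trial.append(v)
            # Greedy selection: the trial replaces the parent if it is no worse.
            s = fitness(trial)
            if s <= scores[i]:
                pop[i], scores[i] = trial, s
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]
```

As a usage example, calling `differential_evolution(lambda x: sum(v * v for v in x), [(-5.0, 5.0)] * 3)` searches for the minimum of a three-dimensional sphere function. The multi- and many-objective variants named above replace the scalar comparison in the selection step with the ranking scheme of the respective algorithm; that part is not shown in this sketch.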

Type
Publication
Science of Computer Programming