Software & Data Downloads — RIDE

Robust Iterative Data Estimation for white-box adversarial defense via self-supervised data estimation.

Recent studies have demonstrated that as classifiers, deep neural networks (e.g., CNNs) are quite vulnerable to adversarial attacks that only add quasi-imperceptible perturbations to the input data but completely change the predictions of the classifiers. To defend classifiers against such adversarial attacks, here we focus on the white-box adversarial defense where the attackers are granted full access to not only the classifiers but also defenders to produce as strong attack as possible. We argue that a successful white-box defender should prevent the attacker from not only direct gradient calculation but also a gradient approximation. Therefore we propose viewing the defense from the perspective of a functional, a high-order function that takes other functions as input and return a new function as the defender. Such a design makes the defender a hidden function, whose gradients are hard to be estimated without knowing the prior. To this end, we propose a novel Robust Iterative Data Estimation (RIDE) algorithm that works as a defender by estimating the true underlying data using each individual adversarial observation. Specifically, the RIDE algorithm takes a randomly initialized neural network as input and returns a parameterized defense model through self-supervised optimization. To the best of our knowledge, we are the first to propose novel self-supervised data estimation for white-box adversarial defense by viewing defenders as functionals.

This code implements our RIDE algorithm for adversarial defense. As demonstration we show some qualitative results of the defense against 10-iteration white-box attack (PGD attack with BPDA) on MNIST dataset using (a) median filtering, (b) total-variance minimization and (c) the proposed RIDE algorithm. This code is for our arxiv submission “White-Box Adversarial Defense via Self-Supervised Data Estimation”.

Related Publications
Zhang, Z., Lin, Z., Pfister, H., "White-Box Adversarial Defense via Self-Supervised Data Estimation", arXiv, September 2019.
BibTeX arXiv Software
- @article{Zhang2019sep,
- author = {Zhang, Ziming and Lin, Zudi and Pfister, Hanspeter},
- title = {{White-Box Adversarial Defense via Self-Supervised Data Estimation}},
- journal = {arXiv},
- year = 2019,
- month = sep,
- url = {https://arxiv.org/abs/1909.06271}
- }

Access software at https://github.com/merlresearch/RIDE.