
Robust Evaluation of Diffusion-Based Adversarial Purification

About

We question the current evaluation practice for diffusion-based purification methods. Diffusion-based purification aims to remove adversarial perturbations from an input at test time. The approach has gained increasing attention as an alternative to adversarial training because it decouples the defense from training. Well-known white-box attacks are often employed to measure the robustness of purification methods. However, it is unknown whether these attacks are the most effective against diffusion-based purification, since they are often tailored to adversarially trained models. We analyze the current practices and provide a new guideline for measuring the robustness of purification methods against adversarial attacks. Based on our analysis, we further propose a new purification strategy that improves robustness over current diffusion-based purification methods.

Minjong Lee, Dongwoo Kim • 2023
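To make the pipeline described in the abstract concrete, below is a minimal, hypothetical sketch of test-time diffusion-based purification: noise the (possibly adversarial) input via a forward diffusion step, then run a learned denoiser backward before classifying. The `toy_denoiser` is a stand-in for a trained score/denoising model and is not the paper's method; all names and parameters here are illustrative assumptions.

```python
import numpy as np

def purify(x, sigma=0.25, steps=10, denoiser=None, rng=None):
    """Toy diffusion-based purification sketch (illustrative only).

    Forward: corrupt the input with Gaussian noise at level sigma,
    which is intended to wash out small adversarial perturbations.
    Reverse: iteratively apply a denoiser to move the sample back
    toward the clean-data manifold before classification.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    x_t = x + sigma * rng.standard_normal(x.shape)  # forward diffusion to t*
    for _ in range(steps):
        x_t = denoiser(x_t)  # one reverse (denoising) step
    return x_t

def toy_denoiser(x):
    # Hypothetical denoiser: shrink toward the image mean,
    # standing in for a trained diffusion/score model.
    return 0.9 * x + 0.1 * x.mean()

# Stand-in for an adversarially perturbed 8x8 image.
x_adv = np.ones((8, 8)) + 0.3
x_pur = purify(x_adv, denoiser=toy_denoiser)
print(x_pur.shape)  # the purified input is then fed to the classifier
```

In an actual defense, the denoiser would be a pretrained diffusion model, and the choice of noise level `sigma` (how far to diffuse before reversing) trades off removing the attack against destroying class-relevant signal.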

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | CIFAR-10 (test) | Accuracy (Clean): 90.7 | 273 |
| Adversarial Robustness | CIFAR-10 (test) | -- | 76 |
| Adversarial Purification | CIFAR-10 | Standard Accuracy: 90.1 | 68 |
| Adversarial Robustness | CIFAR-100 (test) | -- | 46 |
| Adversarial Purification | CIFAR-100 | Average Accuracy: 45.56 | 38 |
| Adversarial Purification | CIFAR-10 (test) | Standard Accuracy: 90.1 | 24 |
| Image Classification | ImageNet-1k 1.0 (test) | Accuracy (Clean): 70.18 | 17 |
| Image Classification | CIFAR-100 (test) | Standard Accuracy: 75.22 | 9 |
| Adversarial Robustness | CIFAR-10 | -- | 9 |
| Image Classification | ImageNet (test) | Standard Accuracy: 70.74 | 7 |
Showing 10 of 13 rows
