Deep leakage from gradients

About

With the development of artificial intelligence technology, federated learning (FL) has been widely adopted across many industries for its efficiency and confidentiality. Some researchers have examined this confidentiality and designed algorithms that attack the training data, but each of these algorithms has its own limitations, so most people still believe that the gradient information shared by local machine learning models is safe and reliable. This paper designs an attack on federated learning models based on gradient features, in order to draw more attention to the security of federated learning systems. In a federated learning system, the gradient carries little information compared with the original training data set, yet this project aims to restore the original training images from gradient information alone. Because convolutional neural networks (CNNs) perform very well in image processing, the federated learning model in this project uses a CNN structure and is trained on image data sets. The algorithm generates virtual images and labels, computes a virtual gradient from them, and then matches the virtual gradient to the real gradient to restore the original image. The attack is written in Python, uses the Kaggle cats-and-dogs classification data set, and is extended step by step from fully connected layers to convolutional layers, which improves its generality. At present, the mean squared error between the data recovered by the algorithm and the original image is approximately 5, and the vast majority of images can be completely restored from the given gradient information, indicating that the gradients of a federated learning system are not absolutely safe and reliable.
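The gradient-matching idea in the abstract can be illustrated on a toy problem. The sketch below is not the paper's implementation: it assumes a single linear neuron with a squared-error loss instead of a CNN, uses only the Python standard library, and replaces whatever optimizer the author used with plain gradient descent plus step halving. All names (`gradients`, `matching_loss`, `reconstruct`, `mse`) and all numeric values are hypothetical and chosen for illustration only.

```python
import random


def residual(w, b, x, y):
    """Prediction error r = w.x + b - y of a single linear neuron."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b - y


def gradients(w, b, x, y):
    """Gradient of the loss 0.5 * r**2 with respect to (w, b)."""
    r = residual(w, b, x, y)
    return [r * xi for xi in x], r


def matching_loss(w, b, x, y, g_w, g_b):
    """Squared distance between the virtual gradient and the shared gradient."""
    v_w, v_b = gradients(w, b, x, y)
    return sum((vi - gi) ** 2 for vi, gi in zip(v_w, g_w)) + (v_b - g_b) ** 2


def matching_grad(w, b, x, y, g_w, g_b):
    """Analytic gradient of matching_loss with respect to the virtual (x, y)."""
    r = residual(w, b, x, y)
    e_w = [r * xi - gi for xi, gi in zip(x, g_w)]      # mismatch on dL/dw
    e_b = r - g_b                                      # mismatch on dL/db
    s = sum(ei * xi for ei, xi in zip(e_w, x)) + e_b   # half of d(loss)/dr
    d_x = [2.0 * (wj * s + r * ej) for wj, ej in zip(w, e_w)]
    d_y = -2.0 * s                                     # because dr/dy = -1
    return d_x, d_y


def reconstruct(w, b, g_w, g_b, steps=2000, seed=0):
    """Recover (x, y) by matching virtual gradients to the shared gradient."""
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in w]   # virtual input
    y = rng.uniform(-1.0, 1.0)                # virtual label
    loss = matching_loss(w, b, x, y, g_w, g_b)
    history = [loss]
    for _step in range(steps):
        d_x, d_y = matching_grad(w, b, x, y, g_w, g_b)
        lr = 1.0
        # Halve the step size until the matching loss actually decreases.
        for _try in range(40):
            new_x = [xi - lr * di for xi, di in zip(x, d_x)]
            new_y = y - lr * d_y
            new_loss = matching_loss(w, b, new_x, new_y, g_w, g_b)
            if new_loss < loss:
                x, y, loss = new_x, new_y, new_loss
                break
            lr *= 0.5
        history.append(loss)
    return x, y, history


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


# Hypothetical model parameters and private training example.
w, b = [0.5, -0.3, 0.8, 0.1], 0.2
x_true, y_true = [0.9, 0.1, 0.4, 0.7], 0.0

# The attacker sees only the model (w, b) and the shared gradient (g_w, g_b).
g_w, g_b = gradients(w, b, x_true, y_true)
x_rec, y_rec, history = reconstruct(w, b, g_w, g_b)

print("matching loss: first", history[0], "last", history[-1])
print("MSE between recovered and true input:", mse(x_rec, x_true))
```

The step halving only guarantees that the matching loss never increases; how closely `x_rec` approaches `x_true` depends on the starting point and the number of steps. For a real CNN the virtual gradient would normally be obtained with an automatic differentiation framework rather than by hand-derived formulas as in this toy case.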

Yaqiong Mu • 2022

Related benchmarks

Task	Dataset	Result
Sentiment Classification	SST-2	Accuracy 94	190
Image Classification	STL10	Accuracy 80.2	103
Text reconstruction from gradients	Rotten Tomatoes	ROUGE-1 19.7	68
Adjacency Matrix Reconstruction	Graph Data Instances	AUC 87.3	45
Node Feature Reconstruction	Graph Data Instances	MSE 0.9537	45
Language Modeling	AG-News	PPL 4.76	39
Language Modeling	Enron Dataset	Perplexity 3.4	39
Sequence Reconstruction	COLA	ROUGE-1 58.6	32
Sequence Reconstruction	MIMIC-III	ROUGE-1 13.4	32
Text Classification	Tweet Sentiment	F1 Score 69	31

Showing 10 of 25 rows
