CrypTen: Secure Multi-Party Computation Meets Machine Learning

About

Secure multi-party computation (MPC) allows parties to perform computations on data while keeping that data private. This capability has great potential for machine-learning applications: it facilitates training of machine-learning models on private data sets owned by different parties, evaluation of one party's private model using another party's private data, etc. Although a range of studies implement machine-learning models via secure MPC, such implementations are not yet mainstream. Adoption of secure MPC is hampered by the absence of flexible software frameworks that "speak the language" of machine-learning researchers and engineers. To foster adoption of secure MPC in machine learning, we present CrypTen: a software framework that exposes popular secure MPC primitives via abstractions that are common in modern machine-learning frameworks, such as tensor computations, automatic differentiation, and modular neural networks. This paper describes the design of CrypTen and measures its performance on state-of-the-art models for text classification, speech recognition, and image classification. Our benchmarks show that CrypTen's GPU support and high-performance communication between (an arbitrary number of) parties allow it to perform efficient private evaluation of modern machine-learning models under a semi-honest threat model. For example, two parties using CrypTen can securely predict phonemes in speech recordings using Wav2Letter faster than real-time. We hope that CrypTen will spur adoption of secure MPC in the machine-learning community.

Brian Knott, Shobha Venkataraman, Awni Hannun, Shubho Sengupta, Mark Ibrahim, Laurens van der Maaten • 2021
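
To make the idea of computing on private data concrete, the following is a minimal sketch of additive secret sharing, a common secure MPC primitive. It is plain Python and does not use CrypTen's API; the 64-bit ring size and the helper names `share`, `add_shared`, and `reconstruct` are assumptions chosen for illustration, not details taken from the abstract above.

```python
import secrets

# Assumption for illustration: shares live in the ring of integers mod 2**64.
RING_BITS = 64
MODULUS = 1 << RING_BITS


def share(value, num_parties):
    """Split an integer into additive shares that sum to it mod 2**64."""
    # All but the last share are uniformly random, so any subset of
    # fewer than num_parties shares reveals nothing about the value.
    shares = [secrets.randbelow(MODULUS) for _ in range(num_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def add_shared(a_shares, b_shares):
    """Each party adds its own two shares locally; no communication needed."""
    return [(a + b) % MODULUS for a, b in zip(a_shares, b_shares)]


def reconstruct(shares):
    """Combine all shares and map the result back to a signed integer."""
    total = sum(shares) % MODULUS
    return total - MODULUS if total >= MODULUS // 2 else total


# Two private inputs are shared among three parties, added in shared
# form, and only the sum is revealed.
x_shares = share(25, 3)
y_shares = share(-7, 3)
z_shares = add_shared(x_shares, y_shares)
print(reconstruct(z_shares))  # 18
```

Addition works share-by-share because the shares are additive; multiplication and nonlinear functions need interaction between the parties, which this sketch does not cover.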

Related benchmarks

Task	Dataset	Result
Natural Language Understanding	GLUE	--	551
Natural Language Understanding	GLUE (test)	SST-2 Accuracy -1.04	416
Computation	Softmax	Latency 0.71	8
Private text generation	GPT2-base (124M)	Usage Fraction 100	7
Private text generation	T5 138M	Memory Fraction 100	7
Private Inference	T5 138M	Embed Inference Time (s) 323.5	7
Private Inference	GPT2-base (124M)	Embed Inference Time (s) 321.4	7
Retrieval	embedllm	Communication Volume (Bytes) 2.50e+7	6
Privacy-Preserving Inference	BERT Base (inference)	GeLU Time (s) 16.46	4
Privacy-Preserving Inference	BERT Large (inference)	GeLU Time (s) 27.881	4
