IDperturb: Enhancing Variation in Synthetic Face Generation via Angular Perturbation
About
Synthetic data has emerged as a practical alternative to authentic face datasets for training face recognition (FR) systems, especially as privacy and legal concerns increasingly restrict the use of real biometric data. Recent advances in identity-conditional diffusion models have enabled the generation of photorealistic and identity-consistent face images. However, many of these models suffer from limited intra-class variation, an essential property for training robust and generalizable FR models. In this work, we propose IDPERTURB, a simple yet effective geometry-driven sampling strategy to enhance diversity in synthetic face generation. IDPERTURB perturbs identity embeddings within a constrained angular region of the unit hypersphere, producing a diverse set of embeddings without modifying the underlying generative model. Each perturbed embedding serves as a conditioning vector for a pre-trained diffusion model, enabling the synthesis of visually varied yet identity-coherent face images suitable for training generalizable FR systems. Empirical results demonstrate that training FR models on datasets generated with IDPERTURB yields improved performance across multiple FR benchmarks compared to existing synthetic data generation approaches.
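The core idea above, sampling perturbed identity embeddings within a bounded angular cone around the original embedding on the unit hypersphere, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the uniform angle distribution, and the `max_angle_deg` parameter are assumptions made for clarity.

```python
import numpy as np

def angular_perturb(embedding, max_angle_deg=15.0, rng=None):
    """Sample a unit vector within max_angle_deg of `embedding` on the hypersphere.

    Illustrative sketch: the cone half-angle and the uniform angle
    distribution are assumptions, not values from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Normalize the identity embedding onto the unit hypersphere
    e = embedding / np.linalg.norm(embedding)
    # Draw a random direction orthogonal to e (Gram-Schmidt on a Gaussian sample)
    v = rng.standard_normal(e.shape)
    v -= v.dot(e) * e
    u = v / np.linalg.norm(v)
    # Rotate e toward u by a random angle inside the allowed cone
    alpha = rng.uniform(0.0, np.deg2rad(max_angle_deg))
    return np.cos(alpha) * e + np.sin(alpha) * u
```

Each call yields a new unit-norm embedding whose angle to the original is at most `max_angle_deg`, so it can be fed directly to the identity-conditional diffusion model in place of the original embedding.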
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Face Verification | LFW | Mean Accuracy | 99.48 | 339 |
| Face Verification | AgeDB-30 | Accuracy | 94.03 | 204 |
| Face Verification | IJB-C | TAR @ FAR=0.01% | 91.19 | 173 |
| Face Verification | CFP-FP | Accuracy | 95.01 | 127 |
| Face Verification | CA-LFW | Accuracy | 93.85 | 64 |