
On Measuring Fairness in Generative Models

About

Recently, there has been increased interest in fair generative models. In this work, we conduct, for the first time, an in-depth study on fairness measurement, a critical component in gauging progress on fair generative models. We make three contributions. First, we conduct a study that reveals that the existing fairness measurement framework has considerable measurement errors, even when highly accurate sensitive attribute (SA) classifiers are used. These findings cast doubt on previously reported fairness improvements. Second, to address this issue, we propose CLassifier Error-Aware Measurement (CLEAM), a new framework which uses a statistical model to account for inaccuracies in SA classifiers. Our proposed CLEAM reduces measurement errors significantly, e.g., 4.98% $\rightarrow$ 0.62% for StyleGAN2 w.r.t. Gender. Additionally, CLEAM achieves this with minimal additional overhead. Third, we utilize CLEAM to measure fairness in important text-to-image generators and GANs, revealing considerable biases in these models that raise concerns about their applications. Code and more resources: https://sutd-visual-computing-group.github.io/CLEAM/.
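To illustrate why an error-aware correction matters, the sketch below shows the classical Rogan-Gladen-style inversion for a binary SA classifier: if the classifier's per-class accuracies are known, the naive point estimate p-hat can be corrected in closed form. This is a minimal sketch of the underlying idea only, not the authors' CLEAM implementation (which uses a fuller statistical model and also provides interval estimates); the function name and the assumption that per-class accuracies `alpha0`, `alpha1` are known (e.g., estimated on a held-out labeled set) are ours.

```python
def correct_point_estimate(p_hat: float, alpha0: float, alpha1: float) -> float:
    """Correct a naive class-proportion estimate from an imperfect binary classifier.

    An SA classifier with accuracy alpha1 on class 1 and alpha0 on class 0
    reports, in expectation, p_hat = p * alpha1 + (1 - p) * (1 - alpha0),
    where p is the true proportion of class 1 in the generated samples.
    Inverting this linear relation recovers p; the result is clipped to [0, 1].
    """
    if alpha0 + alpha1 <= 1.0:
        # Below this threshold the classifier is no better than chance
        # and the inversion is ill-posed.
        raise ValueError("requires alpha0 + alpha1 > 1")
    p = (p_hat - (1.0 - alpha0)) / (alpha0 + alpha1 - 1.0)
    return min(max(p, 0.0), 1.0)


# Example: a perfectly balanced generator (p = 0.5) measured with a 95%-accurate
# classifier still yields p_hat = 0.5, and the correction leaves it unchanged;
# an unbalanced case (p = 0.7, alpha0 = 0.8, alpha1 = 0.9) yields p_hat = 0.69,
# which the correction maps back to 0.7.
print(correct_point_estimate(0.5, 0.95, 0.95))   # 0.5
print(correct_point_estimate(0.69, 0.8, 0.9))    # 0.7 (up to float rounding)
```

Even with highly accurate classifiers, the uncorrected p-hat drifts toward 0.5 as accuracy drops, which is consistent with the measurement errors the study reports for the existing framework.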

Christopher T. H. Teo, Milad Abdollahzadeh, Ngai-Man Cheung• 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Bias Estimation | GenData StyleSwin 1.0 (test) | Point Estimate (p-hat) | 0.677 | 18 |
| Fairness measurement estimation | GenData-StyleGAN2 1.0 (BlackHair) | Epsilon P | 0.16 | 15 |
| Class Probability Estimation | GenData StyleGAN2 Gender | Point Estimate Error Rate | 0.62 | 12 |
| Gender Bias Estimation | GenData-SDM | Point Estimate | 0.548 | 8 |
| Fairness measurement estimation | GenData-StyleGAN2 Gender 1.0 | p_hat | 0.638 | 3 |
| Fairness measurement estimation | GenData-StyleSwin Gender 1.0 | p_hat Estimate | 0.648 | 3 |
| Fairness measurement estimation | GenData-StyleSwin BlackHair 1.0 | p_hat | 0.659 | 3 |
