Bias Mitigation on F^2-Bench

43.1 Accuracy (Age)

Baseline

[Chart: Accuracy (Age) by date; y-axis ticks 17.724, 24.312, 30.9, 37.488; x-axis label Oct 21, 2025]
Updated 11d ago

Evaluation Results

Date     Method  Accuracy (Age)                                      Links
2025.10          43.1            59.8   48.7   57.2   54.1   47.36
2025.10          41.8            44.7   47.3   42.6   43.8   38.29
2025.10          40.2            33     45.7   31.7   43.6   42.87
2025.10          40.1            41.5   38.3   41.4   39.5   34.62
2025.10          38.6            30.7   41.1   30.2   39.2   37.7
2025.10          33.2            30.2   39.4   27.5   34.3   36.1
2025.10          32.4            34.6   31.5   37.7   31.9   30.8
2025.10          27.5            29     25.1   29.9   29.8   25.9
2025.10          21.4            26.7   36.3   26.9   31.1   29.2
2025.10          21              20.8   22     17.4   19.9   18.6
2025.10          19.3            15.8   18.9   21.6   23.8   25.7
2025.10          18.7            12.9   19.8   19     17.9   15.8