Robust Safety and Utility Evaluation in Federated Learning on MaliciousGen & WildChat

Rule Score: 81.35

Step-Level

[Chart: score axis from 44.742 to 73.254; date axis at Jan 12, 2026]
Updated 3mo ago

Evaluation Results

Date      Rule Score   (unlabeled)   (unlabeled)   (unlabeled)
2026.01   81.35        52.5          -1.71         3.98
2026.01   80           55.77         -1.87         4.13
2026.01   79.62        51.35         -1.78         4.12
2026.01   48.65        5.19          -3.48         3.81
2026.01   48.08        5.77          -3.59         3.82
2026.01   47.31        4.81          -3.49         4.12
2026.01   46.73        5.96          -3.48         3.72
2026.01   46.15        5.01          -3.62         3.95