Out-of-scope Refusal Evaluation on MMMU (out-of-scope test)

0.18 Refusal Rate

Prompt-based

[Chart: Refusal Rate over time; y-axis ticks 0.1482, 0.36285, 0.5775, 0.79215; x-axis label Jan 31, 2026]
Updated 4d ago

Evaluation Results

Date     Refusal Rate
2026.01  0.18
2026.01  0.31
2026.01  0.81
2026.01  0.885
2026.01  0.92
2026.01  0.93
2026.01  0.965
2026.01  0.97
2026.01  0.975