Multidisciplinary knowledge and reasoning on MMMU (dev)
[Chart: Score by date. Defender Iter. 3: 25.33 (Jan 24, 2026)]
Updated 1mo ago
Evaluation Results
Method                 Training Strategy   Date      Score
Defender Iter. 3       Iter...             2026.01   25.33
Defender Iter. 2       Iter...             2026.01   23.33
Yang et al.            Fini...             2026.01   22.67
Ablation (Clean Data)  Abla...             2026.01   21.33
Base (M_def^(0))       Base                2026.01   20.67
Liu et al. (Insert)    Fini...             2026.01   20.67
Defender Iter. 1       Iter...             2026.01   20
Liu et al. (Add)       Fini...             2026.01   19.33
Liu et al. (All)       Fini...             2026.01   18