Real-world understanding on RealWorldQA (Score)
[Chart: Score by date. Top result: Defender Iter. 3, 70.07 (Jan 24, 2026). Updated 1mo ago.]
Evaluation Results
Method                   Training Strategy   Date      Score
Defender Iter. 3         Iter...             2026.01   70.07
Ablation (Clean Data)    Abla...             2026.01   69.28
Defender Iter. 1         Iter...             2026.01   69.28
Defender Iter. 2         Iter...             2026.01   69.28
Liu et al. (All)         Fini...             2026.01   68.76
Yang et al.              Fini...             2026.01   68.24
Liu et al. (Insert)      Fini...             2026.01   68.24
Liu et al. (Add)         Fini...             2026.01   67.97
Base (M_def^(0))         Base                2026.01   67.71