Knowledge-Intensive Reasoning on 2Wiki
[Chart: Average Score over time. Top entry: AutoTraj, 0.89 (Jan 30, 2026).]
Updated 4d ago
Evaluation Results
Method               Details                    Date     Average Score
AutoTraj             Category=SFT-RL TIR Me...  2026.01  0.89
R1-Searcher          Category=RL-only TIR M...  2026.01  0.87
AutoTIR              Category=RL-only TIR M...  2026.01  0.86
ReSearch             Category=RL-only TIR M...  2026.01  0.81
Tool-Star-SFT        Category=SFT-only TIR...   2026.01  0.79
Qwen2.5-7B-Instruct  Framework=Multi-Dimens...  2026.01  0.78
Tool-Star            Category=SFT-RL TIR Me...  2026.01  0.77
Vanilla SFT-RL TIR   Category=SFT-RL TIR Me...  2026.01  0.75
ToRL                 Category=RL-only TIR M...  2026.01  0.73