Reasoning on HotpotQA (Solve Rate and Executability)
[Chart: Solve Rate and Executability by date. Highlighted point: CapFlow, Solve Rate 88.12, Feb 11, 2026.]
Updated 4d ago
Evaluation Results
Method       Attributes                 Date     Solve Rate  Executability
CapFlow      Type=Learning, Setting...  2026.02  88.12       -
CapFlow      Type=Learning, Setting...  2026.02  86.37       -
ScoreFlow    Type=Learning, Setting...  2026.02  86          -
AFlow        Type=Refinement, Setti...  2026.02  85.87       -
ScoreFlow    Type=Learning, Setting...  2026.02  85.37       -
ADAS         Type=Refinement, Setti...  2026.02  82.84       -
CoT-SC       Type=Manual, Setting=M...  2026.02  82.07       -
CoT          Type=Manual, Setting=M...  2026.02  81.75       -
SPP          Type=Manual, Setting=M...  2026.02  81.62       -
GPT-4o-mini  Type=Manual, Setting=M...  2026.02  81.54       -
Self-Refine  Type=Manual, Setting=M...  2026.02  81.18       -