Information Seeking on 20Q Breeds weighted (test)
[Chart: Worst-case Weighted Payoff over time. Top result: 47.8 (DP), Feb 2, 2026.]
Evaluation Results
Method  Backbone LLM  Date     Worst-case Weighted Payoff
DP      Qwen 2.5...   2026.02  47.8
UoT     Qwen 2.5...   2026.02  47
DC      Qwen 2.5...   2026.02  45.7
DC      GPT 4.1       2026.02  41.1
DP      GPT 4.1       2026.02  36.9
GoT     Qwen 2.5...   2026.02  32.3
UoT     GPT 4.1       2026.02  32.1
GoT     GPT 4.1       2026.02  23.2