Deep Search Reasoning on XBench DeepSearch2505
[Chart: Score by date. Top score: 41 (Claude-3.7-Sonnet); date shown: Feb 3, 2026.]
Evaluation Results
Method             Attributes                 Date     Score
Claude-3.7-Sonnet  Model Type=Proprietary     2026.02  41
CSO                Base Model=CK-Pro-8B,...   2026.02  29
GPT-4.1            Model Type=Proprietary     2026.02  27
Step-DPO           Base Model=CK-Pro-8B,...   2026.02  25
IPR                Base Model=CK-Pro-8B,...   2026.02  24
CK-Pro-8B          Mode=SFT                   2026.02  23
ETO                Base Model=CK-Pro-8B,...   2026.02  22
RFT                Base Model=CK-Pro-8B,...   2026.02  20
Qwen3-8B           Model Type=Open-Source...  2026.02  7