Query Auto-Completion on Human Evaluation Set
[Chart: Item-wise Score; top result 69.9 (Full), Feb 1, 2026. Selectable metrics: Item-wise Score, Pairwise Preference Rate.]
Updated 4d ago
Evaluation Results
| Method | Item-wise Score | Pairwise Preference Rate |
|---|---|---|
| Full (Deployment Strategy=hy..., 2026.02) | 69.9 | 0.4 |
| SFT + DPO w/o Eng (Deployment Strategy=hy..., 2026.02) | 69.8 | 0.69 |
| SFT-only (Deployment Strategy=hy..., 2026.02) | 68.9 | 0.5 |
| LTR Baseline (Deployment Strategy=hy..., 2026.02) | 65.3 | - |