Human Preference Evaluation on Kuaishou search long-tail query segment (test)
[Chart: Good Count by method. ExpModel: 48 (Mar 26, 2026). Selectable metrics: Good Count, Same Count, Bad Count, Advantage (%).]
Updated 22d ago
Evaluation Results
Method         Date     Good Count  Same Count  Bad Count  Advantage (%)
ExpModel       2026.03  48          114         26         11.7
Base + S1&S2   2026.03  39          133         28         5.5
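The page does not define Advantage (%). The reported values are, however, consistent with the common good-same-bad (GSB) convention of (Good - Bad) divided by the total number of judged queries. The sketch below checks that assumption against the two rows; the function name `advantage_pct` is hypothetical and the formula is inferred from the numbers, not confirmed by the source.

```python
def advantage_pct(good: int, same: int, bad: int) -> float:
    """Net win rate in percent, assuming Advantage = (Good - Bad) / Total.

    This definition is an assumption inferred from the reported values.
    """
    total = good + same + bad
    if total == 0:
        raise ValueError("no judged queries")
    return 100 * (good - bad) / total


# Counts copied from the Evaluation Results rows.
rows = {
    "ExpModel": (48, 114, 26),
    "Base + S1&S2": (39, 133, 28),
}

for method, (good, same, bad) in rows.items():
    print(f"{method}: {advantage_pct(good, same, bad):.1f}")
```

Under this reading, ExpModel was judged on 188 queries and Base + S1&S2 on 200, so the two Advantage figures are computed over slightly different sample sizes.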