Negotiation on Hospital (test)
[Chart: Success Rate over time; top point shown: Vanilla (LLM), Success Rate 100, May 26, 2026. Selectable metrics: Success Rate, Utility, Rounds.]
Updated 7d ago
Evaluation Results
Method          Backbone             Date      Success Rate   Utility   Rounds
Vanilla (LLM)   Qwen3.5-Plus         2026.05   100            35        4.5
Random          -                    2026.05   100            30        4.2
IQL             Qwen3.5-Plus         2026.05   100            0         3.1
IQL+JPO         Qwen2.5-7B-In...     2026.05   100            35        7.4
IQL+SFT+JPO     Qwen2.5-7B-In...     2026.05   100            45        5.5
IQL+SFT         Qwen2.5-7B-In...     2026.05   95             20        6
Vanilla (SLM)   Qwen2.5-7B-In...     2026.05   90             40.3      4.3