Negotiation vs GPT-5.4 High Reasoning Seller on Standard Held-Out Test Set
[Chart: Reward by date. Top result: Qwen3-30B-A3B-Instruct-2507-trained, Reward 0.4081, Apr 10, 2026. Available metrics: Reward, Deal Rate, Bargained Ratio, Price Overshoot Rate.]
Evaluation Results
Method                                 Params  Date     Reward  Deal Rate  Bargained Ratio  Price Overshoot Rate
Qwen3-30B-A3B-Instruct-2507-trained    30B     2026.04  0.4081  75         40.81            0
Qwen3-30B-A3B-Instruct-2507-untrained  30B     2026.04  0.2744  60.5       3.89             32.4
gpt-5.4-high-reasoning                 -       2026.04  0.1823  91.4       18.23            0
gpt-5.4-no-reasoning                   -       2026.04  0.1458  84.8       16.14            1.6
DeepSeek-V3.1-thinking                 671B    2026.04  0.1223  92.6       13.02            0.8
DeepSeek-V3.1-nothink                  671B    2026.04  0.1204  90.6       14.77            2.7
Kimi-K2-Thinking                       1T      2026.04  0.106   90.6       12.56            1.2