Negotiation on AmazonHistoryPrice (gpt-5.4-high-reasoning seller, test)

Reward: 0.408

Qwen3-30B-A3B-Instruct-2507-trained

Updated 3mo ago

Evaluation Results

Method	Links	Reward	(unlabeled)	(unlabeled)
Qwen3-30B-A3B-Instruct-2507-trained 2026.04		0.408	75	40.8
gpt-5.4-high-reasoning 2026.04		0.182	91.4	18.2
gpt-5.4-no-reasoning 2026.04		0.146	84.8	16.1
DeepSeek-V3.1-thinking 2026.04		0.122	92.6	13
DeepSeek-V3.1-nothink 2026.04		0.12	90.6	14.8
Kimi-K2-Thinking 2026.04		0.106	90.6	12.6
Qwen3-30B-A3B-Instruct-2507-untrained 2026.04		-0.274	60.5	3.9