Persuasive Dialogue on ToM-BPD interactive evaluation

Win Rate: Identification: 55.23

Qwen3-8B + TTBYS vs. GPT-5

Updated 2mo ago

Evaluation Results

Method	Links
Qwen3-8B + TTBYS vs. GPT-5 2026.05		55.23	26.47	34.56	25.12	42.78	26.31	35.44	29.18	28.62	31.04
Qwen3-8B + TTBYS vs. GPT-5 + CoT 2026.05		45.19	29.44	37.33	25.78	42.51	36.29	33.22	27.65	30.11	28.47
GPT-5 vs. GPT-5 + CoT 2026.05		34.87	50.12	30.45	28.33	32.21	38.76	32.58	28.14	29.04	30.22