
Multi-agent interaction and social reasoning on Werewolf MultiAgentBench

55.75 Task Performance

ETI

Updated 3mo ago

Evaluation Results

Method	Links	Task Performance	(unlabeled in source)
ETI 2026.04		55.75	65.2
ETI 2026.04		49.97	59.52
QWEN 2026.04		43.28	60.2
ETI 2026.04		36.46	57.56
ETI 2026.04		29.54	55.56
GPT 2026.04		25.26	54.32