Social Simulation on Sotopia hard (test)
[Chart: Goal Score and Overall Score. Top entry: The Stackelberg Speaker + MetaMind, Goal Score 7.59, Oct 10, 2025]
Updated 4d ago
Evaluation Results
Method                               Base Agent   Date      Goal Score   Overall Score
The Stackelberg Speaker + MetaMind   MetaMind     2025.10   7.59         3.85
MetaMind                             -            2025.10   7.16         3.6
The Stackelberg Speaker + ReAct      ReAct        2025.10   7.12         3.55
ReCon                                -            2025.10   7.08         3.48
ReAct                                -            2025.10   7.06         3.45
SDPO                                 -            2025.10   6.35         3.14
SOTOPIA-Ω                            -            2025.10   6.31         3.03
SOTOPIA-π                            -            2025.10   5.34         2.76