Social Simulation on Sotopia standard (test)
[Chart: Goal Score by date. Highlighted point: The Stackelberg Speaker + MetaMind, Goal Score 8.92, Oct 10, 2025. Available series: Goal Score, Overall Score.]
Updated 4d ago
Evaluation Results
| Method | Details | Date | Goal Score | Overall Score |
|---|---|---|---|---|
| The Stackelberg Speaker + MetaMind | Base Agent=MetaMind | 2025.10 | 8.92 | 4.2 |
| MetaMind | | 2025.10 | 8.7 | 4.03 |
| The Stackelberg Speaker + ReAct | Base Agent=ReAct | 2025.10 | 8.58 | 3.95 |
| ReCon | | 2025.10 | 8.45 | 3.88 |
| ReAct | | 2025.10 | 8.43 | 3.85 |
| SDPO | | 2025.10 | 8.13 | 3.63 |
| SOTOPIA-Ω | | 2025.10 | 8.07 | 3.67 |
| SOTOPIA-π | | 2025.10 | 7.62 | 3.44 |