Social Dialogue on SOTOPIA Interaction with GPT-4o-mini
Top GOAL Score: 7.53 (GPT-4-turbo)
[Chart: GOAL Score and REL Score; plotted date: Jan 3, 2025]
Updated 4d ago
Evaluation Results
| Method | Alignment | Date | GOAL Score | REL Score |
|---|---|---|---|---|
| GPT-4-turbo | | 2025.01 | 7.53 | 2.54 |
| Llama-8B+BC+SDPO | BC + Segment... | 2025.01 | 7.53 | 2.71 |
| GPT-4o | | 2025.01 | 7.47 | 2.4 |
| Llama-8B+BC+DMPO | BC + DMPO | 2025.01 | 7.41 | 2.54 |
| Llama-8B+BC+ETO | BC + ETO | 2025.01 | 7.38 | 2.56 |
| Llama-8B+BC+DPO | BC + DPO | 2025.01 | 7.32 | 2.7 |
| Llama-8B | | 2025.01 | 7.19 | 2.13 |
| Llama-8B+BC | Behavioral C... | 2025.01 | 7.18 | 2.59 |
| Llama-8B+BC+Preferred-SFT | BC + Preferr... | 2025.01 | 7.18 | 2.52 |
| GPT-4o-mini | | 2025.01 | 6.98 | 2.11 |
| GPT-3.5-turbo | | 2025.01 | 6.67 | 1.84 |