
Social Intelligence Assessment on SOTOPIA hard episodes (test)

7.21 GOA Score

GPT-5

[Chart: GOA Score over time, Dec 6, 2024 to Mar 16, 2026]
Updated 1mo ago

Evaluation Results

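| Method | Date | GOA Score | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| | 2026.03 | 7.21 | - | - | - | - | - | - | - |
| | 2026.03 | 7.19 | - | - | - | - | - | - | 3.54 |
| | 2026.03 | 7.14 | - | - | - | - | - | - | 3.48 |
| | 2026.03 | 7.11 | - | - | - | - | - | - | - |
| | 2026.03 | 6.97 | - | - | - | - | - | - | 3.46 |
| | 2026.03 | 6.97 | - | - | - | - | - | - | - |
| | 2026.03 | 6.63 | - | - | - | - | - | - | - |
| | | 6.49 | - | - | - | - | - | - | 3.22 |
| | 2024.12 | 6.4 | -0.02 | 0 | 0.82 | 2.32 | 4.52 | 8.94 | 3.28 |
| | | 6.33 | - | - | - | - | - | - | 3.09 |
| | | 6.21 | - | - | - | - | - | - | 3.01 |
| | 2024.12 | 6.07 | -0.05 | 0 | 0.73 | 1.83 | 3.41 | 8.64 | 2.95 |
| | | 5.86 | - | - | - | - | - | - | 2.73 |
| | 2026.03 | 5.7 | - | - | - | - | - | - | - |
| | 2024.12 | 3.77 | -0.19 | 0 | 0.29 | 0.85 | 2.88 | 8.41 | 2.29 |
| | 2024.12 | 3.39 | -0.5 | -0.01 | -0.16 | 0.6 | 2.21 | 8.63 | 1.85 |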