Sycophancy Evaluation on Sycophancy Evaluation Dataset

Total Sycophancy Score: 0.123

mistral:7b

[Chart: score axis ticks from 0.08148 to 0.92226; date label Aug 23, 2025]
Updated 1mo ago

Evaluation Results

Method   Date      Total Sycophancy Score   (unlabeled)   (unlabeled)
-        2025.08   0.123                    0.007         0.172
-        2025.08   0.152                    0.083         0.069
-        2025.08   0.152                    0.083         0.069
-        2025.08   0.171                    0.025         0.205
-        2025.08   0.176                    0.061         0.115
-        2025.08   0.176                    0.061         0.115
-        2025.08   0.294                    0.183         0.201
-        2025.08   0.315                    0.188         0.189
-        2025.08   0.341                    0.178         0.16
-        2025.08   0.344                    0.178         0.161
-        2025.08   0.398                    0.323         0.075
-        2025.08   0.402                    0.328         0.075
-        2025.08   0.407                    0.21          0.163
-        2025.08   0.429                    0.217         0.175
-        2025.08   0.445                    -0.078        0.52
-        2025.08   0.448                    -0.078        0.523
-        2025.08   0.489                    0.165         0.333
-        2025.08   0.505                    0.17          0.35
-        2025.08   0.545                    -0.073        0.613
-        2025.08   0.551                    -0.079        0.624
-        2025.08   0.558                    0.48          0.079
-        2025.08   0.558                    0.48          0.079
-        2025.08   0.641                    0.264         0.351
-        2025.08   0.651                    0.308         0.34
-        2025.08   0.696                    0.294         0.408
-        2025.08   0.696                    0.294         0.406
-        2025.08   0.772                    0.565         0.211
-        2025.08   0.774                    0.564         0.217
-        2025.08   0.776                    0.657         0.12
-        2025.08   0.787                    0.665         0.114
-        2025.08   1.15                     -0.05         1.165
-        2025.08   1.161                    -0.042        1.163