
Vision Language Model Evaluation on Sycophancy Benchmark

Mean Score: 88.8

LFM2-VL

54.68863.54472.481.256
Apr 27, 2026
Updated 1mo ago

Evaluation Results

Method    Links
2026.04    88.8    7.14345.922.280.09
2026.04    86      11.541.445.111.091.06
2026.04    82.8    5.829.853.710.470.01
2026.04    73.7    19.121.254.95.986.41
2026.04    61.9    35.226.540.112.0118.7
2026.04    56      22.326.835.38.4522.62