Quality scoring on PANDABENCH (Hard set)
[Chart: SRCC on PANDABENCH (Hard set); leading entry PANDA at 0.36 SRCC (Apr 13, 2026). Also available for PLCC.]
Updated 5d ago
Evaluation Results
Method            Details                     Date      SRCC   PLCC
PANDA             -                           2026.04   0.36   0.38
Linear Probe      Type=Baseline               2026.04   0.22   0.23
DepictQA†         Access=Open-source, Tr...   2026.04   0.18   0.17
Gemini 2.5 Pro    Access=Closed-source        2026.04   0.10   0.14
GPT-5 Mini        Access=Closed-source        2026.04   0.09   0.13
GPT-4o            Access=Closed-source        2026.04   0.06   0.08
GPT-5 Nano        Access=Closed-source        2026.04   0.02   0.04
Attentive Probe   Type=Baseline               2026.04   0.02   0.02
Random            Type=Baseline               2026.04   0.00   0.00
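
For readers unfamiliar with the two columns: SRCC and PLCC are assumed here to be the usual Spearman rank and Pearson linear correlation coefficients between a method's predicted quality scores and the reference scores. The page does not describe the evaluation procedure, so the following is only a minimal sketch of how such numbers are commonly computed. The function names and the toy score lists are hypothetical and are not taken from PANDABENCH.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC) of two sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        # Correlation is undefined for constant input; returning 0.0 is a
        # convention chosen for this sketch only.
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values for illustration only (not benchmark data).
predicted = [0.2, 0.5, 0.4, 0.9, 0.7]
reference = [1.0, 2.5, 3.0, 4.5, 4.0]

print(f"SRCC = {spearman(predicted, reference):.2f}")
print(f"PLCC = {pearson(predicted, reference):.2f}")
```

SRCC depends only on the ordering of the scores, while PLCC also depends on how linearly the predictions track the reference values, which is why the two columns can differ for the same method.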