Judge Performance Evaluation on SynJudge
[Chart: TCC (RMSE). Highlighted point: QwenVL_trained, 0.54 (Jun 11, 2025). Other metrics available: ICC (RMSE), IQ (RMSE), ITS (RMSE), Top-1 Accuracy.]
Evaluation Results
Method            Training Status  Date     TCC (RMSE)  ICC (RMSE)  IQ (RMSE)  ITS (RMSE)  Top-1 Accuracy
QwenVL_trained    finetuned        2025.06  0.54        0.72        0.68       0.67        95.4
InternVL_trained  finetuned        2025.06  0.55        0.7         0.72       0.72        94.5
QwenVL            zero-shot        2025.06  0.81        1.09        0.96       1.2         87.5
InternVL          zero-shot        2025.06  0.96        0.9         1.06       1.03        86.6
GPT-4o            zero-shot        2025.06  1.01        1.02        0.98       1.18        86.5
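
Four of the reported metrics are root-mean-square errors, where lower is better. As a rough illustration of how such a value could be computed for a single scoring dimension, here is a minimal sketch. It assumes the judge's scores are compared against reference (for example, human-annotated) scores on the same samples; the page does not describe the exact protocol, and the `rmse` helper and the score lists are hypothetical, invented for illustration only.

```python
import math


def rmse(judge_scores, reference_scores):
    """Root-mean-square error between two equal-length lists of scores."""
    if len(judge_scores) != len(reference_scores):
        raise ValueError("score lists must have the same length")
    if not judge_scores:
        raise ValueError("score lists must be non-empty")
    # Square each per-sample difference, average, then take the square root.
    squared_errors = [(j - r) ** 2 for j, r in zip(judge_scores, reference_scores)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical scores (not taken from SynJudge).
judge = [4, 3, 5, 2]
reference = [4, 4, 5, 1]
print(round(rmse(judge, reference), 4))  # prints 0.7071
```

Under this reading, an RMSE of 0.54 would mean the judge's scores deviate from the reference scores by roughly half a point in the root-mean-square sense, on whatever scale the benchmark uses.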