LLM Judge Evaluation on LLM-to-LLM Evaluation
Reference: GPT-5.2
[Chart: Global Correlation (r). Labeled point: GPT-5-mini, 0.84 (Mar 12, 2026).]
Updated 2mo ago
Evaluation Results
Method        Relative Cost  Date     Global Correlation (r)  Within-Group Correlation (r)  Performance Gap (%)  Performance Recovery (%)
GPT-5-mini    7x cheaper     2026.03  0.84                    0.54                          35                   52
GPT-4.1-nano  35x cheaper    2026.03  0.49                    0.29                          42                   2