Correlation Analysis on Reasoning Benchmark Suite (AIME, GSM8K, MMLU, GPQA)
[Chart: Pearson r and Spearman rho over time. Highlighted result: TRACE, Pearson r 0.741. Date shown: May 28, 2026.]
Evaluation Results
Method         Date      Pearson r   Spearman rho
TRACE          2026.05   0.741       0.755
Perplexity     2026.05   0.221       0.145
MTLD           2026.05   0.207       0.244
Token Length   2026.05   0.147       0.186
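
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how Pearson r and Spearman rho are computed in general. It is not the benchmark's evaluation code. The page does not state which quantities are being correlated, so the two input lists below are made-up illustrative values, and the function names (pearson_r, average_ranks, spearman_rho) are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks in ascending order; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Illustrative data only (NOT taken from the benchmark): a monotonic but
# nonlinear relationship between a method's score and a reference score.
method_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
reference_scores = [1.0, 4.0, 9.0, 16.0, 25.0]

print(f"Pearson r    = {pearson_r(method_scores, reference_scores):.3f}")
print(f"Spearman rho = {spearman_rho(method_scores, reference_scores):.3f}")
```

Pearson r measures linear association, while Spearman rho depends only on rank order, so the two can differ for the same method, as they do in the table above.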