Difficulty Correlation with Human Performance on CMCQRD n=787
Top Pearson Correlation: 0.53
[Chart: Pearson Correlation by date. Method shown: LLM compare. Date shown: Dec 16, 2025]
Updated 4d ago
Evaluation Results
Method      | Model          | Date    | Pearson Correlation | Spearman Correlation | Kendall Correlation
LLM compare | Gemini 2.5 Pro | 2025.12 | 0.53                | 0.56                 | 0.38
LLM compare | OpenAI o3      | 2025.12 | 0.48                | 0.51                 | 0.35
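For readers unfamiliar with the three reported metrics, here is a minimal sketch of how Pearson, Spearman, and Kendall correlations between predicted and human-derived item difficulty could be computed. The data values and function names are hypothetical, and the Kendall variant (tau-b, which corrects for ties) is an assumption; the benchmark's actual evaluation code and tie handling are not stated on this page.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: linear association between two sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall tau-b (assumed variant): pairwise agreement, tie-corrected."""
    concordant = discordant = only_x_tied = only_y_tied = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: contributes nothing
            if dx == 0:
                only_x_tied += 1
            elif dy == 0:
                only_y_tied += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + only_x_tied) * (untied + only_y_tied))
    return (concordant - discordant) / denom


# Hypothetical data: model-predicted difficulty vs. human-derived difficulty.
predicted = [0.2, 0.5, 0.4, 0.9, 0.7]
human = [0.1, 0.4, 0.6, 0.8, 0.7]

print(f"Pearson:  {pearson(predicted, human):.2f}")
print(f"Spearman: {spearman(predicted, human):.2f}")
print(f"Kendall:  {kendall_tau_b(predicted, human):.2f}")
```

Pearson measures linear agreement, while Spearman and Kendall depend only on the ordering of items. That may explain why the three columns differ for the same method.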