Human process distribution matching on Sampling Task
[Chart: Mean Absolute Cohen's d by date. Shown: Centaur, 0.15, May 7, 2026.]
Updated 26d ago
Evaluation Results
Method          Details               Date     Mean Absolute Cohen's d
Centaur         Parameter Scale=70B   2026.05  0.15
GPT-5           -                     2026.05  0.27
Base Qwen       Parameter Scale=1.5B  2026.05  0.89
Claude Sonnet   -                     2026.05  1.15
Gemini 2.5 Pro  -                     2026.05  1.26