Understandability on Understandability Experiment New word strategy
[Chart: Significance Count. Data point shown: Claude, 7 (May 7, 2026). Selectable metrics: Significance Count, Adjusted R2, Spearman Correlation Significance, Majority Fit.]
Updated 26d ago
Evaluation Results
| Method | Method (details) | Date | Significance Count | Adjusted R2 | Spearman Correlation Significance | Majority Fit |
|---|---|---|---|---|---|---|
| Claude | Model Identifier=claud... | 2026.05 | 7 | 0.9 | - | - |
| Majority Vote (MUM) | Evaluation Protocol=Ma... | 2026.05 | 7 | - | - | - |
| GPT-4o-m | Model Identifier=gpt-4... | 2026.05 | 6 | 0.46 | - | - |
| GPT-4o | Model Identifier=gpt-4o | 2026.05 | 5 | 0.67 | - | - |
| Llama | Model Identifier=llama... | 2026.05 | 5 | 0.81 | - | - |
| Grok | Model Identifier=grok-... | 2026.05 | 5 | 0.35 | - | - |
| Mistral | Model Identifier=Minis... | 2026.05 | 3 | 0.87 | - | - |
| Qwen | Model Identifier=Qwen3... | 2026.05 | 3 | 0.87 | - | - |