PMI ranking estimation on ChaosNLI (500 held-out pairs)
[Chart: Spearman Rho over time. Top entry: PromptNCE, Spearman Rho 0.82, May 20, 2026.]
Updated 12d ago
Evaluation Results
| Method | Method details | Date | Spearman Rho |
|---|---|---|---|
| PromptNCE | Model=Claude Sonnet 4,... | 2026.05 | 0.82 |
| PromptNCE (emp.) | Model=Claude Sonnet 4,... | 2026.05 | 0.81 |
| Decomposed PMI | Model=Claude Sonnet 4,... | 2026.05 | 0.75 |
| InfoNCE (emp.) | Model=Claude Sonnet 4,... | 2026.05 | 0.73 |
| PromptNCE (emp.) | Model=GPT-5.2, Evaluat... | 2026.05 | 0.73 |
| InfoNCE | Model=Claude Sonnet 4,... | 2026.05 | 0.73 |
| MarginalNCE | Model=Claude Sonnet 4,... | 2026.05 | 0.73 |
| PromptNCE | Model=GPT-5.2, Evaluat... | 2026.05 | 0.73 |
| Direct PMI | Model=Claude Sonnet 4,... | 2026.05 | 0.72 |
| Decomposed PMI (emp.) | Model=Claude Sonnet 4,... | 2026.05 | 0.7 |
| InfoNCE (emp.) | Model=GPT-5.2, Evaluat... | 2026.05 | 0.7 |
| Decomposed PMI | Model=GPT-5.2, Evaluat... | 2026.05 | 0.7 |
| Decomposed PMI (emp.) | Model=GPT-5.2, Evaluat... | 2026.05 | 0.69 |
| InfoNCE | Model=GPT-5.2, Evaluat... | 2026.05 | 0.69 |
| MarginalNCE | Model=GPT-5.2, Evaluat... | 2026.05 | 0.69 |
| Direct PMI | Model=GPT-5.2, Evaluat... | 2026.05 | 0.49 |