Coverage analysis on LLM pairwise comparison experiment
[Figure: Coverage (alpha=0.05) over time. Plotted point: PPI, 97 (Mar 9, 2024). Selectable metrics: Coverage at alpha=0.05, 0.1, 0.15, and 0.2.]
Evaluation Results
Method    Date      Coverage (alpha=0.05)   Coverage (alpha=0.1)   Coverage (alpha=0.15)   Coverage (alpha=0.2)
PPI       2024.03   97                      92                     88                      82
Classic   2024.03   96                      90                     85                      80
PPI++     2024.03   95                      90                     85                      79
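
One way to read these figures is against the nominal level of each interval. The sketch below is illustrative only: it assumes the coverage values are percentages and that the nominal coverage for a given alpha is 100 * (1 - alpha), neither of which the page states explicitly. The names `coverage`, `nominal`, and `coverage_gap` are hypothetical helpers, not part of any benchmark tooling.

```python
# Empirical coverage copied from the Evaluation Results table.
# Assumption: values are percentages.
coverage = {
    "PPI":     {0.05: 97, 0.1: 92, 0.15: 88, 0.2: 82},
    "Classic": {0.05: 96, 0.1: 90, 0.15: 85, 0.2: 80},
    "PPI++":   {0.05: 95, 0.1: 90, 0.15: 85, 0.2: 79},
}


def nominal(alpha):
    """Nominal coverage in percent for a (1 - alpha) confidence interval."""
    return 100 * (1 - alpha)


def coverage_gap(table):
    """Empirical minus nominal coverage, in percentage points.

    Rounded to two decimals to hide floating-point noise.
    """
    return {
        method: {a: round(c - nominal(a), 2) for a, c in row.items()}
        for method, row in table.items()
    }


gaps = coverage_gap(coverage)
for method, row in gaps.items():
    print(method, row)
```

A positive gap means the intervals cover more often than the nominal level, and a negative gap means they cover less often. Under these assumptions, the only entry in the table that falls below nominal is PPI++ at alpha=0.2 (79 against a nominal 80).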