Explanation-rating coherence evaluation on Clothing
[Chart: GPT Score. Top result: Curr-RLCER, 0.8679 (Apr 7, 2026). Metrics shown: GPT Score, Bert Classifier Score, Human Annotator Score.]
Updated 10d ago
Evaluation Results
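| Method     | Date    | GPT Score | Bert Classifier Score | Human Annotator Score |
|------------|---------|-----------|-----------------------|-----------------------|
| Curr-RLCER | 2026.04 | 0.8679    | 0.9187                | 0.9008                |
| NRT        | 2026.04 | 0.7541    | 0.8217                | 0.7324                |
| PETER      | 2026.04 | 0.7191    | 0.7994                | 0.7053                |
| CER        | 2026.04 | 0.7025    | 0.7973                | 0.7261                |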