String-level response similarity on RA-QA Single-Verify, Discriminative tasks
[Chart: BERTScore and METEOR by method. Highlighted point: RAMoEA-QA, BERTScore 94, Mar 6, 2026.]
Updated 1mo ago
Evaluation Results
Method            Configuration        Date      BERTScore   METEOR
RAMoEA-QA         -                    2026.03   94          92.64
CareAQA-operaGT   Backbone=OPERA-GT    2026.03   91          89.95
CareAQA-operaCT   Backbone=OPERA-CT    2026.03   87          86.08
PENGI             -                    2026.03   3           4.11