String-level response similarity on RA-QA Global, Discriminative tasks
[Chart: BERTScore over time. Plotted point: RAMoEA-QA, BERTScore 0.9, Mar 6, 2026. Available metrics: BERTScore, METEOR. Updated 1mo ago.]
Evaluation Results
| Method | Configuration | Date | BERTScore | METEOR |
|---|---|---|---|---|
| RAMoEA-QA | | 2026.03 | 0.9 | 88.38 |
| CareAQA-operaGT | Backbone=OPERA-GT | 2026.03 | 0.89 | 87.05 |
| CareAQA-operaCT | Backbone=OPERA-CT | 2026.03 | 0.87 | 84.89 |
| CareAQA-operaGT | Backbone=OPERA-GT | 2026.03 | 0.87 | 83.22 |
| CareAQA-operaCT | Backbone=OPERA-CT | 2026.03 | 0.86 | 83.15 |
| RAMoEA-QA | | 2026.03 | 0.83 | 81.02 |
| PENGI | | 2026.03 | -0.01 | 2.61 |
| PENGI | | 2026.03 | -0.09 | 0 |