Selective Generation on MedQUAD (PRR Metrics)
[Chart: PRR (ROUGE-L). Highlighted point: SATRMD+MSP, 46.6, Feb 20, 2025. Available metrics: PRR (ROUGE-L), PRR (AlignScore).]
Evaluation Results

| Method | Model | Date | PRR (ROUGE-L) | PRR (AlignScore) |
| --- | --- | --- | --- | --- |
| SATRMD+MSP | Llama 8b v3.1 | 2025.02 | 46.6 | 57.5 |
| Perplexity | Llama 8b v3.1 | 2025.02 | 42.5 | 43.8 |
| SAPLMA | Llama 8b v3.1 | 2025.02 | 40.7 | 49 |
| HUQ-SATRMD | Llama 8b v3.1 | 2025.02 | 38.6 | 50.6 |
| Factoscope | Llama 8b v3.1 | 2025.02 | 35.8 | 42.8 |
| Maximum Sequence Probability | Llama 8b v3.1 | 2025.02 | 29.7 | 35.6 |
| SAR | Llama 8b v3.1 | 2025.02 | 28.6 | 19.2 |
| Lexical Similarity ROUGE-L | Llama 8b v3.1 | 2025.02 | 25.2 | 13.2 |
| Semantic Entropy | Llama 8b v3.1 | 2025.02 | 7.5 | 0.7 |
| Eccentricity NLI Score Entail. | Llama 8b v3.1 | 2025.02 | 7 | 6 |
| DegMat NLI Score Entail. | Llama 8b v3.1 | 2025.02 | 6.6 | 16.2 |
| EigValLaplacian NLI Score Entail. | Llama 8b v3.1 | 2025.02 | 5.6 | 16 |
| EigenScore | Llama 8b v3.1 | 2025.02 | 5 | 4.3 |
| SentenceSAR | Llama 8b v3.1 | 2025.02 | 1.5 | 3.3 |
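
For readers unfamiliar with the metric, here is a minimal sketch of how a PRR (Prediction Rejection Ratio) score is commonly computed. It is not taken from this page, and the benchmark's exact evaluation code may differ (for example, in the maximum rejection rate it uses). The function names `rejection_curve_area` and `prr` are hypothetical. The sketch assumes the usual definition: answers are rejected in order of decreasing uncertainty, the mean quality (for example ROUGE-L or AlignScore) of the retained answers is averaged over all rejection steps, and the result is normalized between a random and an oracle rejection order.

```python
from typing import Sequence


def rejection_curve_area(quality: Sequence[float], order: Sequence[int]) -> float:
    """Average, over all rejection steps, of the mean quality of retained answers.

    `order` lists item indices from first-rejected to last-rejected.
    """
    n = len(quality)
    kept_sum = float(sum(quality))
    area = 0.0
    for step, idx in enumerate(order):
        area += kept_sum / (n - step)  # mean quality of the n - step retained items
        kept_sum -= quality[idx]       # reject the next item in the order
    return area / n


def prr(uncertainty: Sequence[float], quality: Sequence[float]) -> float:
    """Hypothetical helper: PRR as a fraction (1.0 = oracle, 0.0 = random)."""
    n = len(quality)
    if n == 0 or len(uncertainty) != n:
        raise ValueError("uncertainty and quality must be non-empty and equal length")
    indices = list(range(n))
    by_uncertainty = sorted(indices, key=lambda i: -uncertainty[i])  # most uncertain first
    by_oracle = sorted(indices, key=lambda i: quality[i])            # worst answers first
    auc_unc = rejection_curve_area(quality, by_uncertainty)
    auc_oracle = rejection_curve_area(quality, by_oracle)
    # Assumption: the random baseline is taken in expectation, where the mean
    # quality of the retained answers equals the overall mean at every step.
    auc_rnd = sum(quality) / n
    if auc_oracle == auc_rnd:
        raise ValueError("PRR is undefined when all quality scores are equal")
    return (auc_unc - auc_rnd) / (auc_oracle - auc_rnd)


if __name__ == "__main__":
    # Toy data: uncertainty is highest exactly where quality is lowest.
    toy_uncertainty = [0.9, 0.1, 0.5, 0.3]
    toy_quality = [0.1, 0.9, 0.4, 0.7]
    print(prr(toy_uncertainty, toy_quality))
```

Under this definition, an uncertainty score that ranks answers exactly like the oracle yields 1, a score no better than random yields about 0, and a score that rejects good answers first goes negative. The sketch returns a fraction; reading the table values as this fraction multiplied by 100 is an assumption.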