Selective Generation on CoQA (PRR (AlignScore))
[Chart: PRR (AlignScore); top result 47.2, SentenceSAR, Feb 20, 2025]
Updated 1mo ago
Evaluation Results
Method                              Model           Date      PRR (AlignScore)
SentenceSAR                         Llama 8b v3.1   2025.02   47.2
SAR                                 Llama 8b v3.1   2025.02   46.5
Maximum Sequence Probability        Llama 8b v3.1   2025.02   45
Perplexity                          Llama 8b v3.1   2025.02   45
HUQ-SATRMD                          Llama 8b v3.1   2025.02   45
EigValLaplacian NLI Score Entail.   Llama 8b v3.1   2025.02   44.4
DegMat NLI Score Entail.            Llama 8b v3.1   2025.02   42
SATRMD+MSP                          Llama 8b v3.1   2025.02   41.9
Semantic Entropy                    Llama 8b v3.1   2025.02   41.6
Eccentricity NLI Score Entail.      Llama 8b v3.1   2025.02   40.9
Lexical Similarity ROUGE-L          Llama 8b v3.1   2025.02   40.3
EigenScore                          Llama 8b v3.1   2025.02   40.2
Factoscope                          Llama 8b v3.1   2025.02   24.2
SAPLMA                              Llama 8b v3.1   2025.02   8.2