Selective Generation on TriviaQA (PRR AlignScore)
[Chart: PRR (AlignScore) over time. Best result: 0.714, DegMat NLI Score Entail., Feb 20, 2025]
Updated 1mo ago
Evaluation Results
Method                              Model           Date      PRR (AlignScore)
DegMat NLI Score Entail.            Llama 8b v3.1   2025.02   0.714
SAR                                 Llama 8b v3.1   2025.02   0.71
SentenceSAR                         Llama 8b v3.1   2025.02   0.703
SATRMD+MSP                          Llama 8b v3.1   2025.02   0.702
Perplexity                          Llama 8b v3.1   2025.02   0.689
Maximum Sequence Probability        Llama 8b v3.1   2025.02   0.687
EigValLaplacian NLI Score Entail.   Llama 8b v3.1   2025.02   0.669
Semantic Entropy                    Llama 8b v3.1   2025.02   0.669
Eccentricity NLI Score Entail.      Llama 8b v3.1   2025.02   0.654
HUQ-SATRMD                          Llama 8b v3.1   2025.02   0.646
Lexical Similarity ROUGE-L          Llama 8b v3.1   2025.02   0.621
EigenScore                          Llama 8b v3.1   2025.02   0.619
SAPLMA                              Llama 8b v3.1   2025.02   0.522
Factoscope                          Llama 8b v3.1   2025.02   -0.046