Selective Generation on MMLU (PRR Accuracy)
Figure: PRR Accuracy over time. Best result: 81.6 (SATRMD+MSP), Feb 20, 2025. Updated 1mo ago.
Evaluation Results
| Method | Model | Date | PRR Accuracy |
|---|---|---|---|
| SATRMD+MSP | Llama 8b v3.1 | 2025.02 | 81.6 |
| Factoscope | Llama 8b v3.1 | 2025.02 | 72.7 |
| HUQ-SATRMD | Llama 8b v3.1 | 2025.02 | 60.9 |
| SAPLMA | Llama 8b v3.1 | 2025.02 | 48.1 |
| Maximum Sequence Probability | Llama 8b v3.1 | 2025.02 | 40.5 |
| EigValLaplacian NLI Score Entail. | Llama 8b v3.1 | 2025.02 | 34.4 |
| SentenceSAR | Llama 8b v3.1 | 2025.02 | 34.3 |
| Eccentricity NLI Score Entail. | Llama 8b v3.1 | 2025.02 | 31.2 |
| Lexical Similarity ROUGE-L | Llama 8b v3.1 | 2025.02 | 31.1 |
| Perplexity | Llama 8b v3.1 | 2025.02 | 30.8 |
| SAR | Llama 8b v3.1 | 2025.02 | 28.4 |
| DegMat NLI Score Entail. | Llama 8b v3.1 | 2025.02 | 22.4 |
| Semantic Entropy | Llama 8b v3.1 | 2025.02 | 22 |
| EigenScore | Llama 8b v3.1 | 2025.02 | 19.6 |