Selective Generation on SciQ (PRR (AlignScore))
Best result: HUQ-SATRMD, 65.3 PRR (AlignScore), Feb 20, 2025
Evaluation Results
Method                              Model           Date      PRR (AlignScore)
HUQ-SATRMD                          Llama 8b v3.1   2025.02   65.3
Maximum Sequence Probability        Llama 8b v3.1   2025.02   58.2
SentenceSAR                         Llama 8b v3.1   2025.02   54.3
SATRMD+MSP                          Llama 8b v3.1   2025.02   54.2
Semantic Entropy                    Llama 8b v3.1   2025.02   46.6
DegMat NLI Score Entail.            Llama 8b v3.1   2025.02   44.6
Eccentricity NLI Score Entail.      Llama 8b v3.1   2025.02   44.4
SAR                                 Llama 8b v3.1   2025.02   44
EigValLaplacian NLI Score Entail.   Llama 8b v3.1   2025.02   39.8
SAPLMA                              Llama 8b v3.1   2025.02   38.8
EigenScore                          Llama 8b v3.1   2025.02   37.3
Lexical Similarity ROUGE-L          Llama 8b v3.1   2025.02   36
Factoscope                          Llama 8b v3.1   2025.02   31.6
Perplexity                          Llama 8b v3.1   2025.02   19.7