Question answering on SimpleQA (200 random samples)
[Chart: Alignment Score for Llama / Minitron; 0.9812 on May 16, 2025. Metrics: Alignment Score, Delta L, Mean L.]
Updated 2mo ago
Evaluation Results
Method            Details                    Date     Alignment Score  Delta L  Mean L
Llama / Minitron  Model pair=Llama / Min...  2025.05  0.9812           0.7157   2.8848