Short-form generation LOO
[Chart: PRR (y-axis) by date. Labeled point: MSP, PRR 64, Apr 13, 2026.]
Updated 4d ago
Evaluation Results
| Method | Model | Date | PRR |
|---|---|---|---|
| MSP | Gemma-2-9B | 2026.04 | 64 |
| HUQ-SATRMD | Gemma-2-9B | 2026.04 | 62 |
| HBO | Llama 3.1-8B | 2026.04 | 61 |
| HBO | Gemma-2-9B | 2026.04 | 61 |
| HUQ-SATMD | Llama 3.1-8B | 2026.04 | 59 |
| MSP | Llama 3.1-8B | 2026.04 | 57 |
| HUQ-SATRMD | Llama 3.1-8B | 2026.04 | 55 |
| HUQ-SATMD | Gemma-2-9B | 2026.04 | 52 |
| SAPLMA (mid) | Llama 3.1-8B | 2026.04 | 46 |
| SATRMD-MSP | Gemma-2-9B | 2026.04 | 40 |
| SAPLMA (mid) | Gemma-2-9B | 2026.04 | 35 |
| SATRMD-MSP | Llama 3.1-8B | 2026.04 | 30 |
| SATMD-MSP | Llama 3.1-8B | 2026.04 | 28 |
| SATMD-MSP | Gemma-2-9B | 2026.04 | 13 |