Short-form generation on Short-form generation 1D-SameTask
[Chart: PRR over time (y-axis: PRR). Top result: 0.64 PRR (MSP), Apr 13, 2026.]
Updated 5d ago
Evaluation Results

Method         Model          Date       PRR
MSP            Gemma-2-9B     2026.04    0.64
HBO            Gemma-2-9B     2026.04    0.64
MSP            Llama 3.1-8B   2026.04    0.57
HBO            Llama 3.1-8B   2026.04    0.57
SATRMD-MSP     Gemma-2-9B     2026.04    0.41
SAPLMA (mid)   Llama 3.1-8B   2026.04    0.30
HUQ-SATRMD     Gemma-2-9B     2026.04    0.23
SAPLMA (mid)   Gemma-2-9B     2026.04    0.11
SATRMD-MSP     Llama 3.1-8B   2026.04    0.05
SATMD-MSP      Llama 3.1-8B   2026.04    0.04
SATMD-MSP      Gemma-2-9B     2026.04   -0.07
HUQ-SATRMD     Llama 3.1-8B   2026.04   -0.08
HUQ-SATMD      Llama 3.1-8B   2026.04   -0.12
HUQ-SATMD      Gemma-2-9B     2026.04   -0.35