Reward Scoring on NB-curated (evaluation set)
Mean Reward Score: 33.64 (TOMPA, Apr 3, 2026)
Evaluation Results
| Method | Reward Model | Date | Mean Reward Score | Min Reward Score | Max Reward Score | Beat Gold Rate |
|---|---|---|---|---|---|---|
| TOMPA | Llama-3.1... | 2026.04 | 33.64 | 32.16 | 35.01 | 98 |
| Gold Answer (GPT-5) | Llama-3.1... | 2026.04 | 17.48 | - | - | - |
| TOMPA | Qwen3-8B... | 2026.04 | 16.86 | 15.86 | 17.78 | 98 |
| Gold Answer (GPT-5) | Qwen3-8B... | 2026.04 | 8.12 | - | - | - |
| Random OOD | Qwen3-8B... | 2026.04 | 7.94 | 9.05 | 6.87 | 1 |
| Random OOD | Llama-3.1... | 2026.04 | 3.42 | 4.16 | 2.33 | 0 |