Reward Prediction on SIQA (out-of-domain)
[Chart: Accuracy over time. Top entry: MemReward, 76.89 accuracy (Mar 13, 2026).]
Evaluation Results
Method      Configuration               Date      Accuracy
MemReward   Backbone=Qwen2.5-3B-In...   2026.03   76.89
R1-Oracle   Backbone=Qwen2.5-3B-In...   2026.03   76.89
R1-Oracle   Backbone=Qwen2.5-1.5B-...   2026.03   74.89
R1-p        Backbone=Qwen2.5-3B-In...   2026.03   74.67
MemReward   Backbone=Qwen2.5-1.5B-...   2026.03   74.44
R1-p        Backbone=Qwen2.5-1.5B-...   2026.03   72.67