Next-action prediction on ScienceWorld
[Chart: Accuracy over time. Top result: RL w/ ACT, 50.34 accuracy, Mar 9, 2026.]
Evaluation Results
Method                              Configuration              Date     Accuracy
RL w/ ACT                           Backbone=Qwen3-8B          2026.03  50.34
IL w/ ACT                           Backbone=Qwen3-8B          2026.03  48.69
Early Experience (Self-Reflection)  Backbone=Qwen3-8B          2026.03  45.6
RL                                  Backbone=Qwen3-8B          2026.03  43.04
Imitation Learning                  Backbone=Qwen3-8B          2026.03  42.8
Prompt w/o CoT thinking             Backbone=Qwen3-8B, CoT...  2026.03  28.01
ACT                                 Backbone=Qwen3-8B          2026.03  26.71
Prompt w/ CoT thinking              Backbone=Qwen3-8B, CoT...  2026.03  25.21