Text-based reasoning on ScienceWorld
[Chart: Running Max Return by date. Highlighted point: ICRL Preset, 88, May 21, 2025.]
Evaluation Results
Method            Configuration   Date      Running Max Return
ICRL Preset       Preset          2025.05   88
ICRL Autonomous   Autonomous      2025.05   87
Self-Refine       -               2025.05   83
Best-of-N         -               2025.05   75
Reflexion         -               2025.05   74
ReAct             -               2025.05   69