Scientific Reasoning on ScienceWorld Unseen

Average Reward: 62

Co-Evolving Agents

Nov 27, 2025
Updated 1mo ago

Evaluation Results

Date       Average Reward
2025.11    62
2025.11    58.5
2025.11    55.5
2025.11    55.2
2025.11    54.3
2025.11    51.7
2025.11    41.9
2025.11    40.8
2025.11    38.1
2025.11    10.5