SciWorld

Benchmarks

Task Name              Dataset Name   SOTA Result              Trend
Scientific Reasoning   SciWorld       Accuracy 95.9            164
Science Simulation     SciWorld       Accuracy 94.6            41
Embodied Agentic       SciWorld       Accuracy 88.9            21
Next-state prediction  SciWorld (SW)  EM Accuracy 98.64        16
Task success           SciWorld       Real 68.21               14
Scientific Reasoning   SciWorld       Success Rate (SR) 59.48  14
Science simulation     Sciworld       Progress Rate 82.6       12