Scientific Research on FrontierScience Research track
[Figure: Score (%) by date. Highest result shown: EvoMaster, 53.3% (Apr 19, 2026).]
Updated 1mo ago
Evaluation Results
| Method | Configuration | Date | Score (%) | Relative Improvement |
|---|---|---|---|---|
| EvoMaster | Backend Model=GPT-5.4,... | 2026.04 | 53.3 | 191 |
| OpenClaw | Backend Model=GPT-5.4,... | 2026.04 | 18.3 | - |
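The page does not say how Relative Improvement is computed. One reading that fits the listed numbers is the percentage gain of a method's score over the OpenClaw score, treated here as the baseline. The sketch below checks that reading; the function name `relative_improvement` and the choice of baseline are assumptions for illustration, not something the leaderboard documents.

```python
def relative_improvement(score: float, baseline: float) -> float:
    """Percentage gain of `score` over `baseline` (assumed definition)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (score - baseline) / baseline * 100.0


# Scores taken from the results table above.
evomaster_score = 53.3
openclaw_score = 18.3  # assumed to be the baseline

gain = relative_improvement(evomaster_score, openclaw_score)
print(f"{gain:.1f}%")  # about 191%, matching the table entry after rounding
```

Under this assumption the baseline row has no gain to report, which would be consistent with the "-" shown for OpenClaw.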