SWE

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Model Learning from Noisy Data | SWE (Shallow Water Equations) system | Full-field Avg Relative Error: 4.27 | 18 |
| Software Engineering | SWE Verified | Resolution Rate: 77.2 | 17 |
| Code | SWE Verified Agentless | pass@1: 57.6 | 8 |
| Software Engineering Automation | SWE Multilingual | Resolved: 70.2 | 5 |
| Watermark Detection | SWE (test) | Delta Q (Δ̂q): 0.71 | 4 |
| Agent Trajectory Performance | SWE (test) | Pass@1 Accuracy (%): 12.7 | 4 |
| Historical normalization | swe historical normalization (test) | Accuracy: 0.579 | 4 |
| Solution Prediction | SWE | Relative L2 Error (Data): 2.15 | 3 |
| Learning PDE Dynamics | SWE | Relative L2 Error: 0.005 | 2 |
| State Rollout | SWE | - | 0 |