
APT-Bench

Benchmarks

Task Name      Dataset Name  SOTA Result    Trend
Tool           APT-Bench     Accuracy 65.8  6
Math           APT-Bench     Accuracy 70.5  6
Deep Research  APT-Bench     Accuracy 40.5  6
Code           APT-Bench     Accuracy 41.9  6