
Tool Use Reasoning on Tool use

68 Avg Accuracy @16 (1h)

SDPO (on-policy)

Jan 28, 2026
Updated 4d ago

Evaluation Results

Date      Avg Accuracy @16 (1h)
2026.01   68                      68.5
2026.01   64.9                    67.7
2026.01   60.8                    62.1
2026.01   60.2                    65.7
2026.01   57.5                    -
2026.01   56.8                    60.6
2026.01   56.4                    65
2026.01   39.3                    -