
Agentic-Wide

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Agentic Search | Agentic-Wide TaskCraft (test) | Accuracy 90.1 | 4
Agentic Search | Agentic-Wide WebWalkerQA (test) | Accuracy 62.5 | 4
Agentic Search | Agentic-Wide XBench (test) | Accuracy 74.19 | 4