
Tool Learning under Instruction with Multiple Requests on NoisyToolBench IMR 1.0 (test)

90 A1 Score

DFSDT + AwN

-3.620.74569.3 Aug 31, 2024
Updated 1mo ago

Evaluation Results

Method  Links
2024.08  90  52  36
2024.08  88  56  48
2024.08  86  46  20
2024.08  82  52  48
2024.08  80  50  44
2024.08  80  46  30
2024.08  76  44  38
2024.08  76  28   4
2024.08  76  54  48
2024.08  76  58  54
2024.08  74  24   8
2024.08  72  52  36
2024.08  70  28  26
2024.08  70  54  46
2024.08  62  42  26
2024.08  60  18  16
2024.08  60  20  18
2024.08  60   8   4
2024.08  60  54  22
2024.08  32  30  24
2024.08  32  30  18
2024.08  26  18  16
2024.08  24  28  24
2024.08  22  10   2
2024.08  20  24  12
2024.08  18  28  16
2024.08  12  28  24
2024.08  12  18  18
2024.08   4  26  14
2024.08   2  28  12
2024.08   0  16  14
2024.08   0  16  10