Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Tool-augmented agent execution on MCP-Bench

46.8Task Fulfillment

ReAct

30.78434.94239.143.258May 9, 2026
Updated 22d ago

Evaluation Results

MethodLinks
2026.05
46.836.641.531.240.519.877.7041.736.430.138.936.8
2026.05
38.538.742.547.836.920.610075.438.645.228.887.750
2026.05
34.547.555.358.433.327.510079.94156.930.489.954.6
2026.05
33.742.645.240.429.626.810085.838.242.828.292.950.5
2026.05
32.85159.66133.630.910082.841.960.332.291.456.5
2026.05
31.626.633.648.420.520.210010029.14120.310047.6
2026.05
31.447.253.165.927.129.910078.439.359.528.589.254.1