
Agent Tool-Use Planning on ThinkGeo 14 Tools (test)

Tool Selection Accuracy (TSA): 77.7

Qwen3-235B

May 6, 2026
Updated 27d ago

Evaluation Results

Method  Links
2026.05  77.7  74.3  55    30.7  88.6  76.7  62.5
2026.05  74.3  74.1  51.9  24.6  76.9  65.9  50.6
2026.05  72    72    49.7  21.9  72.7  62.5  39.5
2026.05  52.9  52.8  25.9  40    84.1  73.1  50.9
2026.05  51.1  51.1  34.4  26.3  71.8  59.28
2026.05  50.2  50.1  34.8  19.9  76.9  60.6  14.8
2026.05  48.4  48.4  29.3  53.5  70.9  57.1  39.1
2026.05  48.3  45.5  39.2  29    84.5  69.1  34.8