Agent Evaluation Dataset (20 agents x 2 requirement types)

Time (min): 0.68

LLM-Singleturn

0.30722.82365.347.8564
May 12, 2026
Updated 21d ago

Evaluation Results

Method    Links

Date      Time (min)
2026.05   0.68         50.26
2026.05   1.36         49.99
2026.05   3.98         2,872.07
2026.05   4.03         869.36
2026.05   4.18         2,094.98
2026.05   4.39         1,698.44
2026.05   4.54         2,196.16
2026.05   5.3          3,579.46
2026.05   6.72         4,049.63
2026.05   10           3,023.59