Agentic Reasoning and Interaction on AgentGym

ALF Success Rate: 88

AgentEvol-7B

Feb 9, 2026
Updated 3mo ago

Evaluation Results

Method Links
2026.02
88146411.83818.982.74.31213.8125.7255.9603.260.813
2026.02
86.511738.985.512.594.43.8689.7405.4355.1754819.5
2026.02
84.5-72-42-81-48-20-60-90-65.4-
2026.02
7317.8019.42.828.50.57.6813.906106.6510.723.118.7
2026.02
7117.7419.41.628.50.57.51213.94208.3511.722.718.6
2026.02
67.518.3779.914.418.172.99.1689884806954.555.914.2
2026.02
6718.5418.810.728.20.76.3813.945.206.6011.62418.6
2026.02
5120.42315.116.820.745.711.7414.5245.2706.1705.934.516.9
2026.02
1327.93814.62.828.779.36.6015365.2656.9805.126.320.8
2026.02
3.519.6016.50.821.30.110.9013.4060100121.317.3
2026.02
222.6014.50.827.50.29.50150609.90120.919.5