Multi-task Agent Execution on HiMA-Ecom (test)

98 Math Score

DeepSeek-R1

[Chart: Math Score, Jun 24, 2025]
Updated 17d ago

Evaluation Results

Method  Links
2025.06  98    34.7  19.3  72.4  5.3  45.9
2025.06  95.7  32.3  6.3   77.2  3    42.9
2025.06  84.3  35    50.3  76.2  5    50.2
2025.06  80    30.3  0.3   43    2    31.2
2025.06  71    34.2  3     68.9  1.7  35.8
2025.06  70    20.7  48.3  73.9  7    44
2025.06  65    19.3  39.3  63.5  5    38.4