Cross-Lingual Planning on ACEBench

Score (En): 78.3

MagicAgent-32B

Updated 5mo ago

Evaluation Results

Method	Links	Score (En)	(unlabeled)
MagicAgent-32B 2026.02		78.3	84.1
GLM-4.7 2026.02		78.2	87.0
Qwen3-MAX 2026.02		77.6	86.6
Kimi-K2-Instruct 2026.02		77.4	83.3
MagicAgent-30B-A3B 2026.02		74.9	83.5
GPT-5.2 2026.02		70.7	80.1
Qwen3-235B-A22B-Instruct 2026.02		70.1	77.1
DeepSeek-V3.1-nothink 2026.02		67.8	73.2
Qwen3-30B-A3B-Instruct 2026.02		61.3	71.1
Qwen3-32B-nothink 2026.02		57.2	70.1
Llama3.3-70B-Instruct 2026.02		56.7	61.7
GLM-4.7-Flash 2026.02		53.5	62.0
ERNIE-4.5-21B-A3B-PT 2026.02		46.9	51.1
Olmo-3.1-32B-Instruct 2026.02		25.1	18.4