General Task (Agentic Coding) on tau2-Bench Telecom

Score: 98.2

GLM-5

Updated 3mo ago

Evaluation Results

Method	Date	Score
GLM-5	2026.03	98.2
Gemini 3.1 Pro	2026.03	95.6
KAT-Coder-V2	2026.03	93.9
Claude Opus 4.6	2026.03	92.1
GPT-5.4	2026.03	91.5
MiniMax M2.7	2026.03	84.8