Multilingual Evaluation on In-House Multilingual Datasets SEA Average 1.0

en Score: 90.15

GPT-5-Thinking

Dec 8, 2025
Updated 4d ago

Evaluation Results

Method  Links
2025.12
90.15  89.58  88.08  87.51  87.67  85.96  83.97  86.64
89.39  88.84  86.62  88.28  89.13  86.55  85.76  87.27
88.73  88.44  86.76  87.31  87.46  84.83  83.26  85.92
2025.12
88.51  88.11  85.31  86.61  87.82  82.12  84.09  85.19
2025.12
88.07  85.64  87.89  86.56  86.84  86.04  84.75  86.41
87.96  83.08  88.71  84.63  84.54  85.66  84.38  85.58
2025.12
87.31  88.32  80.02  86.17  85.84  85.07  80.99  83.62
85.96  86.38  85.08  83.39  84.02  83.73  84.34  84.11
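
The score rows above carry no column labels, but the numbers themselves suggest a structure: in every row, the eighth value matches the mean of the third through seventh values to within 0.01. The sketch below checks that observation. It is an illustration only. The helper names (`split_scores`, `mean_of_columns_3_to_7`) are hypothetical, and reading the last column as the "SEA Average" from the page title is an assumption that the page does not confirm.

```python
import re

# Score rows as listed in the table above. Whitespace is optional:
# the parser below also handles rows whose numbers run together.
ROWS = [
    "90.15 89.58 88.08 87.51 87.67 85.96 83.97 86.64",
    "89.39 88.84 86.62 88.28 89.13 86.55 85.76 87.27",
    "88.73 88.44 86.76 87.31 87.46 84.83 83.26 85.92",
    "88.51 88.11 85.31 86.61 87.82 82.12 84.09 85.19",
    "88.07 85.64 87.89 86.56 86.84 86.04 84.75 86.41",
    "87.96 83.08 88.71 84.63 84.54 85.66 84.38 85.58",
    "87.31 88.32 80.02 86.17 85.84 85.07 80.99 83.62",
    "85.96 86.38 85.08 83.39 84.02 83.73 84.34 84.11",
]


def split_scores(row: str) -> list[float]:
    """Extract scores from a row, assuming each has exactly two decimals."""
    return [float(token) for token in re.findall(r"\d+\.\d{2}", row)]


def mean_of_columns_3_to_7(scores: list[float]) -> float:
    """Mean of the third through seventh values (0-based indices 2..6)."""
    middle = scores[2:7]
    return sum(middle) / len(middle)


def max_deviation(rows: list[str]) -> float:
    """Largest gap between the last value and the mean of columns 3-7."""
    gaps = []
    for row in rows:
        scores = split_scores(row)
        gaps.append(abs(mean_of_columns_3_to_7(scores) - scores[-1]))
    return max(gaps)


if __name__ == "__main__":
    for row in ROWS:
        scores = split_scores(row)
        print(scores[-1], round(mean_of_columns_3_to_7(scores), 3))
    print("max deviation:", round(max_deviation(ROWS), 3))
```

The 0.01 tolerance allows for the displayed scores having been rounded from more precise values before the average was taken. The first two columns do not enter this mean, which is consistent with, but does not prove, the first column being the en Score reported at the top of the page.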