
Language Model Evaluation on Open LLM Leaderboard v2 (test)

60.84 BBH

Qwen3-8B

Feb 16, 2026
Updated 1mo ago

Evaluation Results

Date     BBH    Col 2  Col 3  Col 4  Col 5  Col 6  Col 7
2026.02  60.84  36.33  39.21  52.49  47.62  43.12  -
2026.02  50.22  31.96  33.45  12.76  38.55  41.8   -
2026.02  49.33  30.54  30.46  6.72   35.92  45.11  -
2026.02  44.87  28.94  31.89  2.49   32.85  40.34  -
2026.02  41.45  28.86  26.74  1.51   29.46  40.61  -
2026.02  41.03  28.27  25.66  1.06   26.3   39.81  -
2026.02  37.51  28.1   26.14  0.98   26.44  38.89  -
2026.02  36.85  28.27  27.58  12.69  24.26  31.35  -
2026.02  34.3   25.59  22.66  1.06   18.73  41.4   -
2026.02  34.13  24.75  23.14  1.06   21.23  41.27  -
2026.02  30.83  23.49  25.06  0.23   10.85  32.8   -
2026.02  30.24  22.99  22.42  0.83   11.54  37.57  -
2026.02  30.19  25.34  25.9   0.91   14.81  33.2   -
2026.02  29.99  26.59  27.7   0.68   11.43  35.71  -
2026.02  29.73  25.5   24.34  0      11.22  33.73  -
2026.02  29.21  25.84  24.58  0.98   10.85  36.51  -
2026.02  29.2   27.94  27.22  1.06   12.81  35.19  -
2026.02  29.06  23.74  26.62  0.38   10.66  33.73  -
2026.02  29.01  24.33  22.3   0      10.98  36.24  -
2026.02  28.95  24.75  25.3   0      11.29  35.58  -
2025.11  -      -      -      -      -      -      33.99
2025.11  -      -      -      -      -      -      36.35
2025.11  -      -      -      -      -      -      35.66
2025.11  -      -      -      -      -      -      35.54
2025.11  -      -      -      -      -      -      34.3
2025.11  -      -      -      -      -      -      34.41
2025.11  -      -      -      -      -      -      36.9
2025.11  -      -      -      -      -      -      37.45
2025.11  -      -      -      -      -      -      37.63