
Multi-choice reasoning on LongReason

84.13 Accuracy (32k)

QwenLong-L1-32B

Mar 2, 2026
Updated 1mo ago

Evaluation Results

Date      Accuracy (32k)                    Avg.
2026.03   84.13            83.63    75.06    80.94
2026.03   81.23            77.96    76.07    78.42
2026.03   80.86            77.08    74.56    77.5
2026.03   80.35            79.47    77.83    79.22
2026.03   77.23            71.28    65.99    71.5
2026.03   75.44            71.79    73.42    73.55
2026.03   74.27            74.53    69.48    72.76
2026.03   74.18            70.03    69.27    71.16
2026.03   74.06            -        -        72.33
2026.03   70.78            69.02    68.01    69.27
2026.03   68.64            66.83    66.62    67.36
2026.03   66.75            66.25    66.36    66.45
2026.03   61.16            63.3     48.3     57.59
2026.03   51.89            51.01    56.8     53.23
2026.03   51.45            49.94    46.53    49.31
2026.03   50.88            49.11    48.87    49.62
2026.03   49.87            49.62    49.37    49.62
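
In each complete row above, the last figure appears to be the mean of the three scores before it, rounded to two decimals. The page itself does not state this, so treat it as an inference. A minimal Python sketch that checks it on four rows copied from the results; the helper name `mean_matches` and the tolerance `tol` are illustrative choices, not part of the leaderboard:

```python
# Four rows copied from the results above: three scores, then the reported last figure.
rows = [
    (84.13, 83.63, 75.06, 80.94),
    (81.23, 77.96, 76.07, 78.42),
    (61.16, 63.3, 48.3, 57.59),
    (49.87, 49.62, 49.37, 49.62),
]


def mean_matches(row, tol=0.006):
    """Return True if the last value equals the mean of the others, within rounding.

    `tol` is a hypothetical tolerance that allows for two-decimal rounding.
    """
    *scores, reported = row
    return abs(sum(scores) / len(scores) - reported) <= tol


all_match = all(mean_matches(r) for r in rows)
print(all_match)
```

The row with missing scores (74.06, -, -, 72.33) cannot be checked this way.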