Long-context evaluation on LongBench v2

Overall Score: 59.76

Qwen3-235B-A22B-Thinking

Updated 2mo ago

Evaluation Results

Method (date)	Links	Overall Score
Qwen3-235B-A22B-Thinking 2026.05		59.76	-	-	-	-	-
Qwen3-30B-A3B-Thinking + ACC 2026.05		48.9	-	-	-	-	-
Qwen3-30B-A3B-Thinking 2026.05		47.87	-	-	-	-	-
CSAttention 2026.03		31.2	34.4	29.3	37.8	25.1	32.4
Full 2026.03		31	35.4	28.3	37.2	26	30.6
H2O 2026.03		29.9	32.9	28	33.8	27.9	31.5
PQCache 2026.03		29.8	33.3	27.7	37.8	22.3	31.5
MagicPig 2026.03		29.2	29.5	29	31.8	26.9	29.4
SparQ 2026.03		26.2	27.6	25.4	30	22.3	27.8