Long-context language modeling on LongBench V2 (test)

60 Acc (Short)

Qwen3-32B

Updated 3mo ago

Evaluation Results

Method	Links
Qwen3-32B 2026.03		60	41.1	53.1	46.8	49.2	-	47.2
CSAttention 2026.03		57	40	51.1	46.2	48.1	-	49.4
Llama-70B 2026.03		46	33	38.6	35.2	36.5	-	27.6
CSAttention 2026.03		44	35.4	39.1	34.8	36.4	-	25.9
TRIM-KV 2025.12		35.39	20.93	34.44	28.74	30.68	6.56	-
Full KV 2025.12		33.71	18.6	34.44	25.86	28.79	0	-
LocRet 2025.12		32.02	19.78	26.67	28.74	28.03	-2.64	-