Long-context capability evaluation on RULER 4096 length

Accuracy: 94.08

Qwen3-30B A3B-Instruct

Updated 4mo ago

Evaluation Results

Method	Links	Accuracy
Qwen3-30B A3B-Instruct 2026.02		94.08
Qwen3-4B 2026.02		93.32
QUOKA 2026.02		93.25
QUOKA 2026.02		92.5
Smollm3 2026.02		91.12
QUOKA 2026.02		89.6
Qwen2.5-3B 2026.02		89.56
Llama3.2-3B 2026.02		87.5
QUOKA 2026.02		86.94
QUOKA 2026.02		86.07
GPT-OSS-20B 2026.02		79.35
QUOKA 2026.02		77.4