
Long Context Reasoning on AA-LCR

Top score: 70.2 (Kimi-K2.6)

Updated 1mo ago

Evaluation Results

Method	Links	Score
Kimi-K2.6 2026.06		70.2
MiniMax-2.7 2026.06		69.8
Qwen-3.5 2026.06		68.3
DS-v4-Pro 2026.06		67.3
GLM-5.1 2026.06		66.9
Nemotron 3 Ultra 2026.06		65.4
DS-v4-Flash 2026.06		62.7
LoongRL 2026.05		53.5
LONGTRACERL 2026.05		53.5
DocQA 2026.05		50.2
LongRLVR 2026.05		48.5
gpt-oss-120b 2026.01		48.3
LONGTRACERL-GRPO 2026.05		48.2
Base 2026.05		47
K-EXAONE 2026.01		45.2
gpt-oss-120b 2026.01		45
LONGTRACERL 2026.05		41.8
LongRLVR 2026.05		37.5
GLM-4.5-Air 2026.01		37.3
Solar Open 2026.01		35
LONGTRACERL-GRPO 2026.05		34
Base 2026.05		33.2
DeepSeek-V3.2 2026.01		32
LoongRL 2026.05		32
Qwen3-235B-A22B Instruct-2507 2026.01		31.2
DocQA 2026.05		28.8
Qwen3.5-2B 2026.06		25.6
Cosmos3-Edge 2026.06		22.8
LONGTRACERL-GRPO 2026.05		15
LONGTRACERL 2026.05		15
Base 2026.05		13.8
LongRLVR 2026.05		12.2
LoongRL 2026.05		10.2
DocQA 2026.05		9.5
EXAONE 4.0 2026.01		8