Long-Context Understanding on Long-Context Understanding

Score: 66.8

GPT-5

Updated 5mo ago

Evaluation Results

Method	Links	Score	Delta
GPT-5 2026.01		66.8	0.5
GPT-5 2026.01		66.3	-
DeepSeek-V3.2-Thinking 2026.01		63.8	3
Claude-Sonnet-4.5-Think 2026.01		61.8	1.3
Kimi-K2-Thinking 2026.01		61.8	1.5
DeepSeek-V3.2-Thinking 2026.01		60.8	-
Claude-Sonnet-4.5-Think 2026.01		60.5	-
Kimi-K2-Thinking 2026.01		60.3	-
MiniMax-M2 2026.01		58.5	6.2
MiniMax-M2 2026.01		52.3	-
Qwen3-Coder-30B-A3B 2026.01		24.6	-
Qwen3-Coder-30B-A3B 2026.01		23.9	-0.7
Qwen3-4B-Instruct-2507 2026.01		10.5	1.9
Qwen3-4B-Instruct-2507 2026.01		8.6	-