Long-Context Understanding on Long-Context Understanding
[Chart: Score over time. Top score: 66.8 (GPT-5), Jan 22, 2026. Updated 4d ago.]
Evaluation Results
| Method | Configuration | Date | Score | Performance Delta |
| --- | --- | --- | --- | --- |
| GPT-5 | Generation Mode=LLM-in... | 2026.01 | 66.8 | 0.5 |
| GPT-5 | Generation Mode=Standa... | 2026.01 | 66.3 | - |
| DeepSeek-V3.2-Thinking | Generation Mode=LLM-in... | 2026.01 | 63.8 | 3 |
| Claude-Sonnet-4.5-Think | Generation Mode=LLM-in... | 2026.01 | 61.8 | 1.3 |
| Kimi-K2-Thinking | Generation Mode=LLM-in... | 2026.01 | 61.8 | 1.5 |
| DeepSeek-V3.2-Thinking | Generation Mode=Standa... | 2026.01 | 60.8 | - |
| Claude-Sonnet-4.5-Think | Generation Mode=Standa... | 2026.01 | 60.5 | - |
| Kimi-K2-Thinking | Generation Mode=Standa... | 2026.01 | 60.3 | - |
| MiniMax-M2 | Generation Mode=LLM-in... | 2026.01 | 58.5 | 6.2 |
| MiniMax-M2 | Generation Mode=Standa... | 2026.01 | 52.3 | - |
| Qwen3-Coder-30B-A3B | Generation Mode=Standa... | 2026.01 | 24.6 | - |
| Qwen3-Coder-30B-A3B | Generation Mode=LLM-in... | 2026.01 | 23.9 | -0.7 |
| Qwen3-4B-Instruct-2507 | Generation Mode=LLM-in... | 2026.01 | 10.5 | 1.9 |
| Qwen3-4B-Instruct-2507 | Generation Mode=Standa... | 2026.01 | 8.6 | - |
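The page does not define Performance Delta, but the listed values appear consistent with each model's score in the "LLM-in..." generation mode minus its score in the "Standa..." mode. The following minimal Python sketch checks that reading against the results above. The dictionary layout and the function name `performance_delta` are illustrative assumptions, and the mode labels are kept truncated exactly as they appear in the source.

```python
# Scores copied from the results above. The mode labels are truncated in
# the source ("LLM-in...", "Standa...") and are kept as-is here.
scores = {
    "GPT-5": {"LLM-in...": 66.8, "Standa...": 66.3},
    "DeepSeek-V3.2-Thinking": {"LLM-in...": 63.8, "Standa...": 60.8},
    "Claude-Sonnet-4.5-Think": {"LLM-in...": 61.8, "Standa...": 60.5},
    "Kimi-K2-Thinking": {"LLM-in...": 61.8, "Standa...": 60.3},
    "MiniMax-M2": {"LLM-in...": 58.5, "Standa...": 52.3},
    "Qwen3-Coder-30B-A3B": {"LLM-in...": 23.9, "Standa...": 24.6},
    "Qwen3-4B-Instruct-2507": {"LLM-in...": 10.5, "Standa...": 8.6},
}


def performance_delta(by_mode, baseline="Standa...", variant="LLM-in..."):
    """Assumed definition: variant score minus baseline score, to one decimal."""
    return round(by_mode[variant] - by_mode[baseline], 1)


# Compute the delta for every model and print it with an explicit sign.
deltas = {model: performance_delta(by_mode) for model, by_mode in scores.items()}
for model, delta in deltas.items():
    print(f"{model}: {delta:+.1f}")
```

Under this assumed definition, a negative value (as for Qwen3-Coder-30B-A3B) means the "LLM-in..." mode scored below the "Standa..." baseline for that model.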