Long Context Understanding on MRCR
[Chart: Accuracy over time, Dec 30, 2025 to Mar 23, 2026. Highlighted point: Gemini-3.0-pro, 75.3.]
Updated 25d ago
Evaluation Results
Method                                    Evaluation Mode  Date     Accuracy
Gemini-3.0-pro                            -                2026.03  75.3
LongCat-Flash Exp-Chat                    Chat             2025.12  59.7
Deepseek-v3.1                             -                2026.03  46.62
Qwen3-32B + TableLong                     -                2026.03  42.66
Qwen3-32B                                 -                2026.03  42.45
GLM 4.6                                   Chat             2025.12  42.1
Deepseek-R1-Distill-Qwen-32B + TableLong  -                2026.03  40.57
DeepSeek V3.2                             Chat             2025.12  37.1
LongCat-Flash Chat                        Chat             2025.12  34.4
Qwen2.5-32B-Instruct                      -                2026.03  33.19
Qwen2.5-32B-Instruct + TableLong          -                2026.03  33.19
Deepseek-R1-Distill-Qwen-32B              -                2026.03  31.94
Deepseek-R1-Distill-Qwen-14B + TableLong  -                2026.03  30.48
Deepseek-R1-Distill-Qwen-14B              -                2026.03  29.02
Qwen-Long-L1                              -                2026.03  27.7