Long-Context Reasoning on BrowsCompLong
[Chart: Accuracy by date. Top result: Gemini-3.0-pro, 88.07 (Mar 23, 2026)]
Updated 25d ago
Evaluation Results
Method                                      Date      Accuracy
Gemini-3.0-pro                              2026.03   88.07
Deepseek-R1-Distill-Qwen-14B + TableLong    2026.03   74.31
Deepseek-R1-Distill-Qwen-32B + TableLong    2026.03   74.31
Qwen-Long-L1                                2026.03   69.93
Qwen3-32B + TableLong                       2026.03   65.44
Deepseek-R1-Distill-Qwen-32B                2026.03   64.22
Qwen2.5-32B-Instruct + TableLong            2026.03   60.55
Qwen3-32B                                   2026.03   59.33
Deepseek-v3.1                               2026.03   56.27
Qwen2.5-32B-Instruct                        2026.03   52.56
Deepseek-R1-Distill-Qwen-14B                2026.03   51.38