Long-context Question Answering on LongBench Pro
[Chart: F1 Score by date. Top result: 34.2 F1 Score, GLM-4.1V-9B-Thinking VERA, Feb 9, 2026.]
Evaluation Results
| Method | RAG Strategy | Links (date) | F1 Score |
|---|---|---|---|
| GLM-4.1V-9B-Thinking VERA | Attention... | 2026.02 | 34.2 |
| GLM-4.1V-9B-Thinking | Direct (N... | 2026.02 | 31.06 |
| Glyph | Direct (N... | 2026.02 | 28.94 |
| GLM-4.1V-9B-Thinking Random RAG | Random RAG | 2026.02 | 28.84 |
| Qwen3-VL-8B-Instruct VERA | Attention... | 2026.02 | 28.74 |
| GLM-4.1V-9B-Thinking ColPali RAG | ColPali RAG | 2026.02 | 28.58 |
| Qwen3-VL-8B-Instruct Random RAG | Random RAG | 2026.02 | 28 |
| Qwen3-VL-8B-Instruct | Direct (N... | 2026.02 | 27.56 |
| Qwen3-VL-8B-Instruct OCR RAG | OCR RAG | 2026.02 | 26.4 |
| GLM-4.1V-9B-Thinking Embedding RAG | Embedding... | 2026.02 | 26.29 |