Long-context Question Answering on DocMath
Top result: GLM-4.1V-9B-Thinking VERA, F1 Score 29.02 (Feb 9, 2026)
Updated 4d ago
Evaluation Results
Method                              RAG Strategy  Date     F1 Score
GLM-4.1V-9B-Thinking VERA           Attention...  2026.02  29.02
GLM-4.1V-9B-Thinking Random RAG     Random RAG    2026.02  21.99
GLM-4.1V-9B-Thinking ColPali RAG    ColPali RAG   2026.02  21.18
GLM-4.1V-9B-Thinking Embedding RAG  Embedding...  2026.02  17.49
GLM-4.1V-9B-Thinking                Direct (N...  2026.02  15.71
Glyph                               Direct (N...  2026.02  13.61
Qwen3-VL-8B-Instruct VERA           Attention...  2026.02  9.45
Qwen3-VL-8B-Instruct Random RAG     Random RAG    2026.02  5.61
Qwen3-VL-8B-Instruct OCR RAG        OCR RAG       2026.02  4.72
Qwen3-VL-8B-Instruct                Direct (N...  2026.02  3.47