Comparison on Loong Set 1: 10K–50K Tokens
[Chart: LLM Score and EM. Top result: Disco-RAG, LLM Score 75.65, Jan 7, 2026.]
Updated 4d ago
Evaluation Results
| Method | Setting | Date | LLM Score | EM |
|---|---|---|---|---|
| Disco-RAG | Base Model=Llama-3.3-70B | 2026.01 | 75.65 | 45 |
| StructRAG | Condition=SOTA Results | 2026.01 | 75.58 | 47 |
| Disco-RAG | Base Model=Qwen2.5-72B | 2026.01 | 74.39 | 41 |
| Disco-RAG | Base Model=Llama-3.1-8B | 2026.01 | 73.57 | 37 |
| Llama-3.3-70B | Condition=Standard RAG | 2026.01 | 65.32 | 39 |
| Llama-3.3-70B | Condition=Full Context | 2026.01 | 61.33 | 35 |
| Qwen2.5-72B | Condition=Standard RAG | 2026.01 | 61.29 | 35 |
| Llama-3.1-8B | Condition=Standard RAG | 2026.01 | 60.61 | 26 |
| Qwen2.5-72B | Condition=Full Context | 2026.01 | 57.21 | 33 |
| Llama-3.1-8B | Condition=Full Context | 2026.01 | 56.06 | 36 |
| RQ-RAG | Condition=SOTA Results | 2026.01 | 48.16 | 5 |
| GraphRAG | Condition=SOTA Results | 2026.01 | 27.6 | 0 |