Comparison on Loong Set 2: 50K–100K Tokens
[Chart: top LLM Score 64.34 (Disco-RAG), Jan 7, 2026. Metrics: LLM Score, EM.]
Evaluation Results
| Method | Setting | Date | LLM Score | EM |
|---|---|---|---|---|
| Disco-RAG | Base Model=Llama-3.3-70B | 2026.01 | 64.34 | 36 |
| Disco-RAG | Base Model=Qwen2.5-72B | 2026.01 | 64.06 | 30 |
| StructRAG | Condition=SOTA Results | 2026.01 | 63.71 | 36 |
| Disco-RAG | Base Model=Llama-3.1-8B | 2026.01 | 63.56 | 24 |
| Llama-3.3-70B | Condition=Standard RAG | 2026.01 | 53.37 | 22 |
| RQ-RAG | Condition=SOTA Results | 2026.01 | 50.83 | 16 |
| Qwen2.5-72B | Condition=Standard RAG | 2026.01 | 50.64 | 20 |
| Llama-3.3-70B | Condition=Full Context | 2026.01 | 47.93 | 26 |
| Llama-3.1-8B | Condition=Standard RAG | 2026.01 | 45.42 | 19 |
| Qwen2.5-72B | Condition=Full Context | 2026.01 | 44.47 | 25 |
| Llama-3.1-8B | Condition=Full Context | 2026.01 | 42.37 | 21 |
| GraphRAG | Condition=SOTA Results | 2026.01 | 14.29 | 0 |