Overall on Loong Set 4: 200K–250K Tokens
[Chart: LLM Score and EM. Highlighted point: Disco-RAG, LLM Score 54.62, Jan 7, 2026.]
Updated 4d ago
Evaluation Results
| Method | Setting | Date | LLM Score | EM |
|---|---|---|---|---|
| Disco-RAG | Base Model=Llama-3.3-70B | 2026.01 | 54.62 | 11 |
| Disco-RAG | Base Model=Qwen2.5-72B | 2026.01 | 54.47 | 10 |
| StructRAG | Condition=SOTA Results | 2026.01 | 51.42 | 10 |
| Disco-RAG | Base Model=Llama-3.1-8B | 2026.01 | 50.87 | 8 |
| Llama-3.3-70B | Condition=Standard RAG | 2026.01 | 35.61 | 7 |
| Qwen2.5-72B | Condition=Standard RAG | 2026.01 | 33.64 | 4 |
| Llama-3.1-8B | Condition=Standard RAG | 2026.01 | 33.52 | 2 |
| Llama-3.3-70B | Condition=Full Context | 2026.01 | 32.21 | 5 |
| RQ-RAG | Condition=SOTA Results | 2026.01 | 31.91 | 1 |
| Qwen2.5-72B | Condition=Full Context | 2026.01 | 30.15 | 4 |
| Llama-3.1-8B | Condition=Full Context | 2026.01 | 27.82 | 4 |
| GraphRAG | Condition=SOTA Results | 2026.01 | 23.47 | 5 |