Overall on Loong Set 3: 100K–200K Tokens
[Chart: LLM Score and EM by date. Top entry: Disco-RAG, LLM Score 58.86, Jan 7, 2026]
Updated 4d ago
Evaluation Results
Method         Setting                    Date     LLM Score  EM
Disco-RAG      Base Model=Llama-3.3-70B   2026.01  58.86      22
StructRAG      Condition=SOTA Results     2026.01  57.92      21
Disco-RAG      Base Model=Qwen2.5-72B     2026.01  57.14      18
Disco-RAG      Base Model=Llama-3.1-8B    2026.01  56.64      15
Llama-3.3-70B  Condition=Standard RAG     2026.01  45.77      13
Qwen2.5-72B    Condition=Standard RAG     2026.01  44.38      11
Llama-3.1-8B   Condition=Standard RAG     2026.01  43.42      6
Llama-3.3-70B  Condition=Full Context     2026.01  42.27      11
Qwen2.5-72B    Condition=Full Context     2026.01  42.01      10
RQ-RAG         Condition=SOTA Results     2026.01  40.93      5
Llama-3.1-8B   Condition=Full Context     2026.01  36.51      8
GraphRAG       Condition=SOTA Results     2026.01  33.28      4