Comparison on Loong Set 4: 200K–250K Tokens
[Chart: LLM Score. Highlighted point: Disco-RAG, 55.8 (Jan 7, 2026). Available metrics: LLM Score, Exact Match.]
Updated 4d ago
Evaluation Results
| Method | Setting | Date | LLM Score | Exact Match |
| --- | --- | --- | --- | --- |
| Disco-RAG | Base Model=Llama-3.3-70B | 2026.01 | 55.8 | 17 |
| StructRAG | Condition=SOTA Results | 2026.01 | 55.62 | 25 |
| Disco-RAG | Base Model=Qwen2.5-72B | 2026.01 | 54.97 | 15 |
| Disco-RAG | Base Model=Llama-3.1-8B | 2026.01 | 53.92 | 12 |
| RQ-RAG | Condition=SOTA Results | 2026.01 | 40.36 | 0 |
| Llama-3.3-70B | Condition=Standard RAG | 2026.01 | 34.49 | 2 |
| Qwen2.5-72B | Condition=Standard RAG | 2026.01 | 32.31 | 1 |
| Llama-3.3-70B | Condition=Full Context | 2026.01 | 32.22 | 7 |
| Llama-3.1-8B | Condition=Standard RAG | 2026.01 | 31.9 | 0 |
| Qwen2.5-72B | Condition=Full Context | 2026.01 | 28.23 | 6 |
| GraphRAG | Condition=SOTA Results | 2026.01 | 26.67 | 0 |
| Llama-3.1-8B | Condition=Full Context | 2026.01 | 25.37 | 6 |