Overall on Loong Set 2: 50K–100K Tokens
[Chart: LLM Score and EM by date. Top LLM Score: 0.6361 (Disco-RAG, Jan 7, 2026).]
Updated 4d ago
Evaluation Results

Method         Setting                    Date     LLM Score  EM
Disco-RAG      Base Model=Llama-3.3-70B   2026.01  0.6361     28
Disco-RAG      Base Model=Qwen2.5-72B     2026.01  0.6132     25
StructRAG      Condition=SOTA Results     2026.01  0.6095     24
Disco-RAG      Base Model=Llama-3.1-8B    2026.01  0.5903     23
Llama-3.3-70B  Condition=Standard RAG     2026.01  0.5377     18
Qwen2.5-72B    Condition=Standard RAG     2026.01  0.5033     17
Llama-3.1-8B   Condition=Standard RAG     2026.01  0.4912     16
Llama-3.3-70B  Condition=Full Context     2026.01  0.4824     17
RQ-RAG         Condition=SOTA Results     2026.01  0.4709     10
Qwen2.5-72B    Condition=Full Context     2026.01  0.4661     13
Llama-3.1-8B   Condition=Full Context     2026.01  0.4378     14
GraphRAG       Condition=SOTA Results     2026.01  0.3306     3