Chain-of-reasoning on Loong Set 1: 10K–50K Tokens

70.31 LLM Score

Llama-3.3-70B

Updated 5mo ago

Evaluation Results

Method	Links	LLM Score
Llama-3.3-70B 2026.01		70.31	37
Disco-RAG 2026.01		68.3	38
Disco-RAG 2026.01		68	34
StructRAG 2026.01		67.84	34
Disco-RAG 2026.01		67.73	35
Qwen2.5-72B 2026.01		66.51	36
Llama-3.3-70B 2026.01		66.48	36
Llama-3.1-8B 2026.01		65.66	37
Qwen2.5-72B 2026.01		64.67	34
RQ-RAG 2026.01		58.96	25
Llama-3.1-8B 2026.01		58.76	32
GraphRAG 2026.01		54.29	43