Question Answering on PopQA (Pattern-based, Model Averages)
[Chart: Pattern-based Score by method. Top entry: CoRAG, 71.26 (Feb 21, 2026). Selectable metrics: Pattern-based Score, Llama Score, GPT Score, DeepSeek Score, Qwen Score, Average Score.]
Updated 1mo ago
Evaluation Results
Method       Details                   Date     Pattern-based  Llama  GPT    DeepSeek  Qwen   Average
CoRAG        backbone=Llama (fine-t... 2026.02  71.26          89.21  65.9   64.26     66.12  71.35
InstructRAG  training_dataset=Pop      2026.02  66.19          75.63  60.83  59.69     60.69  64.6
RetRobust    -                         2026.02  56.68          25.52  54.9   34.1      57.26  45.69
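The page does not say how Average Score is derived. A minimal sketch, assuming it is the unweighted mean of the five other scores in each row: the Python below recomputes that mean from the table values and prints the gap to the reported figure. The names `SCORES`, `REPORTED_AVERAGE`, and `average_score` are illustrative only and not part of the benchmark.

```python
from statistics import mean

# Scores copied from the results table, in column order:
# Pattern-based, Llama, GPT, DeepSeek, Qwen.
SCORES = {
    "CoRAG": [71.26, 89.21, 65.9, 64.26, 66.12],
    "InstructRAG": [66.19, 75.63, 60.83, 59.69, 60.69],
    "RetRobust": [56.68, 25.52, 54.9, 34.1, 57.26],
}

# Average Score column as shown on the page.
REPORTED_AVERAGE = {"CoRAG": 71.35, "InstructRAG": 64.6, "RetRobust": 45.69}


def average_score(scores):
    """Assumed definition: unweighted mean of the five per-judge scores."""
    return mean(scores)


for method, scores in SCORES.items():
    computed = average_score(scores)
    gap = abs(computed - REPORTED_AVERAGE[method])
    print(f"{method}: computed {computed:.3f}, "
          f"reported {REPORTED_AVERAGE[method]}, gap {gap:.3f}")
```

Under this assumption, each recomputed mean falls within 0.01 of the reported value. A gap that small is consistent with the displayed scores having been rounded, but it does not confirm that the site uses this formula.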