Multi-document Question Answering on HotpotQA (Acc, throughput)
[Chart: Accuracy by date. Highest accuracy: 45 (Dense), Jul 29, 2025. Metric views: Accuracy, Throughput.]
Updated 19d ago
Evaluation Results
| Method | Model | Date | Accuracy | Throughput |
|---|---|---|---|---|
| Dense | QwQ 32B | 2025.07 | 45 | 297.34 |
| Dense | Phi 4 reasoning... | 2025.07 | 43 | 525.39 |
| ReasonCache | QwQ 32B | 2025.07 | 41 | 423.23 |
| ReasonCache | Phi 4 reasoning... | 2025.07 | 40 | 695.88 |
| Dense | DeepSeek R1 Dist... | 2025.07 | 37.5 | 381.51 |
| ReasonCache | DeepSeek R1 Dist... | 2025.07 | 36 | 542.43 |
| Random | QwQ 32B | 2025.07 | 22 | 421.89 |
| Random | Phi 4 reasoning... | 2025.07 | 21 | 699.87 |
| Random | DeepSeek R1 Dist... | 2025.07 | 18 | 537.85 |
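The results above trade accuracy against throughput, so it can help to read each method relative to the Dense baseline on the same model. The following is a minimal sketch of that arithmetic, not part of the leaderboard itself: the `ROWS` list is transcribed by hand from the results, the `compare` helper is a hypothetical name introduced for illustration, and model names are kept exactly as displayed (including truncation). The page does not state throughput units, so only the unitless ratio is computed.

```python
# Rows transcribed from the results: (method, model, accuracy, throughput).
# Model names are kept as displayed on the page, truncation included.
ROWS = [
    ("Dense", "QwQ 32B", 45.0, 297.34),
    ("Dense", "Phi 4 reasoning...", 43.0, 525.39),
    ("ReasonCache", "QwQ 32B", 41.0, 423.23),
    ("ReasonCache", "Phi 4 reasoning...", 40.0, 695.88),
    ("Dense", "DeepSeek R1 Dist...", 37.5, 381.51),
    ("ReasonCache", "DeepSeek R1 Dist...", 36.0, 542.43),
    ("Random", "QwQ 32B", 22.0, 421.89),
    ("Random", "Phi 4 reasoning...", 21.0, 699.87),
    ("Random", "DeepSeek R1 Dist...", 18.0, 537.85),
]


def compare(rows, baseline="Dense", method="ReasonCache"):
    """Return {model: (accuracy_delta, throughput_ratio)} for method vs. baseline.

    accuracy_delta is in accuracy points (method minus baseline);
    throughput_ratio is method throughput divided by baseline throughput.
    """
    by_key = {(m, model): (acc, thr) for m, model, acc, thr in rows}
    result = {}
    for (m, model), (acc, thr) in by_key.items():
        if m != method:
            continue
        base_acc, base_thr = by_key[(baseline, model)]
        result[model] = (acc - base_acc, thr / base_thr)
    return result


if __name__ == "__main__":
    for name in ("ReasonCache", "Random"):
        print(name)
        for model, (d_acc, ratio) in compare(ROWS, method=name).items():
            print(f"  {model}: accuracy {d_acc:+.1f} points, throughput x{ratio:.2f}")
```

Passing `method="Random"` gives the same comparison for the Random rows, which makes it easier to see how much accuracy each method gives up for a similar throughput gain.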