Closed book Question Answering on MultiSpanQA (test)
[Chart: F1 Score by date. Top result: CoVe (factored), F1 Score 48, Sep 20, 2023. Metrics available: F1 Score, Precision, Recall.]
Updated 4d ago
Evaluation Results
Method                        Settings                   Date     F1 Score  Precision  Recall
CoVe (factored)               LLM=Llama 65B, Strateg...  2023.09  48        50         46
CoVe (joint)                  LLM=Llama 65B, Strateg...  2023.09  46        50         42
Llama 65B (Few-shot)          LLM=Llama 65B, Strateg...  2023.09  39        40         38
Llama 2 70B Chat (Zero-shot)  LLM=Llama 2 70B Chat,...   2023.09  20        13         40
Llama 2 70B Chat (CoT)        LLM=Llama 2 70B Chat,...   2023.09  17        11         37
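
The F1 scores in the results above appear consistent with the standard harmonic mean of the listed precision and recall, F1 = 2PR / (P + R). The sketch below checks that relation against the reported numbers. It assumes the leaderboard rounds to whole numbers and computes F1 from these aggregate precision and recall values; the page itself does not state this. The names `f1_score` and `rows` are illustrative, not part of any benchmark tooling.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the results above.
rows = {
    "CoVe (factored)": (50, 46, 48),
    "CoVe (joint)": (50, 42, 46),
    "Llama 65B (Few-shot)": (40, 38, 39),
    "Llama 2 70B Chat (Zero-shot)": (13, 40, 20),
    "Llama 2 70B Chat (CoT)": (11, 37, 17),
}

for method, (p, r, reported) in rows.items():
    # Assumption: reported values are rounded to the nearest integer.
    print(f"{method}: computed {f1_score(p, r):.2f}, reported {reported}")
```

Under that rounding assumption, each computed value matches the reported F1 Score. The two Llama 2 70B Chat rows show recall close to the few-shot baseline but much lower precision, which is what pulls their F1 down.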