Closed-set Question Answering on PubHealth
[Chart: Accuracy over time. Top result: SELF-RAG, 74.5 accuracy (Oct 17, 2023)]
Evaluation Results
| Method          | Settings                  | Date    | Accuracy |
|-----------------|---------------------------|---------|----------|
| SELF-RAG        | Scale=13B, Retrieval=Yes  | 2023.10 | 74.5     |
| SELF-RAG        | Scale=7B, Retrieval=Yes   | 2023.10 | 72.4     |
| ChatGPT         | Retrieval=No, Propriet... | 2023.10 | 70.1     |
| Llama2-FT       | Scale=7B, Retrieval=Ye... | 2023.10 | 64.3     |
| Alpaca          | Scale=13B, Retrieval=No   | 2023.10 | 55.5     |
| Ret-ChatGPT     | Retrieval=Yes, Proprie... | 2023.10 | 54.7     |
| Ret-Llama2-Chat | Scale=13B, Retrieval=Y... | 2023.10 | 52.1     |
| Alpaca          | Scale=13B, Retrieval=Yes  | 2023.10 | 51.1     |
| Alpaca          | Scale=7B, Retrieval=No    | 2023.10 | 49.8     |
| Llama2-Chat     | Scale=13B, Retrieval=N... | 2023.10 | 49.4     |
| Alpaca          | Scale=7B, Retrieval=Yes   | 2023.10 | 40.2     |
| Llama2          | Scale=7B, Retrieval=No    | 2023.10 | 34.2     |
| Llama2          | Scale=13B, Retrieval=Yes  | 2023.10 | 30.2     |
| Llama2          | Scale=7B, Retrieval=Yes   | 2023.10 | 30       |
| Llama2          | Scale=13B, Retrieval=No   | 2023.10 | 29.4     |
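
Several methods are listed both with and without retrieval, so the retrieval effect can be read off as a simple difference. The sketch below is illustrative only: it copies the accuracy values from the results above, restricted to rows whose settings are fully visible (not truncated with "..."). The names `ROWS` and `retrieval_deltas` are hypothetical helpers, not part of the leaderboard.

```python
# Accuracy values copied from the results above. Only rows whose settings
# are fully visible (not truncated with "...") and that have both a
# Retrieval=Yes and a Retrieval=No entry are included.
# Tuple layout: (method, scale, uses_retrieval, accuracy)
ROWS = [
    ("Alpaca", "13B", False, 55.5),
    ("Alpaca", "13B", True, 51.1),
    ("Alpaca", "7B", False, 49.8),
    ("Alpaca", "7B", True, 40.2),
    ("Llama2", "7B", False, 34.2),
    ("Llama2", "7B", True, 30.0),
    ("Llama2", "13B", False, 29.4),
    ("Llama2", "13B", True, 30.2),
]


def retrieval_deltas(rows):
    """Return {(method, scale): accuracy with retrieval minus without}."""
    by_key = {}
    for method, scale, uses_retrieval, accuracy in rows:
        by_key.setdefault((method, scale), {})[uses_retrieval] = accuracy
    return {
        key: round(accs[True] - accs[False], 1)
        for key, accs in by_key.items()
        if True in accs and False in accs
    }


if __name__ == "__main__":
    for (method, scale), delta in retrieval_deltas(ROWS).items():
        print(f"{method} {scale}: {delta:+.1f}")
```

On these four pairs, only Llama2 13B scores higher with retrieval (+0.8 points); the other three score lower.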