Question Answering on PubMedQA Retrieval OS Setting (test)
[Chart: Accuracy and F1 Score over time. Best result: OS-8B, 76.4 Accuracy, Apr 2, 2026]
Updated 16d ago
Evaluation Results
Method                    Details                     Date      Accuracy  F1 Score
OS-8B                     -                           2026.04   76.4      -
LoRA FT 2ep 10k context   Fine-Tuning=LoRA, Epoc...   2026.04   58.01     55.83
Full FT 5ep 16k context   Fine-Tuning=Full, Epoc...   2026.04   58.01     56.21
Llama 3.2 3B Instruct     Model=Llama 3.2, Param...   2026.04   49.11     49.04
Llama 3.1 8B Instruct     Model=Llama 3.1, Param...   2026.04   45.91     43.35