Question Answering on PubMedQA (Acc, TNFT, Tokens, Delay)
[Figure: accuracy chart. Highlighted point: Vanilla, 73.6, May 12, 2026. Selectable metrics: Accuracy, TNFT, Token Count, Delay.]
Updated 21d ago
Evaluation Results
| Method  | Backbone   | Date    | Accuracy | TNFT   | Token Count | Delay |
|---------|------------|---------|----------|--------|-------------|-------|
| Vanilla | Qwen3-4B   | 2026.05 | 73.6     | 404.67 | 1,420.14    | 21.73 |
| Ours    | Qwen3-4B   | 2026.05 | 73.5     | 0      | 648.96      | 13.8  |
| Base    | Qwen3-4B   | 2026.05 | 72.9     | 389.6  | 1,323.75    | 20.98 |
| Ours    | Qwen3-1.7B | 2026.05 | 70.5     | 0      | 658.01      | 10.96 |
| Vanilla | Qwen3-1.7B | 2026.05 | 70.4     | 404.67 | 1,422.91    | 15.6  |
| Base    | Qwen3-1.7B | 2026.05 | 69.8     | 389.6  | 1,263.91    | 14.2  |