Argument Quality Evaluation on WebisArgQuality 20
[Chart: F1 Score over time. Top entry: Qwen3-4B-ArgumentOnly-Target-SFT, 41.08 F1 Score, Mar 18, 2026]
Updated 1mo ago
Evaluation Results
| Method | Method details | Date | F1 Score |
|---|---|---|---|
| Qwen3-4B-ArgumentOnly-Target-SFT | Model Backbone=Qwen3-4... | 2026.03 | 41.08 |
| Qwen3-4B-Arguinas-Target-SFT | Model Backbone=Qwen3-4... | 2026.03 | 41.01 |
| Qwen3-4B-Target-SFT | Model Backbone=Qwen3-4... | 2026.03 | 39.37 |
| Qwen3-8B-EntailmentBank-Target-SFT | Model Backbone=Qwen3-8... | 2026.03 | 38.96 |
| Qwen3-4B-AAAC-Target-SFT | Model Backbone=Qwen3-4... | 2026.03 | 38.58 |
| Qwen3-4B-EntailmentBank-Target-SFT | Model Backbone=Qwen3-4... | 2026.03 | 38 |
| Qwen3-8B-Arguinas-Target-SFT | Model Backbone=Qwen3-8... | 2026.03 | 37.77 |
| Qwen3-8B-AAAC-Target-SFT | Model Backbone=Qwen3-8... | 2026.03 | 37.67 |
| Qwen3-4B-Instruct-Arguinas-SFT | Model Backbone=Qwen3-4... | 2026.03 | 36.43 |
| Qwen3-8B-ArgumentOnly-Target-SFT | Model Backbone=Qwen3-8... | 2026.03 | 35.27 |
| Qwen3-8B-Target-SFT | Model Backbone=Qwen3-8... | 2026.03 | 34.97 |
| Qwen3-4B-Instruct | Model Backbone=Qwen3-4... | 2026.03 | 34.68 |
| Qwen2.5-7B-Instruct-Arguinas-SFT | Model Backbone=Qwen2.5... | 2026.03 | 33.82 |
| Qwen2.5-7B-Instruct | Model Backbone=Qwen2.5... | 2026.03 | 22.35 |