Legal Reasoning on LegalArg
[Chart: Accuracy over time. Highlighted point: Qwen3-4B-Instruct, 65.42 Accuracy, Mar 18, 2026]
Updated 1mo ago
Evaluation Results
Method                              Method details             Date     Accuracy
Qwen3-4B-Instruct                   Model Backbone=Qwen3-4...  2026.03  65.42
Qwen3-8B-Arguinas-Target-SFT        Model Backbone=Qwen3-8...  2026.03  62.87
Qwen3-4B-Instruct-Arguinas-SFT      Model Backbone=Qwen3-4...  2026.03  61.75
Qwen2.5-7B-Instruct-Arguinas-SFT    Model Backbone=Qwen2.5...  2026.03  61.67
Qwen3-8B-AAAC-Target-SFT            Model Backbone=Qwen3-8...  2026.03  61.63
Qwen3-8B-EntailmentBank-Target-SFT  Model Backbone=Qwen3-8...  2026.03  61.24
Qwen2.5-7B-Instruct                 Model Backbone=Qwen2.5...  2026.03  60.19
Qwen3-4B-Arguinas-Target-SFT        Model Backbone=Qwen3-4...  2026.03  58.52
Qwen3-8B-ArgumentOnly-Target-SFT    Model Backbone=Qwen3-8...  2026.03  57.89
Qwen3-4B-AAAC-Target-SFT            Model Backbone=Qwen3-4...  2026.03  56.06
Qwen3-4B-EntailmentBank-Target-SFT  Model Backbone=Qwen3-4...  2026.03  55.53
Qwen3-8B-Target-SFT                 Model Backbone=Qwen3-8...  2026.03  54.7
Qwen3-4B-ArgumentOnly-Target-SFT    Model Backbone=Qwen3-4...  2026.03  48.6
Qwen3-4B-Target-SFT                 Model Backbone=Qwen3-4...  2026.03  47.24