Artifact Explanation on ArtiBench (test)
[Chart: leaderboard metrics ROUGE and CSS. Best ROUGE: 23.3, Qwen2.5-VL-7B + ArtiAgent, Feb 24, 2026.]
Evaluation Results
| Method                     | Settings                  | Date    | ROUGE | CSS  |
|----------------------------|---------------------------|---------|-------|------|
| Qwen2.5-VL-7B + ArtiAgent  | Fine-tuning=100K train... | 2026.02 | 23.3  | 64.3 |
| InternVL3.5-8B + ArtiAgent | Fine-tuning=100K train... | 2026.02 | 22.6  | 62.5 |
| Gemini-2.5-Pro             |                           | 2026.02 | 15.9  | 42   |
| GPT-5                      |                           | 2026.02 | 14.5  | 43.4 |
| LEGION                     | Training Split=SynthSc... | 2026.02 | 14.3  | 33.2 |
| GPT-4o                     |                           | 2026.02 | 14.3  | 43.3 |
| InternVL3.5-8B             | Fine-tuning=Vanilla       | 2026.02 | 12.6  | 25.6 |
| Qwen2.5-VL-7B              | Fine-tuning=Vanilla       | 2026.02 | 11.7  | 26.3 |