Artifact Explanation on SynthScars (test)
[Chart: ROUGE Score by date. Top result: LEGION, 24.7 (Feb 24, 2026). Metrics: ROUGE Score, CSS Score.]
Updated 4d ago
Evaluation Results
| Method | Configuration | Date | ROUGE Score | CSS Score |
|---|---|---|---|---|
| LEGION | Training Split=SynthSc... | 2026.02 | 24.7 | 58.9 |
| Qwen2.5-VL-7B + ArtiAgent | Fine-tuning=100K train... | 2026.02 | 19.6 | 57.8 |
| InternVL3.5-8B + ArtiAgent | Fine-tuning=100K train... | 2026.02 | 17.9 | 51.3 |
| GPT-4o | - | 2026.02 | 12.5 | 40.4 |
| GPT-5 | - | 2026.02 | 12 | 46.1 |
| Qwen2.5-VL-7B | Fine-tuning=Vanilla | 2026.02 | 11.5 | 36.2 |
| Gemini-2.5-Pro | - | 2026.02 | 10.3 | 47.4 |
| InternVL3.5-8B | Fine-tuning=Vanilla | 2026.02 | 5 | 18 |