Artifact Explanation on SynthScars (test)
Best ROUGE Score: 24.7 (LEGION), Feb 24, 2026
Metrics: ROUGE Score, CSS Score
Updated 1mo ago
Evaluation Results
Method                       Configuration               Date      ROUGE Score   CSS Score
LEGION                       Training Split=SynthSc...   2026.02   24.7          58.9
Qwen2.5-VL-7B + ArtiAgent    Fine-tuning=100K train...   2026.02   19.6          57.8
InternVL3.5-8B + ArtiAgent   Fine-tuning=100K train...   2026.02   17.9          51.3
GPT-4o                       -                           2026.02   12.5          40.4
GPT-5                        -                           2026.02   12            46.1
Qwen2.5-VL-7B                Fine-tuning=Vanilla         2026.02   11.5          36.2
Gemini-2.5-Pro               -                           2026.02   10.3          47.4
InternVL3.5-8B               Fine-tuning=Vanilla         2026.02   5             18