
Complex Reasoning on TOMATO

Accuracy: 38.1

Qwen3-VL-8B + SynRL

Mar 18, 2026
Updated 1mo ago

Evaluation Results

Date       Accuracy
2026.03    38.1
2026.03    36.7
2026.03    33.2
2026.03    32.1
2026.03    30.1
2026.03    25.5
2026.03    25.3
2026.03    25.1
2026.03    10.1