Visual Reasoning on TRACE (test)

LTE-32B: 0.346 BLEU-1

[Chart: score axis with ticks at 0.08704, 0.15427, 0.2215, 0.28873; data point dated Oct 5, 2025]
Updated 1mo ago

Evaluation Results

Date      BLEU-1   (unlabeled)   (unlabeled)
2025.10   0.346    0.211         0.327
2025.10   0.314    0.209         0.291
2025.10   0.303    0.182         0.158
2025.10   0.299    0.183         0.147
2025.10   0.294    0.153         0.315
2025.10   0.282    0.149         0.295
2025.10   0.28     0.146         0.301
2025.10   0.276    0.164         0.252
2025.10   0.275    0.136         0.269
2025.10   0.187    0.095         0.153
2025.10   0.159    0.076         0.13
2025.10   0.156    0.042         0.129
2025.10   0.11     0.032         0.073
2025.10   0.105    0.043         0.109
2025.10   0.102    0.043         0.079
2025.10   0.102    0.058         0.054
2025.10   0.097    0.056         0.067