
Visual puzzle solving on Jigsaw R1 (test)

61.9 Accuracy (2x1)

GPT-4.1-mini

Aug 7, 2025
Updated 1mo ago

Evaluation Results

Method    Links

61.9      54.5      54.8      20.3      47.88     2025.08
51.25     50.05     52.05     9.6       40.73     2025.08
51.15     61.46     59.15     18.64     47.6      2025.08
50        50        50        12.5      40.63     2025.08
49.4      48.7      50.6      15.8      41.12     2025.08
44.9      41.9      48.6      9.7       36.28