Reasoning on BOOTCAMP-EVAL

60.2 Graphical Puzzles

Qwen2.5-32B-Instruct

11.52824.16436.849.436 Aug 12, 2025
Updated 13d ago

Evaluation Results

Method  Links

2025.08  60.2  55.4  61.5  56    60.3  87.7  56.6  69.3  61.1
2025.08  52.4  61.3  64.1  47.5  63.7  87.5  44.4  77.3  59.5
         49.5  55.8  54.5  54    56.9  81.3  49.1  56.3  54.5
         47.2  46.6  50.9  61.1  57    75.7  45.8  67.7  51
2025.08  45.9  53.7  53.7  45.8  56.4  81.5  51.5  39.7  52.1
2025.08  45.2  53.8  56.7  37.6  60.4  72.5  42.3  44.7  51.4
2025.08  41.8  61.3  70.4  36.7  70.6  80.8  45.1  52    55.7
         33    57.4  50.4  36.2  50.7  75.2  46.8  37    46.5
         31.5  20.3  49.8  25.4  32.8  52    27.7  25    31.5
2025.08  25.7  60.1  67.4  18.5  36.8  59.5  48.4  72    46.9
         13.4  30.4  43.3  12.4  10.9  24.2  29.9  10    24.4