Video-grounded understanding and actionable reasoning on Pause-and-think-B 1.0 (test)

Overall Score: 64.24

GPT-5.2

May 30, 2026
Updated 1d ago

Evaluation Results

Method    Links
2026.05   64.24   55.13   86.52
2026.05   59.27   52.63   74.19
2026.05   58.89   53.03   72.26
2026.05   58      55.02   64.84
2026.05   56.33   51.67   67.03
2026.05   55.39   49.05   69.85
2026.05   54.67   55.77   52.17
2026.05   53.17   45.24   71.68
2026.05   52.67   52.38   53.33
2026.05   52.33   50.95   55.56
2026.05   52      50.48   55.56
2026.05   51.33   50      54.44
2026.05   50.67   43.81   66.67
2026.05   50.33   42.58   68.13
2026.05   49      46.86   53.76
2026.05   45.67   46.19   44.44
2026.05   44.67   48.1    36.67
2026.05   39.33   36.36   46.15
2026.05   36.11   31.84   46.58
2026.05   24.6    18.99   38.36