General Reasoning on LiveBench

Accuracy: 53.47

PerSyn

[Chart: Accuracy over time, Oct 7, 2025 to Oct 13, 2025]
Updated 14d ago

Evaluation Results

Date      Accuracy
2025.10   53.47
2025.10   52.06
2025.10   51.9
2025.10   50.59
2025.10   50
2025.10   49.08
2025.10   48.85
2025.10   47.5
2025.10   47.5
2025.10   47.33
2025.10   46.9
2025.10   46.34
2025.10   46.2
2025.10   46.2
2025.10   46.2
2025.10   46.2
2025.10   46.15
2025.10   45.6
2025.10   45
2025.10   45
2025.10   44.26
2025.10   43.02
2025.10   42.5
2025.10   41.9
2025.10   41.9
2025.10   39.4
2025.10   38.8
2025.10   38.1
2025.10   38.1
2025.10   37.84
2025.10   37.5
2025.10   37.5
2025.10   37.13
2025.10   36.9
2025.10   36.63
2025.10   36.09
2025.10   35.86
2025.10   35.42
2025.10   34.63
2025.10   33.8
2025.10   33.49
2025.10   33.25
2025.10   33.1
2025.10   32.87
2025.10   31.2
2025.10   30.6
2025.10   30.6
2025.10   29.4
2025.10   28.1
2025.10   15.6