
Reasoning on LiveCodeBench (pass@1 accuracy)

55.45 pass@1 Accuracy

[Chart: pass@1 accuracy against the baseline; y-axis ticks at 33.7244, 39.3647, 45.005, 50.6453; x-axis at Oct 1, 2025]
Updated 23d ago

Evaluation Results

Date      pass@1
2025.10   55.45
2025.10   50.47
2025.10   47.9
2025.10   46.68
2025.10   45.84
2025.10   41.97
2025.10   40.75
2025.10   34.56