
Reasoning failure prediction on CodeLingua (L3)

Accuracy: 76

thought-tree-based classifier

Apr 18, 2026
Updated 1mo ago

Evaluation Results

Method, Links
2026.04: 76
2026.04: 65