Reasoning Quality Assessment over three runs, 1-5 scale (blind evaluation)

4.6 Recursion Depth

CoT

Mar 25, 2026
Updated 23d ago

Evaluation Results

Method  Links
2026.03
4.63.14.44.254.74.33
2026.03
42.94.8454.64.2