Reasoning Quality Assessment on Three Runs, 1-5 Scale (Blind Evaluation)
[Chart: Recursion Depth score over time (1-5 scale); CoT at 4.6 on Mar 25, 2026. Selectable metrics: Recursion Depth, Dormant Thought Management, Cross-Domain Synthesis, Memory Utilisation, Structured Output, Solution Quality, Overall Score.]
Updated 23d ago
Evaluation Results
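| Method | Date | Recursion Depth | Dormant Thought Management | Cross-Domain Synthesis | Memory Utilisation | Structured Output | Solution Quality | Overall Score |
|--------|---------|-----|-----|-----|-----|---|-----|------|
| CoT    | 2026.03 | 4.6 | 3.1 | 4.4 | 4.2 | 5 | 4.7 | 4.33 |
| EMoT   | 2026.03 | 4   | 2.9 | 4.8 | 4   | 5 | 4.6 | 4.2  |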
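The page does not say how Overall Score is derived from the six individual metrics. As a quick sanity check, the sketch below assumes it is a plain unweighted mean (an assumption, not something the source confirms) and compares that against the reported values. The names `scores`, `reported_overall`, and `unweighted_overall` are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Per-metric scores copied from the results above (1-5 scale), in this order:
# Recursion Depth, Dormant Thought Management, Cross-Domain Synthesis,
# Memory Utilisation, Structured Output, Solution Quality.
scores = {
    "CoT": [4.6, 3.1, 4.4, 4.2, 5.0, 4.7],
    "EMoT": [4.0, 2.9, 4.8, 4.0, 5.0, 4.6],
}

# Overall Score values as reported on the page.
reported_overall = {"CoT": 4.33, "EMoT": 4.2}


def unweighted_overall(values):
    """ASSUMPTION: Overall Score is the plain mean of the six metrics."""
    return mean(values)


for method, values in scores.items():
    computed = unweighted_overall(values)
    gap = abs(computed - reported_overall[method])
    print(f"{method}: computed {computed:.3f}, "
          f"reported {reported_overall[method]}, gap {gap:.3f}")
```

Under this assumption the computed means land close to the reported Overall Scores for both methods. A small gap may come from how the page rounds values for display, so this does not prove that the site uses an unweighted mean.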