Chain-of-thought Reasoning on BBH (CoT)
Baseline: 54.42 Accuracy (BBH CoT)
[Chart: Accuracy (BBH CoT), May 28, 2026]
Updated 5d ago
Evaluation Results
Method                    Accuracy (BBH CoT)
Baseline (2026.05)        54.42
InFamilySteer (2026.05)   54.26
DenseSteer (2026.05)      54.05