Reasoning on BigBench Extra Hard
[Chart: mean@4 over time. Best result: ReSyn, 14.3 mean@4 (Feb 23, 2026).]
Evaluation Results
Method              Date      mean@4
ReSyn               2026.02   14.3
Majority            2026.02   13.1
Qwen 2.5-Instruct   2026.02   11.2
Chance              2026.02   8.83