Big-Bench Hard Reasoning on BBH

Accuracy: 45.6

MoE + L_ERC

[Chart: accuracy over time; y-axis ticks 44.248, 44.599, 44.95, 45.301; x-axis tick Dec 29, 2025]
Updated 4d ago

Evaluation Results

Date      Accuracy   Method   Links
2025.12   45.6
2025.12   44.3