
BBH (Big-Bench-Hard)

Benchmarks

Task Name          Dataset Name                 SOTA Result    Trend
General Reasoning  BBH (Big-Bench-Hard) (test)  Accuracy 81.8  20