Reasoning and Language Understanding on BigBench Emergent Suite (BBES)
[Chart: Navigate score by date. Labeled point: U-PaLM 540B, 67, Oct 20, 2022.]
Evaluation Results
Methods: U-PaLM 540B (Prompting=5-shot, Numb...; 2022.10) and PaLM 540B (Prompting=5-shot, Numb...; 2022.10).

| Task | U-PaLM 540B | PaLM 540B |
|---|---|---|
| Navigate | 67 | 55.3 |
| StrategyQA | 78.3 | 73.9 |
| CrassAI | 100 | 97.7 |
| LogicalSequence | 86.5 | 92.3 |
| VitamincFactVerification | 73.9 | 70.2 |
| UnderstandingFables | 78.4 | 75.7 |
| IdentifyOddMetaphor | 87.5 | 87.2 |
| Hyperbaton | 59.9 | 54.2 |
| CausalJudgment | 68.4 | 65.3 |
| EnglishProverbs | 87.5 | 91.2 |
| GeometricShapes | 49.3 | 44 |
| PhysicsQuestions | 12.5 | 7.6 |
| Snarks | 86.1 | 69.1 |
| AnalogicalSimilarity | 37.5 | 36.5 |
| IPA_NLI | 68 | 65.9 |
| MovieDialogSameOrDifferent | 68.8 | 64.8 |
| TimeDial | 81.2 | 78.3 |
| QuestionSelection | 59.8 | 54.8 |
| LogicalFallacyDetection | 81.4 | 80.3 |
| UnitInterpretation | 51 | 47 |
| LanguageIdentification | 38.9 | 36 |
| Average Score | 67.7 | 64.3 |
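
The page does not say how Average Score is aggregated. The sketch below assumes it is the unweighted arithmetic mean of the 21 per-task scores listed above, taken in the order shown (Navigate first, LanguageIdentification last). The function name `unweighted_average` and the list names are illustrative, not part of the leaderboard.

```python
from statistics import fmean

# Per-task scores copied from the results above, in listed order
# (Navigate first, LanguageIdentification last).
U_PALM_540B = [
    67, 78.3, 100, 86.5, 73.9, 78.4, 87.5, 59.9, 68.4, 87.5, 49.3,
    12.5, 86.1, 37.5, 68, 68.8, 81.2, 59.8, 81.4, 51, 38.9,
]
PALM_540B = [
    55.3, 73.9, 97.7, 92.3, 70.2, 75.7, 87.2, 54.2, 65.3, 91.2, 44,
    7.6, 69.1, 36.5, 65.9, 64.8, 78.3, 54.8, 80.3, 47, 36,
]


def unweighted_average(scores):
    """Assumed aggregation: plain arithmetic mean over all tasks."""
    return fmean(scores)


if __name__ == "__main__":
    # Compare the recomputed mean with the Average Score shown on the page.
    for name, scores, listed in (
        ("U-PaLM 540B", U_PALM_540B, 67.7),
        ("PaLM 540B", PALM_540B, 64.3),
    ):
        mean = unweighted_average(scores)
        print(f"{name}: recomputed {mean:.2f}, listed {listed}")
```

Under this assumption the U-PaLM 540B mean rounds to the listed 67.7. The PaLM 540B mean comes out near 64.16 rather than the listed 64.3, so the page may average unrounded per-task scores or use a slightly different aggregation.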