Reasoning on BBH (unseen)
Best Total Average Score: 42.38 (COT-T5-11B, May 23, 2023)
[Chart: Total Average Score over time; other metrics listed: Accuracy, EM]
Updated 4d ago
Evaluation Results
| Method | Settings | Date | Total Average Score | Accuracy | EM |
| --- | --- | --- | --- | --- | --- |
| COT-T5-11B | Zero-shot=true, Model... | 2023.05 | 42.38 | - | - |
| FLAN-T5-11B | Zero-shot=true, Model... | 2023.05 | 39.78 | - | - |
| T5-11B + COT FT | Zero-shot=true, Model... | 2023.05 | 39.54 | - | - |
| GPT-3 (175B) | Zero-shot=true, Model... | 2023.05 | 38.3 | - | - |
| COT-T5-3B | Zero-shot=true, Model... | 2023.05 | 37.29 | - | - |
| T5-3B + COT FT | Zero-shot=true, Model... | 2023.05 | 36.74 | - | - |
| FLAN-T5-3B | Zero-shot=true, Model... | 2023.05 | 35.6 | - | - |
| T0-11B | Zero-shot=true, Model... | 2023.05 | 32.7 | - | - |
| TK-INSTRUCT-11B | Zero-shot=true, Model... | 2023.05 | 32.16 | - | - |
| TK-INSTRUCT-3B | Zero-shot=true, Model... | 2023.05 | 29.88 | - | - |
| T0-3B | Zero-shot=true, Model... | 2023.05 | 27.05 | - | - |
| T5-LM-3B | Zero-shot=true, Model... | 2023.05 | 26.82 | - | - |