Causal Judgment (test)
Top result: 76.3 Accuracy (AMPLIFY, May 19, 2023)
[Chart: Accuracy]
Updated 4d ago
Evaluation Results
Method        Configuration               Date      Accuracy
AMPLIFY       Model=GPT-3.5, Prompti...   2023.05   76.3
Human-Rater   Model=Human, Prompting...   2023.05   69.6
GPT-3.5       Model=GPT-3.5, Prompti...   2023.05   63.1
SOTA          Model=N/A, Prompting S...   2023.05   62.1
AMPLIFY       Model=GPT-3, Prompting...   2023.05   60.5
GPT-3.5       Model=GPT-3.5, Prompti...   2023.05   57.8
GPT-3         Model=GPT-3, Prompting...   2023.05   55.2
GPT-3         Model=GPT-3, Prompting...   2023.05   55.2
Random        Model=N/A, Prompting S...   2023.05   50