Chain-of-Thought Generation on SVAMP (test)
Top GPT-4 Score: 4.63 (Gold), Mar 5, 2024
Evaluation Results
Method                  Self-consistency   Links     GPT-4 Score
Gold                    false              2024.03   4.63
Gold                    true               2024.03   4.43
MI-based distillation   true               2024.03   2.72
DSS                     true               2024.03   2.53
DSS                     false              2024.03   2.5
MI-based distillation   false              2024.03   2.3