Our new X account is live! Follow @wizwand_team for updates
Home
/
Benchmarks
Premise Generation on Crafted Rules Dataset (test)
Loading...
0.411
BLEU
Engine
0.07092
0.15921
0.2475
0.33579
Feb 18, 2024
BLEU
Self-BLEU
Fact Count
Updated 4d ago
Evaluation Results
Method
Method
Links
BLEU
Self-BLEU
Fact Count
Engine
backbone=Mistral-7b, m...
2024.02
0.411
0.687
3.42
GPT-4
mode=prompting
2024.02
0.149
0.805
2.58
GPT-3.5
mode=prompting
2024.02
0.084
0.739
1.72
Feedback
Search any
task
Search any
task