Feedback Evaluation Alignment on FLASK
[Chart: Kendall's Tau by date. Top entry: Prometheus-2-7B, 0.405 (Mar 6, 2025).]
Evaluation Results
Method                                  Settings                   Date     Kendall's Tau
Prometheus-2-7B                         CoT=true, Training obj...  2025.03  0.405
Mistral-7B-Instruct + RAFT              CoT=false, Training ob...  2025.03  0.375
TRACT                                   CoT=true, Training obj...  2025.03  0.373
Mistral-7B-Instruct + CE (GPT-4 CoT)    CoT=true, Training obj...  2025.03  0.328
Mistral-7B-Instruct + CE (GPT-4 Score)  CoT=false, Training ob...  2025.03  0.294
Mistral-7B-Instruct + RAIL Baseline     CoT=false, Inference m...  2025.03  0.109
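
The scores above are Kendall's Tau values, a rank correlation between two sets of scores (here, presumably evaluator scores versus reference scores). As a rough illustration of what the metric measures, the sketch below computes the tau-b variant in plain Python. Which variant this leaderboard uses is not stated on the page, so tau-b is an assumption, and the function name and the score lists are hypothetical examples, not FLASK data.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length score lists.

    Illustrative sketch only: the leaderboard's exact variant and
    evaluation pipeline are not given in the source.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    concordant = discordant = ties_x_only = ties_y_only = 0
    # Compare every pair of items once.
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both lists: excluded from both denominator terms
        if dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1  # both lists order this pair the same way
        else:
            discordant += 1  # the lists disagree on this pair

    pairs = concordant + discordant
    denom = sqrt((pairs + ties_x_only) * (pairs + ties_y_only))
    if denom == 0:
        return float("nan")  # undefined when one list is constant
    return (concordant - discordant) / denom


# Hypothetical scores: a reference ranking and a model's ranking.
reference_scores = [1, 2, 3, 4, 5]
model_scores = [2, 1, 3, 5, 4]
tau = kendall_tau_b(reference_scores, model_scores)
```

In this made-up example, 8 of the 10 pairs are ordered the same way and 2 are swapped, so tau is (8 - 2) / 10 = 0.6. A value of 1.0 means the two rankings agree on every pair, and -1.0 means they disagree on every pair.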