Constraint Following on CELLO
Top result: REWARDAGENT_LLAMA-OP, Accuracy 65.7 (Feb 26, 2025)
[Chart: Accuracy over time]
Evaluation Results
Method                  Details                      Date      Accuracy
REWARDAGENT_LLAMA-OP    DPO Training Data=On-P...    2025.02   65.7
Original UF             DPO Training Data=Orig...    2025.02   62
ArmoRM-UF               DPO Training Data=Ultr...    2025.02   60.8
REWARDAGENT_LLAMA-UF    DPO Training Data=Ultr...    2025.02   60.1
ArmoRM-OP               DPO Training Data=On-P...    2025.02   58.1
Zephyr-7b-sft-full      DPO Training Data=None...    2025.02   51.5