Dialog on Transportation & Travel Out-of-Domain
[Chart: Accuracy over time. Best result: DFPO, 90.5 accuracy, Feb 5, 2026]
Updated 4d ago
Evaluation Results
Method       Setting               Date     Accuracy
DFPO         Training Steps=1000   2026.02  90.5
Baseline     Training Steps=1000   2026.02  84.5
Reinforce++  Training Steps=1000   2026.02  76
PPO          Training Steps=1000   2026.02  64.7
Dr.GRPO      Training Steps=1000   2026.02  49.87
GRPO         Training Steps=1000   2026.02  30