Instruction Following on User Study General
Best result: Plan-Critic, Win Rate 78.36 (Jan 18, 2026)
Evaluation Results
Method        Comparator Method   Date      Win Rate
Plan-Critic   Siren               2026.01   78.36
Plan-Critic   AudioX              2026.01   70.49
Plan-Critic   GenAU               2026.01   65.01