Violation Scenario Generation on Scenario S3
[Chart: Mean Score. Highlighted entry: LawBreaker, 4.11, Feb 5, 2026. Selectable metrics: Mean Score, Max Score, High Violation Rate, Violation Rate (>6 Threshold), Violation Rate (>8 Threshold), Violation Count (>10 Threshold).]
Updated 4d ago
Evaluation Results
| Method | Date | Mean Score | Max Score | High Violation Rate | Violation Rate (>6 Threshold) | Violation Rate (>8 Threshold) | Violation Count (>10 Threshold) |
|---|---|---|---|---|---|---|---|
| LawBreaker | 2026.02 | 4.11 | 9 | 0.22 | 0.14 | 0 | - |
| ABLE | 2026.02 | 5.94 | 16 | 0.39 | 0.19 | 0.05 | - |
| ROMAN | 2026.02 | 6.41 | 18 | 0.43 | 0.24 | 0.09 | - |
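The page does not define how the reported metrics are computed. As a rough illustration only, the sketch below assumes each generated scenario receives one numeric violation score, that a "Violation Rate (>t Threshold)" is the fraction of scores strictly above t, and that "Violation Count (>10 Threshold)" is the number of scores strictly above 10. The function name `summarize` and the example scores are hypothetical and are not taken from the benchmark. "High Violation Rate" is left out because its threshold is not stated.

```python
from statistics import mean


def summarize(scores, rate_thresholds=(6, 8), count_threshold=10):
    """Summarize per-scenario violation scores (assumed definitions, see lead-in)."""
    if not scores:
        raise ValueError("scores must be non-empty")
    n = len(scores)
    summary = {
        "mean_score": mean(scores),
        "max_score": max(scores),
    }
    # Assumed: a rate is the share of scenarios scoring strictly above the threshold.
    for t in rate_thresholds:
        summary[f"violation_rate_gt_{t}"] = sum(s > t for s in scores) / n
    # Assumed: the count is the number of scenarios scoring strictly above the threshold.
    summary[f"violation_count_gt_{count_threshold}"] = sum(
        s > count_threshold for s in scores
    )
    return summary


# Hypothetical scores, not benchmark data.
example_scores = [2, 4, 7, 9, 11, 3]
result = summarize(example_scores)
print(result)
```

If the benchmark defines these metrics differently (for example, inclusive thresholds or per-run averaging), the comparisons in the function would need to change accordingly.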