Cybersecurity Domain Evaluation (Functional Efficacy, Safety, Robustness, Quality)
[Chart: Functional Efficacy & Logic scores; CASTER at 33.5, Jan 27, 2026]
Updated 4d ago
Evaluation Results
| Method | Date | Functional Efficacy & Logic | Safety & Ethical Compliance | Robustness & Automation | Quality & Cleanliness |
|---|---|---|---|---|---|
| CASTER | 2026.01 | 33.5 | 25.9 | 17.1 | 8.5 |
| FrugalGPT | 2026.01 | 32.5 | 23.6 | 16.5 | 8.2 |