Computer-Using Agent Task on WASP 1.0 (test)
[Chart: PCR on WASP 1.0 (test). Top entry: SAFEPRED, 97.6 PCR (Feb 2, 2026). Selectable metrics: PCR, SR, SUP, GPI, GUI, RPI, RUI, Overall Score.]
Evaluation Results
| Method | Foundation Model | Date | PCR | SR | SUP | GPI | GUI | RPI | RUI | Overall Score |
|---|---|---|---|---|---|---|---|---|---|---|
| SAFEPRED | GPT-4... | 2026.02 | 97.6 | 97.6 | 96.4 | - | - | - | - | - |
| HarmonyGuard | GPT-4... | 2026.02 | 92.9 | 76.2 | 67.9 | - | - | - | - | - |
| Rule-traversed | GPT-4o | 2026.02 | 89.3 | 61.9 | 58.3 | - | - | - | - | - |
| Generic-defense | GPT-4o | 2026.02 | 77.4 | 54.8 | 54.8 | - | - | - | - | - |
| None | GPT-4o | 2026.02 | 54.8 | 63.1 | 51.2 | - | - | - | - | - |