Computer-Using Agent Task on WASP 1.0 (test)
[Chart: PCR by method. Highlighted point: SAFEPRED, 97.6 PCR (Feb 2, 2026). Selectable metrics: PCR, SR, SUP, GPI, GUI, RPI, RUI, Overall Score.]
Updated 1mo ago
Evaluation Results
| Method | Foundation Model | Date | PCR | SR | SUP | GPI | GUI | RPI | RUI | Overall Score |
|---|---|---|---|---|---|---|---|---|---|---|
| SAFEPRED | GPT-4... | 2026.02 | 97.6 | 97.6 | 96.4 | - | - | - | - | - |
| HarmonyGuard | GPT-4... | 2026.02 | 92.9 | 76.2 | 67.9 | - | - | - | - | - |
| Rule-traversed | GPT-4o | 2026.02 | 89.3 | 61.9 | 58.3 | - | - | - | - | - |
| Generic-defense | GPT-4o | 2026.02 | 77.4 | 54.8 | 54.8 | - | - | - | - | - |
| None | GPT-4o | 2026.02 | 54.8 | 63.1 | 51.2 | - | - | - | - | - |