Prompt Injection Detection on VPI-Bench
[Chart: Recall over time. Highlighted result: WebAgentGuard-8B, Recall 87.58, Apr 14, 2026]
Updated 4d ago
Evaluation Results
Method                         Model Type      Date     Recall
WebAgentGuard-8B               Ours            2026.04  87.58
WebAgentGuard-4B               Ours            2026.04  85.95
GPT-4o                         Closed-sour...  2026.04  78.43
GPT-4o-Mini                    Closed-sour...  2026.04  69.93
GPT-4.1                        Closed-sour...  2026.04  69.61
Llama-Guard-3-Vision-11B       Guard models    2026.04  68.95
Prompt-Guard-1-86M             Guard models    2026.04  64.38
Llama-3.2-Vision-Instruct-11B  Open-source...  2026.04  43.79
Qwen3-VL-Instruct-8B           Open-source...  2026.04  42.16
Qwen3-VL-Instruct-4B           Open-source...  2026.04  40.52
Qwen2.5-VL-Instruct-7B         Open-source...  2026.04  14.71
GuardReasoner-VL-7B            Guard models    2026.04  4.9
Prompt-Guard-2-86M             Guard models    2026.04  0