
RTC-BENCH

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Indirect Prompt Injection Red-teaming | RTC-BENCH Aggregated (OwnCloud, Reddit, RocketChat) | ASR (Avg): 66.19 | 7 |
| Indirect Prompt Injection Detection | RTC-BENCH a11y Tree G.1 Experimental Setting 1.0 | Detection Accuracy: 28 | 4 |
| Indirect Prompt Injection Detection | RTC-BENCH Screenshot 1.0 | Detection Accuracy: 30 | 3 |