
ShopBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multimodal Understanding | ShopBench | ShopFront Score: 73.8 | 18 |
| Defense effectiveness against memory-mediated attacks | ShopBench-Agent | ASR-M: 8.3 | 10 |
| Adversarial Attack on Memory-mediated Agents | ShopBench Agent | ASR-M: 82.3 | 6 |
| Memory-mediated Attack Detection | ShopBench | ASR-M: 82.3 | 5 |