
CFH-Hard

Benchmarks

Task Name                            Dataset Name           SOTA Result                         Trend
Accidental Disclosure                CFH-Hard Accidental    Accuracy (CFH-Hard Accidental): 89  8
Computer Use Control-Flow Hijacking  CFH-Hard Computer Use  Gen. Rate: 67                       8
Indirect Prompt Injection Attack     CFH-Hard Computer Use  Attack Success Rate (IA): 69        8