
Alpaca and AdvBench

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Safety Detection | Alpaca and AdvBench Conversation Level (test) | MCA Accuracy: 100 | 7
Safety Detection | Alpaca and AdvBench Prompt Level (test) | Accuracy (MCA): 100 | 7