
PERSUADE

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Judge Performance | PERSUADE | CCC: 0.955 | 32
Non-Agentic Performance Evaluation | Persuade (test) | Mean Score: 53.2 | 4
Safety Evaluation | Persuade | Cost per Accuracy Point ($): 0.0008 | 4