SOTA Safety-Utility Trade-off Evaluation benchmarks and papers with code | Wizwand
Safety-Utility Trade-off Evaluation
Benchmarks
Dataset Name | SOTA Method | Metric | Results | Last Updated
S-Eval, ORFuzzSet, and NQ Aggregated | LLM-VA | F1 Score: 86.81 | 72 | 1mo ago
Aggregate (Koala, JBench-B, GSM-8k, SQL-1k, OrBench-H, SorryBench, JBench-H, HEX-PHI) | Db as Alpaca | Average Score: 77.03 | 6 | 1mo ago