Humor-helpful

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Generalization to Unseen Preferences | Humor-helpful | Generalization Score (Group 1): 17.034 | 2 |
| Controllability | Humor-helpful Group 4 (unseen) | Kendall's tau: 1 | 2 |
| Controllability | Humor-helpful Group 3 (unseen) | Kendall's tau: 1 | 2 |
| Controllability | Humor-helpful Group 2 (unseen) | Kendall's tau: 1 | 2 |
| Controllability | Humor-helpful Group 1 (unseen) | Kendall's tau: 1 | 2 |