
P-Soups

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Response Selection | P-Soups Expertise | Accuracy: 83.66 | 16
Response Selection | P-Soups Style | Accuracy: 0.88 | 16
Response Selection | P-Soups Informativeness | Accuracy: 78.07 | 16
Personality Alignment | P-SOUPS | Expertise: 52.8 | 7