
Task Category

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Functionally Diverse Response Generation | Task Category B | Functionally Diverse Responses Count: 5 | 6 |
| Functionally Diverse Response Generation | Task Category E | Functionally Diverse Responses: 4.94 | 3 |
| Functionally Diverse Response Generation | Task Category H | Functionally Diverse Responses: 4.42 | 2 |
| Functionally Diverse Response Generation | Task Category G | Functionally Diverse Responses Count: 4.62 | 2 |
| Functionally Diverse Response Generation | Task Category F | Functionally Diverse Responses Count: 4.13 | 2 |