Functionally Diverse Response Generation on Task Category E
[Chart: Functionally Diverse Responses; claude-3.5-sonnet, 4.94, Sep 25, 2025]
Updated 1mo ago
Evaluation Results

Method                     Setting                     Date      Functionally Diverse Responses
claude-3.5-sonnet          Sampling Strategy=In-C...   2025.09   4.94
Mistral-7B-Instruct-v0.3   Sampling Strategy=Syst...   2025.09   4.88
Llama-3.1-8B-Instruct      Sampling Strategy=Syst...   2025.09   4.7