Functionally Diverse Response Generation on Task Category D
[Chart: Functionally Diverse Responses. Top result: 4.44, Mistral-7B-Instruct-v0.3, Sep 25, 2025]
Updated 1mo ago
Evaluation Results
Method                      Sampling Strategy   Date      Functionally Diverse Responses
Mistral-7B-Instruct-v0.3    Temp...             2025.09   4.44
Llama-3.1-8B-Instruct       Syst...             2025.09   2.75
claude-3.5-sonnet           In-C...             2025.09   2.49
gpt-4o                      Syst...             2025.09   1.69