Functionally Diverse Response Generation on Task Category F
[Chart: Functionally Diverse Responses Count. Plotted point: Llama-3.1-8B-Instruct, 4.13, Sep 25, 2025]
Evaluation Results
Method                   Settings                     Date      Functionally Diverse Responses Count
Llama-3.1-8B-Instruct    Sampling Strategy=Syst...    2025.09   4.13
gpt-4o                   Sampling Strategy=Syst...    2025.09   2.95