Functionally Diverse Response Generation on Functional Diversity Prompt Set Categories A-H 1.0 (test)

Score (Category A): 4.26

GRPO w/DARLING

Sep 25, 2025
Updated 1mo ago

Evaluation Results

Date     Cat. A  Cat. B  Cat. C  Cat. D  Cat. E  Cat. F  Cat. G  Cat. H
2025.09  4.26    3.19    3.64    2.65    2.52    2.13    3.53    1.89
2025.09  4       2.69    3.93    3.09    1.16    1.04    1.67    1.36
2025.09  3.94    2.75    4.29    2.67    1.5     1.83    2.58    1.51
2025.09  3.94    2.25    3.29    2.75    2.06    2.22    2.82    1.64
2025.09  3.89    2.25    3.21    3.47    4.46    4       4.36    3.32
2025.09  3.68    2.31    3.14    2.65    3.14    3.17    3.84    2.43
2025.09  3.66    2.56    3.64    2.73    1.4     1.87    2.36    1.45
2025.09  3.55    2.31    3.29    3.13    1.18    1.17    1.47    1.27
2025.09  3.53    1.75    3.14    2.25    1.52    1.83    2.16    1.33
2025.09  3.47    1.69    3.14    2.6     1.7     1.83    2.11    1.35
2025.09  3.42    1.56    2.86    2.38    1.58    1.83    2.13    1.28
2025.09  3.42    1.88    2.43    3.29    3.98    3.43    3.93    2.92
2025.09  3.23    1.62    2.07    2.44    2.66    2.87    2.89    1.81
2025.09  3.15    1.69    3.14    2.25    1.2     1.52    1.96    1.24
2025.09  2.79    1.62    2.71    2.84    1.12    1.09    1.22    1.12
2025.09  2.58    1.19    2.21    3.07    3.1     2.7     3.44    2.11
2025.09  2.43    1.19    2.29    2.29    1.5     1.26    1.62    1.1
2025.09  1.87    1.25    1.21    1.95    1.78    2       2.22    1.33
2025.09  1.45    2.31    3.71    2.42    2.68    2       2.51    1.78
2025.09  1.19    1.75    2.79    2.22    2.02    1.48    1.91    1.47
2025.09  1.08    1.25    1.71    1.93    1.56    1.52    1.47    1.09