Stylized Dialogue Generation on 148-query (test)
[Chart: CS-SB1 Score. Highlighted result: 0.904 (Base), May 27, 2026. Available metrics: CS-SB1 Score, CS-SB2 Score, CS-SB3 Score, CS-SB4 Score.]
Updated 6d ago
Evaluation Results
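| Method | Links | CS-SB1 Score | CS-SB2 Score | CS-SB3 Score | CS-SB4 Score |
|--------|-------|--------------|--------------|--------------|--------------|
| Base | Decoding Temperature (... (2026.05) | 0.904 | 0.831 | 0.766 | 0.71 |
| SFT | Decoding Temperature (... (2026.05) | 0.798 | 0.655 | 0.542 | 0.452 |
| SFR | Decoding Temperature (... (2026.05) | 0.783 | 0.628 | 0.509 | 0.418 |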