Dialogue Style Transfer on Anime-style dialogue evaluation set (test)
[Chart: Semantic Similarity. Highlighted entry: Baseline D, 4.4, Mar 6, 2026. Metrics shown: Semantic Similarity, Style Logic Score, Naturalness Score.]
Updated 1mo ago
Evaluation Results
| Method | Details | Date | Semantic Similarity | Style Logic Score | Naturalness Score |
|---|---|---|---|---|---|
| Baseline D | mode=Prompt-based unde... | 2026.03 | 4.4 | 3.89 | 4.03 |
| Model v2 | inference-only=true | 2026.03 | 4.29 | 2.86 | 3 |
| Model v2 | | 2026.03 | 3.87 | 2.61 | 2.73 |
| Model v1 | | 2026.03 | 3.75 | 2.57 | 2.72 |
| Baseline C | methodology=Vanilla SFT | 2026.03 | 3.12 | 2.51 | 2.6 |
| Baseline B | | 2026.03 | 3 | 2.41 | 2.5 |
| Baseline A | methodology=RAG | 2026.03 | 1.63 | 2.28 | 2.53 |