Dialogue Generation on Norm-grounded Dialogue English (test)
[Chart: Win Rate vs. NormDial; top entry GPT-4o-mini at 65 (Sep 22, 2025). Metrics available: Win Rate vs. NormDial, Win Rate vs. SODA.]
Evaluation Results
| Method       | Date    | Win Rate vs. NormDial | Win Rate vs. SODA |
|--------------|---------|-----------------------|-------------------|
| GPT-4o-mini  | 2025.09 | 65                    | 65                |
| LLaMA-3-8B   | 2025.09 | 65                    | 59                |
| Qwen-2.5-32B | 2025.09 | 56                    | 62                |
| Qwen-2.5-14B | 2025.09 | 51                    | 61                |