Dialogue Generation on Human-in-the-loop Interactive Evaluation Customer-Agent Dialogs
[Chart: Win Rate (vs GPT-4) and Gain (pp). Highlighted result: GER, Win Rate (vs GPT-4) 41, Aug 13, 2024.]
Evaluation Results
Method                              Date      Win Rate (vs GPT-4)   Gain (pp)
GER (Student Model Architec...)       2024.08   41                    16
GER (Student Model Architec...)       2024.08   41                    3
GER (Student Model Architec...)       2024.08   41                    19
Base LLM (Student Model Architec...)  2024.08   38                    -
GER (Student Model Architec...)       2024.08   34                    31
Base LLM (Student Model Architec...)  2024.08   25                    -
Base LLM (Student Model Architec...)  2024.08   22                    -
Base LLM (Student Model Architec...)  2024.08   3                     -