Task-Focused Dialogue on MultiWOZ (G-Eval)
[Chart: G-Eval Score over time; highlighted point: GPT-5.2, 3.453, Jan 24, 2026]
Evaluation Results

Method            Parameters   Date      G-Eval Score
GPT-5.2           -            2026.01   3.453
GOPO-Qwen3-14B    14B          2026.01   3.447
Gemini-2.5        -            2026.01   3.407
GLM-4.7           -            2026.01   3.381
DeepSeek-R1       -            2026.01   3.368
Qwen-235B         235B         2026.01   3.353