RDF-to-text generation on WebNLG OOD standard (test)
Best result: 37.72 BLEU, Rule-based NLG (trained by Qwen 3 235B), Dec 20, 2025
Metrics: BLEU, METEOR, BERTScore, BLEURT
Evaluation Results
| Method | Tags | Date | BLEU | METEOR | BERTScore | BLEURT |
| Rule-based NLG (trained by Qwen 3 235B) | Inter-pretability=true... | 2025.12 | 37.72 | 69.8 | 92.81 | 0.1645 |
| Rule-based NLG (trained by GPT-4.1) | Inter-pretability=true... | 2025.12 | 36.15 | 71.24 | 92.51 | 0.1483 |
| Prompted Llama 3.3 70B | Inter-pretability=fals... | 2025.12 | 33.27 | 69.89 | 92.43 | 0.0969 |
| Fine-tuned BART | Inter-pretability=fals... | 2025.12 | 30.52 | 63.43 | 91.83 | -0.0261 |
| Rule-based NLG (trained by Llama 3.3 70B) | Inter-pretability=true... | 2025.12 | 28.58 | 66.06 | 91.87 | 0.0618 |
| Rule-based NLG (trained by Qwen 2.5 72B) | Inter-pretability=true... | 2025.12 | 26.09 | 64.56 | 91.75 | 0.0655 |