Grounded Text Generation on RAGTruth
Best F1 Score: 33.14 (Token-Guard), Jan 29, 2026
[Chart: F1 Score, BLEU Score]
Updated 4d ago
Evaluation Results
Method               Backbone  Date     F1 Score  BLEU Score
Token-Guard          13B       2026.01  33.14     30.41
Tree-of-Thought      13B       2026.01  32.93     23.7
Chain-of-Thoughts    3B        2026.01  32.7      25.48
Chain-of-Thoughts    13B       2026.01  32.25     27.67
Guided Decoding      3B        2026.01  30.5      18.7
Guided Decoding      13B       2026.01  29.34     22.01
Tree-of-Thought      3B        2026.01  27.05     21.16
BaseModel            13B       2026.01  22.47     15.26
Predictive Decoding  3B        2026.01  21.72     14.59
Token-Guard          3B        2026.01  15.66     13.45
BaseModel            3B        2026.01  12.11     6.29
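
The results above are easier to compare when each method is measured against the BaseModel row for the same backbone. The following is a minimal Python sketch of that arithmetic, not part of the benchmark itself: the tuple layout and the helper names `gain_over_base` and `best_by` are hypothetical, and the numbers are copied by hand from the listing above.

```python
# Rows copied from the results above: (method, backbone, F1, BLEU).
# The tuple layout is an illustrative assumption, not an official export format.
RESULTS = [
    ("Token-Guard", "13B", 33.14, 30.41),
    ("Tree-of-Thought", "13B", 32.93, 23.7),
    ("Chain-of-Thoughts", "3B", 32.7, 25.48),
    ("Chain-of-Thoughts", "13B", 32.25, 27.67),
    ("Guided Decoding", "3B", 30.5, 18.7),
    ("Guided Decoding", "13B", 29.34, 22.01),
    ("Tree-of-Thought", "3B", 27.05, 21.16),
    ("BaseModel", "13B", 22.47, 15.26),
    ("Predictive Decoding", "3B", 21.72, 14.59),
    ("Token-Guard", "3B", 15.66, 13.45),
    ("BaseModel", "3B", 12.11, 6.29),
]


def gain_over_base(results, backbone):
    """Return {method: F1 gain over BaseModel} for one backbone size."""
    base_f1 = next(
        f1 for method, size, f1, _ in results
        if method == "BaseModel" and size == backbone
    )
    return {
        method: round(f1 - base_f1, 2)
        for method, size, f1, _ in results
        if size == backbone and method != "BaseModel"
    }


def best_by(results, metric_index):
    """Return the row with the highest value in the given column (2 = F1, 3 = BLEU)."""
    return max(results, key=lambda row: row[metric_index])


if __name__ == "__main__":
    for size in ("3B", "13B"):
        print(size, gain_over_base(RESULTS, size))
    print("Best BLEU:", best_by(RESULTS, 3))
```

Grouping by backbone matters here because the ranking is not consistent across sizes: Token-Guard leads on the 13B backbone but sits just above BaseModel on the 3B backbone.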