Privileged Knowledge Recall on TOFU
[Chart: ROUGE-L Recall over time. Top entry: TOFU SFT, 98.3 (Oct 18, 2024).]
Evaluation Results

Method               Configuration                Date      ROUGE-L Recall
TOFU SFT             Backbone=Llama3-8B-Ins...    2024.10   98.3
SUDOLM TOFU          Backbone=Llama3-8B-Ins...    2024.10   97.6
TOFU SFT             Backbone=Llama2-13B, T...    2024.10   96.3
SUDOLM TOFU          Backbone=Llama2-13B, A...    2024.10   95.8
TOFU SFT             Backbone=Llama2-7B, Tr...    2024.10   94.7
SUDOLM TOFU          Backbone=Llama2-7B, Ac...    2024.10   93.3
Llama3-8B-Instruct   Backbone=Llama3-8B-Ins...    2024.10   32.2
Llama2-13B           Backbone=Llama2-13B          2024.10   31.7
Llama2-7B            Backbone=Llama2-7B           2024.10   28.1
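
For readers unfamiliar with the metric in the ROUGE-L Recall column, the sketch below shows one common way to compute it: the length of the longest common subsequence (LCS) between a reference answer and a model answer, divided by the number of reference tokens. This is a minimal illustration, not the evaluation code behind the numbers above. The function names, the lowercase whitespace tokenization, and the reading of the scores as a 0-100 scale are all assumptions; the actual benchmark may tokenize, stem, or aggregate differently.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)  # LCS lengths for the previous row of the DP table
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(reference, candidate):
    """ROUGE-L recall: LCS length divided by the reference length.

    Tokenization here is lowercase whitespace splitting (an assumption).
    """
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens:
        return 0.0
    return lcs_length(ref_tokens, cand_tokens) / len(ref_tokens)


if __name__ == "__main__":
    score = rouge_l_recall("the cat sat on the mat", "the cat is on the mat")
    # Multiply by 100 to compare with a 0-100 scale (assumed for the table above).
    print(round(100 * score, 1))
```

Because recall divides by the reference length, a long model answer that contains the whole reference as a subsequence still scores 1.0; the metric measures how much of the reference was recovered, not how concise the answer is.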