
Word Deletion Robustness on ELI5 prompts 32-bit payload Llama3.1-8B (test)

70.7 Bit Accuracy (10% Deletion)

Ours+

[Chart: Bit Accuracy (10% Deletion); y-axis ticks 50.316, 55.608, 60.9, 66.192; x-axis date May 12, 2026]
Updated 21d ago

Evaluation Results

Method  Links
2026.05    70.7    63.2    57.9    54.1    52.5
2026.05    66.4    60.3    56.4    53.9    52
2026.05    66.4    60.8    57.1    54      51.9
2026.05    60.9    57.2    54.4    52.9    52.4
2026.05    60.6    56.9    54.6    52.5    51.5
2026.05    53.9    52.1    50.5    50.6    50
2026.05    51.1    50.8    50.1    50      50.2
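
For readers unfamiliar with the metric, the following is a minimal sketch of how bit accuracy under word deletion could be computed. It is an illustration under stated assumptions, not the benchmark's evaluation code: the helper names `delete_words` and `bit_accuracy` are hypothetical, words are assumed to be whitespace-separated tokens, and deletion is assumed to be uniformly random. The watermark decoder itself is out of scope here; the sketch only covers the perturbation and the scoring step.

```python
import random


def delete_words(text: str, fraction: float, seed: int = 0) -> str:
    """Randomly drop roughly `fraction` of the whitespace-separated words.

    Assumption: deletion is uniform over word positions; the benchmark's
    exact perturbation procedure is not stated on this page.
    """
    words = text.split()
    rng = random.Random(seed)
    n_drop = round(len(words) * fraction)
    drop = set(rng.sample(range(len(words)), n_drop))
    return " ".join(w for i, w in enumerate(words) if i not in drop)


def bit_accuracy(true_bits, decoded_bits) -> float:
    """Percentage of payload positions where the decoded bit equals the embedded bit."""
    if len(true_bits) != len(decoded_bits):
        raise ValueError("payloads must have equal length")
    matches = sum(t == d for t, d in zip(true_bits, decoded_bits))
    return 100.0 * matches / len(true_bits)


# Example with a 32-bit payload (hypothetical values): 8 decoding errors out of 32.
payload = [1, 0] * 16
decoded = [b ^ 1 if i < 8 else b for i, b in enumerate(payload)]
score = bit_accuracy(payload, decoded)  # 24 of 32 bits correct
```

Under this definition, a decoder that guesses a uniformly random binary payload scores about 50 on average, so values near 50 in the table are close to chance level.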