JFLEG

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Grammatical Error Correction | JFLEG | GLEU 63.3 | 47 |
| Grammatical Error Correction | JFLEG (test) | GLEU 64.9 | 45 |
| IPI sanitization | Jfleg RTE (unseen) | ASR 0 | 20 |
| Sentence Level Quality Estimation | JFLEG (test) | GLEU 61.61 | 12 |
| Indirect Prompt Injection Detection | Jfleg RTE | Accuracy 99.4 | 10 |
| Grammatical Error Correction | JFLEG (dev) | F0.5 Score 63.61 | 7 |
| Utility Preservation | JFLEG-RTE | Win Rate 16.57 | 5 |
| Indirect Prompt Injection Sanitization | Jfleg | GCG ASR 7 | 2 |
| Indirect Prompt Injection Attack | Jfleg | ASR 99.5 | 2 |
| Error Detection | JFLEG (test) | Precision 72.53 | 2 |
| Indirect Prompt Injection Detection | Jfleg | GCG Accuracy 95.5 | 1 |