
QQP

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Paraphrase Identification | QQP | Accuracy 91.7 | 78 |
| Paraphrase Detection | QQP (test) | Accuracy 95.3 | 51 |
| Sentence-pair classification | QQP | Accuracy 88.8 | 40 |
| Paraphrasing | QQP | BLEU 33 | 22 |
| Paraphrase Generation | QQP (test) | BLEU-2 41.74 | 22 |
| Paraphrase Identification | QQP Out-of-distribution from PAWS | Macro F1 70.8 | 20 |
| Seq2Seq | QQP | ROUGE-L 66 | 18 |
| Seq2Seq generation | QQP | BLEU 0.3142 | 17 |
| Text Classification | QQP | RDC 0 | 16 |
| Paraphrase Identification | QQP few-shot zero-shot | Accuracy 74 | 16 |
| Paraphrase Detection | QQP source: RTE (test) | Accuracy 71.5 | 12 |
| Paraphrasing | QQP | Semantic Faithfulness 90.26 | 11 |
| Paraphrase Detection | QQP | F1 Score 89 | 10 |
| Paraphrase Identification | QQP Out-of-distribution from PIT | Macro F1 0.757 | 10 |
| Paraphrase Identification | QQP -> WMT (test) | AUROC 85.1 | 10 |
| Ranking correlation with full dataset evaluation | QQP | Kendall Correlation 0.95 | 10 |
| Paraphrase Detection | QQP | Accuracy 79.2 | 9 |
| Classification | QQP | ASR 20 | 8 |
| Paraphrase Detection | QQP IID | Accuracy 84.8 | 8 |
| Bias Mitigation | QQP | Accuracy 80.1 | 8 |
| Backdoor Defense | QQP | Accuracy 80.76 | 8 |
| Paraphrase Detection | QQP | Average Accuracy 71.2 | 8 |
| Paraphrase Detection | QQP In-Domain (test) | Accuracy 91.66 | 8 |
| Paraphrase Detection | QQP (dev) | Accuracy 92.7 | 6 |
| Paraphrase Detection | QQP | Total Running Time (s) 8,279 | 5 |
Showing 25 of 31 rows