Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

PIT

Benchmarks

Task NameDataset NameSOTA ResultTrend
Paraphrase IdentificationPIT -> PAWS (test)
AUROC77.5
20
Paraphrase IdentificationPIT Out-of-distribution from PAWS
Macro F169.8
10
Paraphrase IdentificationPIT -> PIT (test)
Macro F181
10
LocomotionPit Locomotion terrain
Success Rate (Rsucc)98.4
5
Image Semantic EmbeddingPIT (internal evaluation)
Triplet Accuracy87.16
5
Paraphrase IdentificationPIT
F1 Score0.755
2
Showing 6 of 6 rows