Jigsaw

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Performance Estimation | JigSaw | MAE 0.001 | 198 |
| Visual Reasoning | Jigsaw | Accuracy 88.6 | 40 |
| Detoxification | Jigsaw (test) | Perplexity (PPL) 20.8 | 29 |
| Spatial Configuration | Jigsaw | Metric 299 | 12 |
| Binary classification | jigsaw | ROC AUC 0.97 | 11 |
| Toxicity Classification | Jigsaw dataset | Rescue Rate 44.2 | 9 |
| Fairness Evaluation | Jigsaw | BiasAUC 75.6 | 9 |
| Classification | Jigsaw Text + Tabular | Accuracy 95.94 | 8 |
| Binary Classification | Toxic Jigsaw | Competition Score 0.987 | 7 |
| Toxicity Detection | Jigsaw Perspective-based Negated Private (test) | Accuracy 87 | 7 |
| Fairness-aware Classification | Jigsaw | Training Time (min) 30 | 7 |
| Visual Manipulation | Jigsaw Res5 | Accuracy 4.3 | 6 |
| Visual Manipulation | Jigsaw Res4 | Accuracy 11.3 | 6 |
| Visual Manipulation | Jigsaw Res3 | Accuracy 21 | 6 |
| Toxicity classification | Jigsaw (test) | Accuracy 96 | 6 |
| Visual puzzle solving | Jigsaw R1 (test) | Accuracy (2x1) 61.9 | 6 |
| Part-based Image Generation | Jigsaw | FID 160.1 | 5 |
| Alignment Audit | Jigsaw Toxic Comment | Average Treatment Effect (ATE) 0 | 5 |
| Toxicity Classification | Jigsaw-ML | AUC 98.4 | 2 |
| Toxicity Classification | Jigsaw-BL | AUC 97.1 | 2 |
| Multi-label Toxic Content Classification | Jigsaw-ML | Attack Success Rate 71.7 | 2 |
| Binary Toxic Content Classification | Jigsaw-BL | Attack Success Rate 99.27 | 2 |