
Evaluation dataset

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Compositional Generalization | Evaluation Dataset (Unseen Average) | Score: 42.86 | 18 |
| Compositional Generalization | Evaluation Dataset Seen Average | Score: 62.34 | 18 |
| Compositional Generalization | Evaluation Dataset Unseen (Fold 3) | Score: 0.4022 | 18 |
| Compositional Generalization | Evaluation Dataset (Fold 3 Seen) | Score: 66.69 | 18 |
| Compositional Generalization | Evaluation Dataset Unseen (Fold 2) | Score: 50 | 18 |
| Compositional Generalization | Evaluation Dataset (Fold 2 Seen) | Score: 63.63 | 18 |
| Compositional Generalization | Evaluation Dataset Unseen (Fold 1) | Score: 0.4818 | 18 |
| Compositional Generalization | Evaluation Dataset (Fold 1 Seen) | Score: 0.6191 | 18 |
| Compositional Generalization | Evaluation Dataset (Full) | Score: 0.6379 | 18 |
| Malicious Package Detection | Evaluation Dataset | Accuracy: 99.5 | 11 |
| Global 3D Editing | Evaluation dataset unseen 3D assets (test) | CLIP Similarity: 0.272 | 6 |
| Local 3D Editing | Evaluation dataset unseen 3D assets (test) | CLIP Similarity: 0.292 | 6 |