curated dataset

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Malicious Pickle Detection | Curated Dataset standard (train-test) | TPR 100 | 11 |
| Jailbreak evaluation | curated dataset (test) | BAD BOT Rate 0 | 11 |
| Action-conditioned 4D scene generation | Curated dataset of 10 scenes (test) | Camera Control 93.26 | 8 |
| Action-conditioned 4D scene generation | Curated dataset of 10 scenes 1.0 (test) | Physics Plausibility 93.5 | 7 |
| Zero-shot Text-guided Video Editing | Curated dataset 90-frames | CLIP-F 95.99 | 7 |
| Overall Appearance Transfer Quality | Curated dataset 100 image pairs | DeQA 4.1728 | 6 |
| Material Transfer | Curated dataset 100 image pairs | CLIP-T Score 0.2927 | 6 |
| Semantic-Aware Appearance Transfer | Curated dataset 100 image pairs | CLIP-I 88.32 | 6 |
| Zero-shot Text-guided Video Editing | Curated dataset 8-frames | CLIP-F 95.95 | 6 |
| Zero-shot Text-guided Video Editing | Curated dataset 36-frames | CLIP-F 9,318 | 5 |
| Physical 3D action-conditioned video generation | Curated dataset of 30 images (test) | Action Following 89.6 | 3 |