
Human Study

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Reach | Human study | Human-likeness Win Rate: 89 | 6 |
| Steering | Human study | Human-likeness Win Rate: 93 | 6 |
| Direction | Human study | Human-likeness Win Rate: 99 | 6 |
| Long Jump | Human study | Human-likeness Win Rate: 96 | 4 |
| Strike | Human study | Human-likeness Win Rate: 85 | 4 |
| Physical Simulation Script Generation | Human Study 6 diverse text prompts | Text Fidelity: 80.2 | 4 |
| 3D World Generation | Human Study | Zoom-in Accuracy: 83.2 | 4 |
| Video Description | Human Study GEST prompts | Similarity Score: 56.64 | 3 |
| Digital Human Animation | Human Study 100 videos as of October 31 (test) | Overall Naturalness: 4.17 | 3 |
| 3D Object Arrangement | Human Study Evaluation Set | Win % (Instruction Following): 62.1 | 3 |
| Backdoor Stealthiness Detection | Human Study (NLP Group) 100 clean code snippets, 25 injected | Precision: 96 | 3 |
| Backdoor Stealthiness Detection | Human Study CV Group (100 clean code snippets, 25 injected) | Precision: 82 | 3 |
| Text-to-Video Generation | Human Study 4K Video Generation (test) | Video Quality: 71.25 | 2 |
| Image-to-Image Generation | Human Study | Layout Score: 0.98 | 2 |
| Track-Conditioned Video Generation | Human Study Evaluation Set (val) | - | 0 |