
DREAM

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Machine Reading Comprehension | DREAM (test) | Accuracy 91.8 | 23 |
| Deepfake video photorealism assessment | DREAM Overall Average | PLCC (Overall Average) 0.976 | 22 |
| Deepfake video photorealism assessment | DREAM (test-3) | PLCC 0.975 | 22 |
| Deepfake video photorealism assessment | DREAM (test-2) | PLCC 0.976 | 22 |
| Deepfake video photorealism assessment | DREAM (Test-1) | PLCC 97.7 | 22 |
| Multiple Choice Question Answering | DREAM | Accuracy 98.77 | 22 |
| Machine Reading Comprehension | DREAM (dev) | Accuracy 90 | 21 |
| Dialogue-based Multiple-choice Question Answering | DREAM (test) | Accuracy 91.8 | 21 |
| Robot Pose Estimation | DREAM-real Panda 3CAM-AK | AUC 90.2 | 19 |
| Video Captioning / Summarization | Dream 1k | Rouge-L 20.8 | 15 |
| Dialogue Comprehension | DREAM | Accuracy 69.2 | 15 |
| Safety evaluation against dynamic adversarial chains | DREAM (test) | Overall Defense Score 67.3 | 12 |
| Robot Pose Estimation | DREAM-real Panda ORB | AUC 87.6 | 12 |
| Robot Pose Estimation | DREAM-real Panda 3CAM-RS | AUC 91.9 | 12 |
| Fine-grained captioning | Dream1k | F1 Score 29.5 | 11 |
| Video Description | DREAM-1K Overall 1.0 (test) | F1 Score 40.1 | 11 |
| Video Description | DREAM-1K Stock 1.0 (test) | F1 Score 44 | 11 |
| Video Description | DREAM-1K Shorts 1.0 (test) | F1 Score 40.9 | 11 |
| Video Description | DREAM-1K YouTube 1.0 (test) | F1 Score 34.5 | 11 |
| Video Description | DREAM-1K Animation 1.0 (test) | F1 37.1 | 11 |
| Video Captioning | Dream-1K | Precision 36 | 10 |
| Dialogue-based Multiple-choice Question Answering | DREAM (dev) | Accuracy 89.9 | 10 |
| Quality-Penalized Efficiency Evaluation | Dream Base | QPS (gamma=4) 2.03 | 9 |
| Question Answering | DREAM | Accuracy 69.51 | 9 |
| Robot Pose Estimation | DREAM Baxter DR | AUC 75.5 | 8 |
Showing 25 of 46 rows