DREAM

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Machine Reading Comprehension | DREAM (test) | Accuracy 91.8 | 23 |
| Multiple Choice Question Answering | DREAM | Accuracy 98.77 | 22 |
| Machine Reading Comprehension | DREAM (dev) | Accuracy 90 | 21 |
| Dialogue-based Multiple-choice Question Answering | DREAM (test) | Accuracy 91.8 | 21 |
| Dialogue Comprehension | DREAM | Accuracy 69.2 | 15 |
| Safety evaluation against dynamic adversarial chains | DREAM (test) | Overall Defense Score 67.3 | 12 |
| Robot Pose Estimation | DREAM-real Panda ORB | AUC 87.6 | 12 |
| Robot Pose Estimation | DREAM-real Panda 3CAM-RS | AUC 91.9 | 12 |
| Robot Pose Estimation | DREAM-real Panda 3CAM-AK | AUC 90.2 | 12 |
| Fine-grained captioning | Dream1k | F1 Score 29.5 | 11 |
| Video Description | DREAM-1K Overall 1.0 (test) | F1 Score 40.1 | 11 |
| Video Description | DREAM-1K Stock 1.0 (test) | F1 Score 44 | 11 |
| Video Description | DREAM-1K Shorts 1.0 (test) | F1 Score 40.9 | 11 |
| Video Description | DREAM-1K YouTube 1.0 (test) | F1 Score 34.5 | 11 |
| Video Description | DREAM-1K Animation 1.0 (test) | F1 37.1 | 11 |
| Video Captioning | Dream-1K | Precision 36 | 10 |
| Dialogue-based Multiple-choice Question Answering | DREAM (dev) | Accuracy 89.9 | 10 |
| Quality-Penalized Efficiency Evaluation | Dream Base | QPS (gamma=4) 2.03 | 9 |
| Question Answering | DREAM | Accuracy 69.51 | 9 |
| Query-based dialogue summarization | DREAM (test) | Accuracy (Multi-Choice) 65.9 | 8 |
| Robot Pose Estimation | DREAM Panda Photo | AUC 82 | 5 |
| Robot Pose Estimation | DREAM Panda DR | AUC 82.9 | 5 |
| Robot Pose Estimation | DREAM-real (All) | AUC 85.962 | 5 |
| Video Description | DREAM-1K (300 randomly sampled videos) | Tarsier Wins Rate 0.717 | 4 |
| Panda Arm Pose Estimation | DREAM Mini panda_orb_full_view | Average Error 0.416 | 3 |
Showing 25 of 37 rows