MUSTARD

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Table Structure Recognition | MUSTARD | S-TEDS: 95.3 | 20 |
| Multimodal Sarcasm Detection | MUStARD original (speaker-dependent) | Precision: 86.8 | 15 |
| Multimodal Sarcasm Detection | MUStARD speaker-independent original | F1 Score: 75.6 | 13 |
| Sarcasm Detection | MUStARD | Accuracy: 76.62 | 13 |
| Multimodal Classification | MUSTARD | Accuracy: 69.86 | 13 |
| Sarcasm Detection | MUSTARD++ (test) | F1 Score: 83.2 | 13 |
| Sarcasm Detection | MUStARD (held-out) | F1 Score: 65.8 | 8 |
| Sarcasm Detection (SAR) | MUStARD | Weighted F1: 0.795 | 7 |
| Sarcasm Understanding | MuSTARD++ | Precision: 81.2 | 7 |
| Sarcasm Understanding | MuSTARD | Precision: 86.8 | 7 |
| Multimodal Sarcasm Detection | MUStARD | Accuracy: 80.6 | 6 |
| Multimodal Sarcasm Explanation | MUStARD (speaker dependent) | ROUGE-1: 38.4 | 6 |
| Multimodal Sarcasm Detection | MUStARD speaker dependent | Precision: 86.8 | 6 |
| Sarcasm Detection | MUSTARD (test) | F1 Score: 78.7 | 5 |
| Multimodal Retrieval | MUStARD | Recall@10 (Q: M23, T: M1): 81 | 4 |
| Multi-modal Classification | MUSTARD (test) | Accuracy: 72.5 | 4 |
| Model Selection | MUSTARD (unseen) | Performance: 95.15 | 1 |