Sarcasm Detection on MUStARD (held-out)
Best result: 65.8 F1 Score (OMNISAPIENS-7B SFT)
[Chart: F1 Score over time; data point at Oct 6, 2025]
Updated 4d ago
Evaluation Results
Method                  Settings                   Date      F1 Score
OMNISAPIENS-7B SFT      fine-tuning epochs=1,...   2025.10   65.8
Qwen 2.5-Omni-7B        fine-tuning epochs=0,...   2025.10   65.6
OMNISAPIENS-7B RL       zero-shot=true             2025.10   59.6
Gemma-3-4B              fine-tuning epochs=0,...   2025.10   52.9
Qwen-2.5-VL-7B          fine-tuning epochs=0,...   2025.10   51.1
Qwen 2.5-Omni-7B SFT    fine-tuning epochs=1,...   2025.10   47.3
Qwen 2.5-Omni-7B        zero-shot=true             2025.10   44.5
HumanOmniV2-7B          fine-tuning epochs=0,...   2025.10   39.5