Q-Bench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Adversarial Attack | Q-Bench | Attack Success Rate: 87.23 | 37 |
| Vision Question Answering | Q-Bench LLVisionQA 1.0 (dev) | Yes-or-No Score: 80.01 | 20 |
| Video Quality Understanding | Q-Bench-Video (dev) | Yes-or-No Acc: 76.78 | 14 |
| Image Quality Understanding | Q-Bench subset (dev) | Yes/No Accuracy: 85.82 | 14 |
| Low-level Vision Evaluation | Q-Bench (test) | Overall Score: 63.6 | 11 |
| Visual Difference Discernment | Q-Bench2 | Overall Score: 74.2 | 9 |
| Multi-modal Understanding | Q-Bench (test) | Overall Score: 62.9 | 8 |
| Multi-image Multi-modal Understanding | Q-Bench | Accuracy: 74.4 | 2 |