LLaVA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Text Membership Inference Attack | LLaVA LLM Pre-training | AUC: 0.688 | 88 |
| Text Membership Inference Attack | LLaVA VLLM Tuning | AUC: 0.993 | 44 |
| Multimodal Understanding | LLaVA Evaluation Suite GQA, MMB, MMB-CN, MME, POPE, SQA, VQAV2, VQAText, VizWiz | GQA: 64.2 | 41 |
| Multimodal Understanding | LLaVA Evaluation Suite 1.5 | GQA: 63.2 | 32 |
| Visual Instruction Following | LLaVA-W | Score: 102 | 28 |
| Multimodal Large Language Model Inference Efficiency | LLaVA 13B 1.5 (test) | TTFT (ms): 60.2 | 21 |
| Hallucination detection | llava | AUC ROC: 96.5 | 19 |
| Vision-Language Understanding and Reasoning | LLaVA Multimodal Evaluation Suite (GQA, MMBench, MME, POPE, ScienceQA, VQAv2, TextVQA, SEED-Bench, MM-Vet, VizWiz) 1.5 (test/val) | GQA: 0.619 | 16 |
| Large Vision-Language Model evaluation | LLaVA Evaluation Suite (MMBench, MME, MM-Vet, ScienceQA) 1.5 (test val) | MMBench: 68.5 | 16 |
| Jailbreak Detection | LLaVA Vicuna-7B v1.6 | Accuracy: 92 | 13 |
| Image Captioning | MC-LLaVA | Caption Recall (Single): 83.6 | 11 |
| Vision-Language | LLaVa 1.5 | GQA Score: 63.01 | 11 |
| Vision Understanding | LLaVA-W | Score: 63 | 10 |
| Large Vision-Language Model Evaluation | LLAVA (bench) | Score: 77.8 | 10 |
| Adversarial Attack | LLaVA | CLIP Similarity (RN-50): 0.2427 | 9 |
| Pointwise Scoring | LLaVA-W pointwise | Kendall's Tau: 0.949 | 9 |
| Multimodal Instruction Following | LLaVA Wilder | Score: 92 | 9 |
| Inference Efficiency | LLaVA 7B 1.5 | Latency (ms): 802.65 | 8 |
| Vision Understanding | LLaVA-Wild | LLaVA-Wild Accuracy: 74.2 | 8 |
| Communication Cost Analysis | LLaVA 1.5 | Total Latency (s): 97.268 | 7 |
| Vision-Language Evaluation | LLaVA-Wilder | Accuracy: 83.7 | 7 |
| Open-ended Visual Question Answering | LLaVA Eval v1 (test) | Conversation Score: 77.67 | 7 |
| Black-Box Adversarial Attack | LLaVA 1.5 | KMR (a): 0.96 | 6 |
| Inference Efficiency | LLaVA-NeXT Inference | Inference Time (s): 7.998 | 6 |
| Knowledge Transfer | LLaVA Evaluation Suite Flickr30k 1.5 | VQAv2 Accuracy: 78.52 | 6 |
Showing 25 of 37 rows