Alpaca

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Instruction Following | Alpaca | Speedup (x): 4.13 | 63 |
| Language Modeling | Alpaca | Perplexity: 3.22 | 31 |
| LLM Inference | Alpaca | Speedup: 2.95 | 21 |
| Safety defense against harmful fine-tuning attacks | Alpaca harmful subset (test) | Harmful Score: 26.6 | 21 |
| Conversational Ability | Alpaca (test) | Alpaca LC Win Rate: 71.87 | 20 |
| Instruction Tuning | Alpaca instruction-tuning 52k | Pairwise Winning Score: 116 | 19 |
| Long-form reasoning | Alpaca | Avg LogProb per Answer: -1.5772 | 14 |
| Prompt Recovery | Alpaca | BLEU-1: 43.24 | 14 |
| Instruction Following | Alpaca instruction-following (test) | PPL: 3.85 | 12 |
| Faithfulness Measurement | Alpaca | BLEU: 0.601 | 12 |
| Instruction Following | Alpaca (test) | Kendall's Tau: 4.96 | 11 |
| Bit-flip Inference Cost Attack | Alpaca (test) | Avg Length (Original): 1,117 | 10 |
| Fine-tuning Robustness | Alpaca Dataset | FSR: 100 | 10 |
| Adaptive Care Policy Learning | ALPACA 1000 simulated patient rollouts | Cumulative Reward: 3.38 | 7 |
| LLM Routing | Alpaca In-Domain | AUROC: 0.7202 | 7 |
| Inference Cost Attack | Alpaca Vicuna-7B (test) | Average Length: 1,874 | 6 |
| Inference Cost Attack | Alpaca Samantha-7B (test) | Average Length: 1,944 | 6 |
| Inference Cost Attack | Alpaca Llama2-7B (test) | Average Length: 191 | 6 |
| Machine Unlearning | Alpaca-57k (OOD) | Delta ASR: 41.4 | 6 |
| Machine Unlearning | Alpaca 57k (Seen) | Delta ASR: 96.7 | 6 |
| Budgeted subset selection | Alpaca 5% retention | SUM: 157.162 | 6 |
| Teacher Attribution | Alpaca | Accuracy: 56 | 6 |
| Watermark Detection | Alpaca instruction-following 52K | TPR: 61.67 | 5 |
| Budgeted subset selection | Alpaca 15% retention (train) | Total Sum: 134.25 | 5 |
| Safety Alignment | Alpaca 7B (test) | HV Score: 1.2916 | 5 |
Showing 25 of 35 rows