Instruction Following

Benchmarks

Dataset Name	SOTA Method	Metric	Value	Count	Updated
IFEval		IFEval Accuracy	95	854	29d ago
AlpacaEval 2.0	Attention-MoA	Win Rate	95.87	752	1mo ago
AlpacaEval		Win Rate	98.4	423	16d ago
MT-Bench	GPT-4-1106-preview	MT-Bench Score	9.32	287	26d ago
Arena Hard		Win Rate	98.11	263	1mo ago
InstructBench	SYTTA-8	Dolly (BLEU)	75.27	224	4mo ago
UnNI	MINILLM	Rouge-L	40.2	178	2mo ago
Alpaca	EAGLE3 + DM	Speedup (x)	5.27	173	26d ago
AlpacaEval 2	ξ-DPO	LC (%)	75.4	147	18d ago
S-NI	Adversarial Moment-Matching Distillation	Rouge-L	38.7	119	4mo ago
DollyEval		Rouge-L	32.5	114	4mo ago
AdvancedIF	BRAID	Accuracy	71	102	18d ago
IFEval	Self Consistency (Best on Validation)	Accuracy (IFEval)	90.39	101	1mo ago
Vicuna		Rouge-L	20.93	101	2mo ago
AlpacaEval 2.0 (test)	Offline+Humanline (G2-9B Completions)	LC Win Rate (%)	67.45	95	2mo ago
IFEval		IFEval Score	94.64	92	18d ago
IFEval (test)	SPOT	IFEval Score	84.8	92	1mo ago
Natural Instructions (test)	CoLoRA	Rouge-L	97.9	90	4mo ago
FollowBench	ImpRIF-32B	HSR	79	85	1mo ago
ALFWorld	M2CL	Accuracy	89.3	82	4mo ago
DomainBench	SYTTA	Agriculture Score	21.85	80	4mo ago
VicunaEval	IOA	VicunaEval Score	40.75	80	4mo ago
SelfInst	Adversarial Moment-Matching Distillation	Rouge-L	21.7	73	3mo ago
IFBench		Accuracy	77.8	72	2mo ago
IFBench	Nemotron-Cascade-2 30B-A3B	Pass@1 (Strict)	82.9	72	4mo ago

Showing 25 of 465 rows

...