Instruction Following

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Instruction Following | Instruction-Following (val) | ROUGE-L: 31.1 | 33 |
| Preference Modeling | Instruction Following | Accuracy: 65.2 | 20 |
| Instruction Following | Instruction Following (test) | NMSE: 0.9 | 11 |
| Instruction Following | Instruction Following | LC: 0.2027 | 10 |
| Reasoning | Instruction following | Normalized Score: 100 | 9 |
| Instruction-Following | Instruction-Following Alpaca-V2 Arena-Hard | Alpaca V2 Score: 9.2 | 6 |
| Instruction Following | Instruction Following SFT 1.0 (eval) | SFT Score: 59.4 | 6 |
| Instruction Following | Instruction Following Clean | F1 Score: 72.79 | 4 |
| Instruction Following | Instruction Following Poisoned | F1 Score: 99.48 | 2 |