Instruction Following and Safety Alignment on AlpacaEval Benign (n=50)
[Chart: WinRate by date; top entry Best-of-N at 97 (Oct 10, 2025). Selectable metrics: WinRate, Llama-Guard P(unsafe).]
Updated 1d ago
Evaluation Results
Method            Details                 Date      WinRate   Llama-Guard P(unsafe)
Best-of-N         Generator=GPT-OSS-20B   2025.10   97        0.0145
Threshold filter  Generator=GPT-OSS-20B   2025.10   95        0.0099
SG                Generator=GPT-OSS-20B   2025.10   93        0.0036