Instruction Following on AlpacaEval LC 2.0
Top result: Gaussian time sampling, AlpacaEval 2.0 LC Score 78.61
[Figure: AlpacaEval 2.0 LC Score and Standard Error (SE) over time, Mar 6, 2025 to May 13, 2026]
Updated 20d ago
Evaluation Results
| Method | Details | Date | AlpacaEval 2.0 LC Score | Standard Error (SE) |
|---|---|---|---|---|
| Gaussian time sampling | Judge=GPT-4o, Evaluati... | 2026.05 | 78.61 | - |
| Llama-3.1-Nemotron-70B-Instruct | Feedback + Edit Protoc... | 2025.03 | 62.8 | 1.3 |
| Llama-3.1-Nemotron-70B-Instruct | Feedback + Edit Protoc... | 2025.03 | 57.6 | 1.65 |
| GPT-4o-2024-05-13 | | 2025.03 | 57.5 | 1.47 |
| Claude-3-5-Sonnet-20240620 | | 2025.03 | 52.4 | 1.47 |
| Llama-3.1-405B-Instruct | | 2025.03 | 39.3 | 1.43 |
| Llama-3.1-70B-Instruct | | 2025.03 | 38.1 | 0.9 |
| Llama-3.3-70B-Instruct | Feedback + Edit Protoc... | 2025.03 | 36.9 | 1.5 |
| Llama-3.3-70B-Instruct | Feedback + Edit Protoc... | 2025.03 | 35 | 1.45 |