Verifiable Instruction Following on IFEval (test)
[Chart: Prompt Loose Accuracy over time. Highest result: 75.23, LLAMA-3.1-TULU-3-8B-DPO, Dec 18, 2025. Updated 1mo ago.]
Evaluation Results
Method                   Details                    Date     Prompt Loose Accuracy
LLAMA-3.1-TULU-3-8B-DPO  Base Model=Llama-3.1-8...  2025.12  75.23
QWEN2.5-7B-INSTRUCT      Model Size=7B, Type=In...  2025.12  73.01
LLAMA-3.1-8B-INSTRUCT    Model Size=8B, Type=In...  2025.12  71.72
STACKELBERGGDA-LEADER    Role=Leader                2025.12  71.71
GEMMA-2-9B-IT            Model Size=9B, Type=In...  2025.12  71.53
LLAMA-3.1-TULU-3-8B-SFT  Base Model=Llama-3.1-8...  2025.12  67.46
STACKELBERGGDA-FOLLOWER  Role=Follower              2025.12  61.92