
Verifiable Instruction Following on IFEval (test)

Prompt Loose Accuracy: 75.23

LLAMA-3.1-TULU-3-8B-DPO

[Chart: Prompt Loose Accuracy, Dec 18, 2025]
Updated 1mo ago

Evaluation Results

Date       Prompt Loose Accuracy
2025.12    75.23
2025.12    73.01
2025.12    71.72
2025.12    71.71
2025.12    71.53
2025.12    67.46
2025.12    61.92