Instruction Following Evaluation on IFEval (random subset of 50 prompts)
[Chart: Task Fidelity by date. DIRECTER: 85.9 (Mar 6, 2026).]
Evaluation Results
Method      Date      Task Fidelity   Text Quality
DIRECTER    2026.03   85.9            4.36
Zero-shot   2026.03   84              4.36
PASTA       2026.03   81.5            4.17