Prompt Hygiene Evaluation on IFEval Hard 10-sample subset
[Chart: Readability (Human) by method; PREFPO shown at 1.43, dated Mar 13, 2026. Selectable metrics: Readability, Specification Quality, Maintainability, and Total Score, each in (Human) and (LLM) variants.]
Updated 26d ago
Evaluation Results
| Method | Readability (Human) | Specification Quality (Human) | Maintainability (Human) | Total Score (Human) | Readability (LLM) | Specification Quality (LLM) | Maintainability (LLM) | Total Score (LLM) |
| PREFPO (2026.03) | 1.43 | 0.93 | 1.17 | 3.53 | 1.53 | 1.07 | 1.17 | 3.77 |
| TextGrad (2026.03) | 1.1 | 0.7 | 0.8 | 2.6 | 1.23 | 0.63 | 0.9 | 2.77 |
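The page does not say how the Total Score columns are computed. In the results above they appear to be the sum of the three component scores (Readability, Specification Quality, Maintainability). The sketch below checks that assumption against the reported numbers. The dictionary layout and the `total_gap` helper are illustrative names, not part of the benchmark.

```python
# Scores copied from the results above, per (method, rater):
# (readability, specification quality, maintainability, reported total)
scores = {
    ("PREFPO", "Human"): (1.43, 0.93, 1.17, 3.53),
    ("PREFPO", "LLM"): (1.53, 1.07, 1.17, 3.77),
    ("TextGrad", "Human"): (1.10, 0.70, 0.80, 2.60),
    ("TextGrad", "LLM"): (1.23, 0.63, 0.90, 2.77),
}


def total_gap(readability, spec_quality, maintainability, reported_total):
    """Absolute difference between the summed components and the reported total.

    Assumes (not confirmed by the source) that the total is a plain sum.
    """
    return abs(readability + spec_quality + maintainability - reported_total)


# Round to two decimals, the precision of the reported scores.
gaps = {key: round(total_gap(*vals), 2) for key, vals in scores.items()}

for (method, rater), gap in gaps.items():
    print(f"{method:8s} {rater:5s} gap={gap:.2f}")
```

Under this assumption every reported total matches its component sum to within 0.01. The only nonzero gap is the TextGrad (LLM) row, whose components sum to 2.76 against a reported 2.77, which is consistent with the components having been rounded before display.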