Instruction Following on IF-Eval
[Chart: Accuracy over time, Sep 29, 2025 to May 26, 2026. Best result: General Teacher, 84.66 Accuracy.]
Updated 6d ago
Evaluation Results
Method              Configuration               Date     Accuracy
------------------  --------------------------  -------  --------
General Teacher                                 2026.05  84.66
Base                backbone=DS-8B              2025.09  63.7
CaMOPD                                          2026.05  59.89
IPO                 backbone=DS-8B              2025.09  56.2
RealSafe            backbone=DS-8B, alignm...   2025.09  54.7
SelecTKD                                        2026.05  49.17
Medical Teacher                                 2026.05  48.98
Relaxed OPD                                     2026.05  48.61
Vanilla MOPD                                    2026.05  48.06
Base Model          Backbone=Meta-Llama-3-...   2026.02  40.48
Numerical           Backbone=Meta-Llama-3-...   2026.02  39.93
Random              Backbone=Meta-Llama-3-...   2026.02  39.74
WIM Fixed Judge     Backbone=Meta-Llama-3-...   2026.02  39.56
WIM Changing Judge  Backbone=Meta-Llama-3-...   2026.02  39.19