Instruction Following on IFBench (Accuracy)

Accuracy: 67.77

gpt-oss-puzzle-88B

[Chart: accuracy over time, Nov 28, 2025 to Feb 14, 2026]
Updated 4d ago

Evaluation Results

Date     Accuracy
2026.02  67.77
2026.02  67.01
2026.02  65.76
2026.02  65.56
2026.02  64.46
2026.02  63.35
2026.02  58.67
2026.02  56.55
2026.02  56.38
2026.02  56.21
2026.02  44.13
2026.02  43.71
2025.11  30.28
2025.11  25.85
2025.11  23.53
2025.11  22.19
2025.11  20.78
2026.02  20.5
2026.02  20.1
2026.02  19.8
2026.02  19.6
2026.02  19.5
2026.02  18.8
2025.11  18.37
2025.11  17.93