
Procedural Planning on Macro Average Zero-shot

69.7 Macro Accuracy (Zero-shot)

GPT-4o-mini

May 19, 2026
Updated 14d ago

Evaluation Results

Method  Links
2026.05  69.7
2026.05  69
2026.05  46.1
2026.05  32.5