Procedural Planning on Macro Average Zero-shot
Chart: Macro Accuracy (Zero-shot). Top result: GPT-4o-mini, 69.7 (May 19, 2026). Updated 14d ago.
Evaluation Results

Method              Tag                          Date      Macro Accuracy (Zero-shot)
GPT-4o-mini         Model capacity=Frontie...    2026.05   69.7
Gemini-1.5-Flash    Model capacity=Frontie...    2026.05   69
RECIPE-RL 7B        Model configuration=7B       2026.05   46.1
Qwen2.5-7B (Base)   Model configuration=7B       2026.05   32.5