Predicate Grounding on ProDG Cooking
[Chart: F1 Score; best result 100 (Symbolizer), Apr 20, 2026]
Updated 1mo ago
Evaluation Results
Method      Configuration              Date     F1 Score
Symbolizer  Model=Gem.3.1-Pro          2026.04  100
Symbolizer  Model=Gem.3.1-FL           2026.04  100
SYMBOLIZER  Backbone=Gemini 3.1 Pro    2026.04  100
SYMBOLIZER  Backbone=Gemini 3.1 Fl...  2026.04  100
SYMBOLIZER  Backbone=Gem.3.1-Pro       2026.04  100
SYMBOLIZER  Backbone=Gem.3.1-FL        2026.04  100
SYMBOLIZER  Backbone=Mistral Small     2026.04  99.6
Symbolizer  Model=Mistral Small        2026.04  96.4
SYMBOLIZER  Backbone=Mistral Small     2026.04  96.2
ViLaIn      Backbone=GPT-4, Retry...   2026.04  96
ViLaIn      Backbone=GPT-4, Retry...   2026.04  96
ViLaIn      Backbone=Gemini 3.1 Fl...  2026.04  95.7
ViLaIn      Backbone=Gemini 3.1 Fl...  2026.04  95.7
ViLaIn      Backbone=GPT-4, Retry...   2026.04  93
ViLaIn      Backbone=Gem. 3.1-FL,...   2026.04  92.3
ViLaIn      Backbone=Gem. 3.1-FL,...   2026.04  92.3
ViLaIn      Model=GPT-4, Retries=t...  2026.04  91
ViLaIn      Backbone=GPT-4, Retry...   2026.04  88
ViLaIn      Model=Gem. 3.1-FL, Ret...  2026.04  84.6
ViLaIn      Model=Gem. 3.1-FL, Ret...  2026.04  84.6
ViLaIn      Model=GPT-4, Retries=n...  2026.04  69