Predicate Grounding on Kitchen-Worlds
[Chart: F1 Score over time. Labeled point: SYMBOLIZER, 100, Apr 20, 2026.]
Updated 1mo ago
Evaluation Results
Method      Configuration              Date     F1 Score
SYMBOLIZER  Backbone=Gemini 3.1 Pro    2026.04  100
SYMBOLIZER  Backbone=Mistral Small     2026.04  100
Symbolizer  Model=Gem.3.1-Pro          2026.04  85.1
SYMBOLIZER  Backbone=Gemini 3.1 Fl...  2026.04  83.9
Symbolizer  Model=Mistral Small        2026.04  75.4
Symbolizer  Model=Gem.3.1-FL           2026.04  69.5
SYMBOLIZER  Backbone=Gem.3.1-Pro       2026.04  1
SYMBOLIZER  Backbone=Gem.3.1-FL        2026.04  0.8
SYMBOLIZER  Backbone=Mistral Small     2026.04  0.7