Predicate Grounding on Search and Rescue domain
[Chart: F1 Score. Marked point: GinSign, 100, Dec 18, 2025]
Updated 1mo ago
Evaluation Results
Method                                  Date      F1 Score
GinSign                                 2025.12   100
GPT-4o                                  2025.12   95.1
GPT-3.5 Turbo                           2025.12   95
GPT-4o                                  2025.12   94.9
GPT-3.5 Turbo                           2025.12   94
GinSign                                 2025.12   91.1
GPT-4.1 Mini                            2025.12   90.9
GPT-4.1 Mini                            2025.12   87.7
Lang2LTL (distinguishes_predicat...)    2025.12   77.6