UAV Code Generation on PX4 controller in Gazebo
[Chart: best SR 93.3 (Semantic), Dec 1, 2025. Metrics shown: SR, Completeness.]
Updated 1mo ago
Evaluation Results
| Method | LLM Model | Date | SR | Completeness |
|---|---|---|---|---|
| Semantic | o4-mini | 2025.12 | 93.3 | 98.6 |
| LLM-driven corrective robot operation code generation framework | o4-mini | 2025.12 | 93.3 | 96.9 |
| Semantic | o3-mini | 2025.12 | 88.3 | 98.3 |
| LLM-driven corrective robot operation code generation framework | o3-mini | 2025.12 | 86.7 | 97.7 |
| Numerical | o4-mini | 2025.12 | 86.7 | 96.9 |
| Numerical | o3-mini | 2025.12 | 83.3 | 97.1 |
| Direct | o3-mini | 2025.12 | 55 | 77.5 |
| Direct | o4-mini | 2025.12 | 25 | 50.9 |