UAV Corrective Code Generation on AirSim simple_flight
[Chart: SR and Completeness by date. Top entry: LLM-driven corrective robot operation code generation framework, SR 90, Dec 1, 2025]
Updated 1mo ago
Evaluation Results
Method                                                           LLM Model  Date     SR    Completeness
LLM-driven corrective robot operation code generation framework  o4-mini    2025.12  90    98.3
Semantic                                                         o4-mini    2025.12  88.3  98.1
Semantic                                                         o3-mini    2025.12  85    98.5
LLM-driven corrective robot operation code generation framework  o3-mini    2025.12  85    97
Numerical                                                        o4-mini    2025.12  81.7  96.1
Numerical                                                        o3-mini    2025.12  73.3  92.4
Direct                                                           o3-mini    2025.12  43.3  70.9
Direct                                                           o4-mini    2025.12  33.3  57.6