Bi-AQUA: Bilateral Control-Based Imitation Learning for Underwater Robot Arms via Lighting-Aware Action Chunking with Transformers
About
Underwater robotic manipulation remains challenging because lighting variation, color attenuation, scattering, and reduced visibility can severely degrade visuomotor policies. We present Bi-AQUA, the first underwater bilateral control-based imitation learning framework for robot arms that explicitly models lighting within the policy. Bi-AQUA integrates transformer-based bilateral action chunking with a hierarchical lighting-aware design composed of a label-free Lighting Encoder, FiLM-based visual feature modulation, and a lighting token for action conditioning. This design enables adaptation to static and dynamically changing underwater illumination while preserving the force-sensitive advantages of bilateral control, which are particularly important in long-horizon and contact-rich manipulation. Real-world experiments on underwater pick-and-place, drawer closing, and peg extraction tasks show that Bi-AQUA outperforms a bilateral baseline without lighting modeling and achieves robust performance under seen, unseen, and changing lighting conditions. These results highlight the importance of combining explicit lighting modeling with force-aware bilateral imitation learning for reliable underwater manipulation. For additional material, please check: https://mertcookimg.github.io/bi-aqua
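To make the lighting-aware design concrete, below is a minimal toy sketch of the two conditioning mechanisms the abstract names: FiLM-style per-channel modulation of visual features by a lighting embedding, and the same embedding appended as a lighting token to the transformer's input sequence. This is not the Bi-AQUA implementation; the actual Lighting Encoder is a learned, label-free network, whereas here it is a stand-in built from image statistics, and all function names, dimensions, and weights are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def lighting_encoder(image, embed_dim=32):
    """Stand-in for the label-free Lighting Encoder: pools crude image
    statistics into a lighting embedding (the real model is learned)."""
    # Per-channel mean and std as a rough proxy for illumination/color cast.
    stats = np.concatenate([image.mean(axis=(0, 1)), image.std(axis=(0, 1))])
    # Fixed random projection to the embedding size (placeholder for a CNN).
    proj = rng.standard_normal((stats.shape[0], embed_dim))
    return stats @ proj

def film(features, lighting_embed, n_channels):
    """FiLM: per-channel affine modulation gamma * f + beta, where gamma
    and beta are predicted from the lighting embedding."""
    w = rng.standard_normal((lighting_embed.shape[0], 2 * n_channels))
    gamma, beta = np.split(lighting_embed @ w, 2)
    return gamma[None, None, :] * features + beta[None, None, :]

# Toy forward pass: one RGB frame and an (H, W, C) visual feature map.
image = rng.random((64, 64, 3))
features = rng.random((16, 16, 32))

light = lighting_encoder(image)           # lighting embedding, shape (32,)
modulated = film(features, light, 32)     # FiLM-modulated visual features

# The same embedding is also appended as a "lighting token" alongside the
# flattened visual tokens, conditioning the action-chunking transformer.
tokens = np.vstack([modulated.reshape(-1, 32), light[None, :]])
print(tokens.shape)  # (257, 32): 256 visual tokens + 1 lighting token
```

The key design point the sketch illustrates is that lighting enters the policy twice: multiplicatively/additively at the feature level (FiLM) and as an explicit token at the sequence level, so both perception and action generation can adapt to illumination.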
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Pick-&-Place | Pick-and-Place Untrained Lighting | Success Rate (Cyan) | 100 | 7 |
| Pick-&-Place | Pick-and-Place Trained Lighting | Success Rate (Red Lighting) | 100 | 4 |
| Pick-&-Place | Pick-and-Place Generalization Trained Lighting | Success Rate (Red) | 100 | 3 |