Vision-and-Language Navigation (VLN) for Multi-Robot Navigation

Primary supervisor

Chern Hong Lim

Vision-and-Language Navigation (VLN) has emerged as a promising paradigm for enabling robots to interpret natural language instructions and navigate complex environments. Most prior work focuses on single-agent navigation. This project extends VLN to multi-robot systems, addressing the challenges of coordination, communication, and task allocation in dynamic settings. We propose a framework that integrates multi-modal perception (visual and linguistic cues) with distributed decision-making, so that multiple robots can collaboratively execute navigation tasks guided by human instructions. The framework uses a hierarchical control architecture: a global planner interprets high-level language commands and allocates subtasks, while local agents use visual grounding and reinforcement learning for fine-grained navigation. The project targets applications in search-and-rescue, warehouse automation, and infrastructure inspection, with the aim of bridging the gap between human-centered instruction and autonomous multi-agent collaboration.
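
To make the global-planner half of the hierarchical architecture described above more concrete, here is a minimal Python sketch. It is illustrative only and is not the project's implementation. Every name in it (`Robot`, `LANDMARKS`, `split_instruction`, `find_landmark`, `allocate`) is hypothetical. It assumes that an instruction can be split into subtasks on the word "and", that landmarks have known 2D coordinates, and that each subtask goes to the nearest robot that is still free. Local visual grounding and reinforcement learning are not modelled.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from math import hypot


@dataclass
class Robot:
    """A local agent, reduced here to a name, a 2D position and a task list."""
    name: str
    position: tuple[float, float]
    subtasks: list[str] = field(default_factory=list)


# Hypothetical landmark map (assumption). A real system would ground
# phrases in visual observations instead of looking up fixed coordinates.
LANDMARKS = {
    "loading dock": (0.0, 10.0),
    "storage room": (8.0, 2.0),
    "main corridor": (4.0, 5.0),
}


def split_instruction(instruction: str) -> list[str]:
    """Naive stand-in for language understanding: split on ' and '."""
    return [part.strip() for part in instruction.split(" and ") if part.strip()]


def find_landmark(subtask: str) -> tuple[float, float] | None:
    """Return the coordinates of the first known landmark named in the subtask."""
    for name, xy in LANDMARKS.items():
        if name in subtask:
            return xy
    return None


def allocate(instruction: str, robots: list[Robot]) -> dict[str, str]:
    """Global planner: assign each subtask to the closest robot still free.

    Subtasks with no known landmark, or left over once every robot is
    busy, are skipped in this sketch.
    """
    free = list(robots)
    plan: dict[str, str] = {}
    for subtask in split_instruction(instruction):
        target = find_landmark(subtask)
        if target is None or not free:
            continue
        best = min(
            free,
            key=lambda r: hypot(r.position[0] - target[0], r.position[1] - target[1]),
        )
        best.subtasks.append(subtask)
        free.remove(best)
        plan[best.name] = subtask
    return plan
```

For example, given one robot near the loading dock and another near the storage room, `allocate("inspect the loading dock and check the storage room", robots)` assigns one subtask to each robot. Greedy nearest-robot assignment is only a simple baseline. It is used here in place of whatever allocation strategy the project adopts.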