Abstract: As unmanned aerial vehicles (UAVs) become more prevalent in smart cities, their capacity for visual language navigation (VLN) is garnering increasing interest. VLN in cities has significant ...
1 University of Science and Technology of China; 2 WeChat, Tencent Inc.

1. A Novel Parameter Space Alignment Paradigm

Recent MLLMs follow an input space alignment paradigm that aligns visual features ...
Abstract: Visual target navigation is a critical capability for autonomous robots operating in unknown environments, particularly in human-robot interaction scenarios. While classical and ...