Demand-driven navigation (DDN) refers to identifying and locating objects based on implicit user needs, even in dynamic and uncertain scenarios where object locations are unknown. Traditional data-driven methods rely on pre-collected data for model training and decision-making, which limits their ability to generalize to unseen scenarios. In this paper, we propose CogDDN, a framework that emulates the human attentional mechanism by selectively focusing on the key objects crucial for fulfilling user demands. CogDDN incorporates a dual-process decision-making module, comprising a Heuristic Process (System-I) for fast and efficient decision-making and an Analytic Process (System-II) that analyzes past errors, accumulates them in a knowledge base, and continuously improves performance. Chain of Thought (CoT) reasoning is employed to strengthen the decision-making process. Extensive closed-loop evaluations on the AI2Thor simulator with the ProcThor dataset demonstrate that CogDDN outperforms single-view camera-only methods by 15%, showing significant improvements in navigation accuracy and adaptability.
@inproceedings{huang2025cogddn,
title={{CogDDN}: A cognitive demand-driven navigation with decision optimization and dual-process thinking},
author={Huang, Yuehao and Liu, Liang and Lei, Shuangming and Ma, Yukai and Su, Hao and Mei, Jianbiao and Zhao, Pengxiang and Gu, Yaqing and Liu, Yong and Lv, Jiajun},
booktitle={Proceedings of the 33rd ACM International Conference on Multimedia},
pages={5237--5246},
year={2025}
}