TY - GEN
T1 - Unveiling the Depths
T2 - 2025 IEEE International Conference on Robotics and Automation, ICRA 2025
AU - Xu, Jialei
AU - Li, Rui
AU - Cheng, Kai
AU - Jiang, Junjun
AU - Liu, Xianming
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Monocular depth estimation from RGB images plays a pivotal role in 3D vision. However, its accuracy can deteriorate in challenging environments such as nighttime or adverse weather conditions. While long-wave infrared cameras offer stable imaging in such conditions, they are inherently low-resolution and lack the rich texture and semantics that RGB images deliver. Current methods focus solely on a single modality because it is difficult to identify and integrate faithful depth cues from both sources. To address these issues, this paper presents a novel approach that identifies and integrates dominant cross-modality depth features with a learning-based framework. Concretely, we independently compute coarse depth maps with separate networks, fully utilizing the individual depth cues from each modality. Because the advantageous depth cues are spread across both modalities, we propose a novel confidence loss that steers a confidence predictor network to yield a confidence map specifying latent potential depth areas. With the resulting confidence map, we propose a multi-modal fusion network that fuses the final depth in an end-to-end manner. With the proposed pipeline, our method achieves robust depth estimation in a variety of difficult scenarios. Experimental results on the challenging MS2 and ViViD++ datasets demonstrate the effectiveness and robustness of our method.
UR - https://www.scopus.com/pages/publications/105016700279
U2 - 10.1109/ICRA55743.2025.11127354
DO - 10.1109/ICRA55743.2025.11127354
M3 - Conference contribution
AN - SCOPUS:105016700279
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 6283
EP - 6290
BT - 2025 IEEE International Conference on Robotics and Automation, ICRA 2025
A2 - Ott, Christian
A2 - Admoni, Henny
A2 - Behnke, Sven
A2 - Bogdan, Stjepan
A2 - Bolopion, Aude
A2 - Choi, Youngjin
A2 - Ficuciello, Fanny
A2 - Gans, Nicholas
A2 - Gosselin, Clement
A2 - Harada, Kensuke
A2 - Kayacan, Erdal
A2 - Kim, H. Jin
A2 - Leutenegger, Stefan
A2 - Liu, Zhe
A2 - Maiolino, Perla
A2 - Marques, Lino
A2 - Matsubara, Takamitsu
A2 - Mavromatti, Anastasia
A2 - Minor, Mark
A2 - O'Kane, Jason
A2 - Park, Hae-Won
A2 - Rekleitis, Ioannis
A2 - Renda, Federico
A2 - Ricci, Elisa
A2 - Riek, Laurel D.
A2 - Sabattini, Lorenzo
A2 - Shen, Shaojie
A2 - Sun, Yu
A2 - Wieber, Pierre-Brice
A2 - Yamane, Katsu
A2 - Yu, Jingjin
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 19 May 2025 through 23 May 2025
ER -