TY - GEN
T1 - APAC-Net
T2 - 9th International Conference on Intelligence Science and Big Data Engineering, IScIDE 2019
AU - Lin, Rui
AU - Lu, Yao
AU - Lu, Guangming
N1 - Publisher Copyright:
© 2019, Springer Nature Switzerland AG.
PY - 2019
Y1 - 2019
N2 - We propose a novel unsupervised method, Attention-Pixel and Attention-Channel Network (APAC-Net), for monocular learning of scene depth and ego-motion estimation. Our model uses only monocular image sequences and requires no additional sensor information, such as IMU or GPS, for supervision. The attention mechanism is employed in APAC-Net to improve the network’s efficiency. Specifically, three attention modules are proposed to adjust feature weights during training. Moreover, to minimize the effect of noise produced during reconstruction, the Image-reconstruction loss based on PSNR, LPSNR, is used to evaluate the reconstruction quality. In addition, to address failed depth estimation for objects close to the camera, the Temporal-consistency loss LTemp between adjacent frames and the Scale-based loss LScale across different scales are proposed. Experimental results show that APAC-Net performs well in both the depth and ego-motion tasks, and it even achieves better results on several metrics on KITTI and Cityscapes.
AB - We propose a novel unsupervised method, Attention-Pixel and Attention-Channel Network (APAC-Net), for monocular learning of scene depth and ego-motion estimation. Our model uses only monocular image sequences and requires no additional sensor information, such as IMU or GPS, for supervision. The attention mechanism is employed in APAC-Net to improve the network’s efficiency. Specifically, three attention modules are proposed to adjust feature weights during training. Moreover, to minimize the effect of noise produced during reconstruction, the Image-reconstruction loss based on PSNR, LPSNR, is used to evaluate the reconstruction quality. In addition, to address failed depth estimation for objects close to the camera, the Temporal-consistency loss LTemp between adjacent frames and the Scale-based loss LScale across different scales are proposed. Experimental results show that APAC-Net performs well in both the depth and ego-motion tasks, and it even achieves better results on several metrics on KITTI and Cityscapes.
KW - Attention mechanism
KW - Depth estimation
KW - Ego-motion estimation
UR - https://www.scopus.com/pages/publications/85077128654
U2 - 10.1007/978-3-030-36189-1_28
DO - 10.1007/978-3-030-36189-1_28
M3 - Conference contribution
AN - SCOPUS:85077128654
SN - 9783030361884
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 336
EP - 348
BT - Intelligence Science and Big Data Engineering. Visual Data Engineering - 9th International Conference, IScIDE 2019, Proceedings, Part 1
A2 - Cui, Zhen
A2 - Pan, Jinshan
A2 - Zhang, Shanshan
A2 - Xiao, Liang
A2 - Yang, Jian
PB - Springer
Y2 - 17 October 2019 through 20 October 2019
ER -