TY - GEN
T1 - Cascaded Tracking via Pyramid Dense Capsules
AU - Ma, Ding
AU - Wu, Xiangqian
N1 - Publisher Copyright: © 2020, Springer Nature Switzerland AG.
PY - 2020
Y1 - 2020
N2 - Tracking-by-detection is a two-stage framework: it collects candidates around the target object and then classifies each candidate as either the target object or background. Although methods based on Convolutional Neural Networks (CNNs) have been successful within the tracking-by-detection framework, the inherent flaws of CNNs still affect performance. CNNs rely on positional invariance (i.e., they lose the spatial relationships between features) and therefore cannot capture small affine transformations, which ultimately results in drift. To solve this problem, we exploit the spatial relationships encoded by Capsule Networks (CapsNets) for the tracking-by-detection framework. To strengthen the encoding power of convolutional capsules, we generate them through a pyramid dense capsules (PDCaps) architecture. The resulting pyramid dense capsule representation captures comprehensive spatial relationships within the input. In addition, a critical challenge in the tracking-by-detection framework is avoiding overfitting and mismatch during training and inference, because a reasonable intersection over union (IoU) threshold for defining true/false positives is hard to set. To address the IoU threshold setting, we propose a cascaded PDCaps model, consisting of a sequence of PDCaps models trained with increasing IoU thresholds, which improves the quality of candidates stage by stage. Extensive experiments demonstrate that our tracker performs favorably against state-of-the-art approaches.
KW - Cascaded architecture
KW - Pyramid and dense capsules
KW - Visual tracking
UR - https://www.scopus.com/pages/publications/85101416587
U2 - 10.1007/978-3-030-68238-5_45
DO - 10.1007/978-3-030-68238-5_45
M3 - Conference contribution
AN - SCOPUS:85101416587
SN - 9783030682378
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 683
EP - 696
BT - Computer Vision – ECCV 2020 Workshops, Proceedings
A2 - Bartoli, Adrien
A2 - Fusiello, Andrea
PB - Springer Science and Business Media Deutschland GmbH
T2 - Workshops held at the 16th European Conference on Computer Vision, ECCV 2020
Y2 - 23 August 2020 through 28 August 2020
ER -