TY - GEN
T1 - Real-Time Pixel-Wise Grasp Detection Based on RGB-D Feature Dense Fusion
AU - Wu, Yongxiang
AU - Fu, Yili
AU - Wang, Shuguo
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021/8/8
Y1 - 2021/8/8
N2 - This paper presents a real-time fully convolutional network for detecting the grasp pose and confidence of each pixel in RGB-D images. Instead of processing RGB-D data equally, we transform the depth image into a point cloud and use a heterogeneous architecture to embed and densely fuse RGB-D information into semantically rich features. To improve computational efficiency, we propose and integrate a novel point sampling and matching mechanism into the dense fusion. A proposed Uniform Index Sampling (UIS) algorithm samples points uniformly and quickly, and the corresponding color and geometry features are matched via a designed Index Image, which is also used for the consistent transformation of RGB-D data. By making full use of RGB-D information, our model achieves accuracies of 99.1% on the Cornell dataset and 96.4% on the Jacquard dataset, surpassing current state-of-the-art methods. Moreover, benefiting from the efficient point sampling and matching mechanism, our method runs at a real-time speed of 8 milliseconds per frame. The proposed method is robust in physical grasping, achieving success rates of 97% on a household set, 90% on an adversarial set, and 91% when grasping in clutter.
AB - This paper presents a real-time fully convolutional network for detecting the grasp pose and confidence of each pixel in RGB-D images. Instead of processing RGB-D data equally, we transform the depth image into a point cloud and use a heterogeneous architecture to embed and densely fuse RGB-D information into semantically rich features. To improve computational efficiency, we propose and integrate a novel point sampling and matching mechanism into the dense fusion. A proposed Uniform Index Sampling (UIS) algorithm samples points uniformly and quickly, and the corresponding color and geometry features are matched via a designed Index Image, which is also used for the consistent transformation of RGB-D data. By making full use of RGB-D information, our model achieves accuracies of 99.1% on the Cornell dataset and 96.4% on the Jacquard dataset, surpassing current state-of-the-art methods. Moreover, benefiting from the efficient point sampling and matching mechanism, our method runs at a real-time speed of 8 milliseconds per frame. The proposed method is robust in physical grasping, achieving success rates of 97% on a household set, 90% on an adversarial set, and 91% when grasping in clutter.
KW - deep learning
KW - feature fusion
KW - grasp detection
KW - grasping
UR - https://www.scopus.com/pages/publications/85115185871
U2 - 10.1109/ICMA52036.2021.9512605
DO - 10.1109/ICMA52036.2021.9512605
M3 - Conference contribution
AN - SCOPUS:85115185871
T3 - 2021 IEEE International Conference on Mechatronics and Automation, ICMA 2021
SP - 970
EP - 975
BT - 2021 IEEE International Conference on Mechatronics and Automation, ICMA 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 18th IEEE International Conference on Mechatronics and Automation, ICMA 2021
Y2 - 8 August 2021 through 11 August 2021
ER -