SASENet: multimodal 3D object detection for Gm-APD LiDAR based on semantic and spatial enhancement

  • Yuanxue Ding
  • Dongyang Liu
  • Yanchen Qu
  • Dakuan Du
  • Guanlin Chen
  • Xuefeng Dong
  • Jianfeng Sun*
  • *Corresponding author for this work
  • Harbin Institute of Technology
  • Xi'an Modern Control Technology Research Institute
  • School of Electronics and Information Engineering, Harbin Institute of Technology

Research output: Contribution to journal, Article, peer-review

Abstract

Three-dimensional (3D) object detection in point clouds, a critical component of intelligent perception, has attracted considerable research attention. However, the sparsity and lack of semantic information in point clouds generated by long-range Geiger-mode avalanche photodiode (Gm-APD) LiDAR pose significant challenges, as unimodal detection struggles to distinguish structurally similar objects. To address this limitation, we propose SASENet, a multimodal 3D object detection network that integrates semantic and spatial enhancements. Specifically, at the input stage, we introduce a Semantic Spatial Enhancement Module (SSEM). Horizontally, we align the interpolated Gm-APD LiDAR range image with the infrared image and generate semantically enhanced point clouds through semantic segmentation of the infrared image. Vertically, we upsample the sparse point clouds to obtain semantic-spatially enhanced point clouds, enriching their structural information. At the feature interaction stage, we propose a Bidirectional Feature Interaction Module (BFIM) based on a dual-stream architecture, which enhances cross-modal semantic correlations by enabling bidirectional interactions between infrared image features and LiDAR point cloud features. Extensive experiments demonstrate that SASENet achieves competitive performance on our self-constructed dataset, particularly excelling in long-range 3D object detection.
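
The semantic half of the SSEM step described in the abstract can be pictured with a short example. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the range image and the infrared segmentation label map are already aligned pixel for pixel, and it assumes a simple equiangular sensor model to turn each range pixel into a 3D point. The function name `paint_points` and the parameters `fov_h_deg` and `fov_v_deg` are hypothetical and do not come from the paper.

```python
import math


def paint_points(range_image, label_map, fov_h_deg, fov_v_deg):
    """Turn a pixel-aligned range image and label map into labelled 3D points.

    range_image: list of rows of ranges in metres (0 or None means no return).
    label_map:   list of rows of integer class ids, same shape as range_image.
    Returns a list of (x, y, z, class_id) tuples.

    Assumption: an equiangular sensor model, where each pixel spans an equal
    slice of the horizontal and vertical field of view.
    """
    rows = len(range_image)
    cols = len(range_image[0])
    if len(label_map) != rows or any(len(r) != cols for r in label_map):
        raise ValueError("range image and label map must have the same shape")

    points = []
    for v in range(rows):
        # Elevation angle of this row, measured from the optical axis.
        elev = math.radians(((rows - 1) / 2 - v) / rows * fov_v_deg)
        for u in range(cols):
            r = range_image[v][u]
            if not r:
                # Skip pixels with no range return.
                continue
            # Azimuth angle of this column, measured from the optical axis.
            azim = math.radians((u - (cols - 1) / 2) / cols * fov_h_deg)
            x = r * math.cos(elev) * math.cos(azim)  # forward
            y = r * math.cos(elev) * math.sin(azim)  # right
            z = r * math.sin(elev)                   # up
            # Attach the class id of the aligned image pixel to the point.
            points.append((x, y, z, label_map[v][u]))
    return points
```

Each output point carries the class of the image pixel it coincides with, which is the sense in which the point cloud becomes "semantically enhanced". The spatial upsampling step and the BFIM feature interaction are not covered by this sketch.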

Original language: English
Article number: 106145
Journal: Infrared Physics and Technology
Volume: 151
State: Published - Dec 2025

Keywords

  • Gm-APD LiDAR
  • Multimodal 3D object detection
  • Semantic enhancement
  • Spatial enhancement
