Abstract
Recently, a large number of image compressive sensing (CS) methods with deep unfolding networks (DUNs) have been proposed. However, existing methods either use fixed-scale blocks for sampling that leads to limited insights into the image content or employ a plain convolutional neural network (CNN) in each iteration that weakens the perception of broader contextual prior. In this paper, we propose a novel DUN (dubbed SVASNet) for image compressive sensing, which achieves scale-variable adaptive sampling and hybrid-attention Transformer reconstruction with a single model. Specifically, for scale-variable sampling, a sampling matrix-based calculator is first employed to evaluate the reconstruction distortion, which only requires measurements without access to the ground truth image. Then, a Block Scale Aggregation (BSA) strategy is presented to compute the reconstruction distortion under block divisions at different scales and select the optimal division scale for sampling. To realize hybrid-attention reconstruction, a dual Cross Attention (CA) submodule in the gradient descent step and a Spatial Attention (SA) submodule in the proximal mapping step are developed. The CA submodule introduces inter-phase inertial forces in the gradient descent, which improves the memory effect between adjacent iterations. The SA submodule integrates local and global prior representations of CNN and Transformer, and explores local and global affinities between dense feature representations. Extensive experimental results show that the proposed SVASNet achieves significant improvements over the state-of-the-art methods.
| Original language | English |
|---|---|
| Pages (from-to) | 4333-4347 |
| Number of pages | 15 |
| Journal | IEEE Transactions on Multimedia |
| Volume | 27 |
| DOIs | |
| State | Published - 2025 |
Keywords
- Image compressive sensing
- deep unfolding network
- hybrid-attention transformer
- proximal gradient descent (PGD)
- scale-variable adaptive sampling
Fingerprint
Dive into the research topics of 'Image Compressive Sensing With Scale-Variable Adaptive Sampling and Hybrid-Attention Transformer Reconstruction'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver