
Multi-Scale Spiking Pyramid Wireless Communication Framework for Food Recognition

  • Harbin Institute of Technology
  • Wireless Technology Lab

Research output: Contribution to journal › Article › peer-review

Abstract

Food recognition for human health applications has recently attracted significant attention in computer vision. With the advancement of mobile devices, robust food recognition over wireless communication links has become a practical and challenging application scenario. We propose a novel Multi-scale Spiking Pyramid Transmission Network (MSPTN) to tackle this challenge. The MSPTN learns diverse and complementary local and global feature maps simultaneously, generating a comprehensive description of food images that captures the correlations of food-specific features. The feature sender uses a three-layer Spiking Neural Network (SNN) that compresses features into sparse, discrete spike trains, significantly reducing the required transmission bandwidth and improving channel utilization and energy efficiency. Our model introduces the Compressed Factorized Bilinear (CFB) block, which employs a low-rank feature approximation to reduce computational complexity and feature transmission volume while preserving the discriminative features. An enhancement reasoning module is proposed to enhance the received features by projecting them into a higher-dimensional space and then using self-attention and sum pooling to compress them back to the original dimension. We conduct extensive experiments on the ETH Food-101 and Food2k datasets. Our results show that the MSPTN achieves state-of-the-art recognition performance, even with binary spike trains. The MSPTN also exhibits remarkable robustness in wireless communication scenarios. With the combination of CFB, SNN, and EFB, our model achieves significant efficiency gains, including a nearly nine-fold decrease in feature transmission volume and a three-fold improvement in runtime and computational memory efficiency.
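
The abstract does not specify how the three-layer SNN sender is built, so the following is only a minimal sketch of the general idea of turning real-valued features into binary spike trains. It assumes a single layer of leaky integrate-and-fire (LIF) neurons with a hard reset and a rate-based decoder at the receiver. The function names, the threshold, the decay factor, and the number of timesteps are all hypothetical and are not taken from the paper.

```python
import numpy as np

def lif_encode(features, timesteps=4, threshold=1.0, decay=0.5):
    """Encode features as binary spike trains, one LIF neuron per feature.

    All parameter values are illustrative assumptions, not the paper's settings.
    """
    features = np.asarray(features, dtype=np.float32)
    membrane = np.zeros_like(features)
    spikes = np.zeros((timesteps,) + features.shape, dtype=np.uint8)
    for t in range(timesteps):
        # Leak the previous potential, then integrate the constant input.
        membrane = decay * membrane + features
        fired = membrane >= threshold
        spikes[t] = fired
        # Hard reset: neurons that fired start again from zero.
        membrane[fired] = 0.0
    return spikes

def rate_decode(spikes):
    """Receiver-side estimate: the firing rate of each feature over time."""
    return spikes.mean(axis=0)

features = np.array([0.0, 0.3, 0.6, 1.2], dtype=np.float32)
spikes = lif_encode(features)
rates = rate_decode(spikes)

float_bits = features.size * 32   # raw float32 payload
spike_bits = spikes.size          # one bit per neuron per timestep
print(spikes)
print(rates, float_bits, spike_bits)
```

In this toy setting each feature costs one bit per timestep instead of 32 bits, so the payload shrinks as long as the number of timesteps stays well below 32. Weak inputs never reach the threshold and send no spikes at all, which is where the sparsity comes from. The ratio here depends entirely on the assumed timestep count and is not meant to reproduce the transmission savings reported in the abstract.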

Original language: English
Pages (from-to): 2734-2746
Number of pages: 13
Journal: IEEE Transactions on Multimedia
Volume: 27
DOIs
State: Published - 2025

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being
  2. SDG 7 - Affordable and Clean Energy

Keywords

  • Food recognition
  • low-rank approximation
  • spiking neural network
  • wireless communication
