Abstract
Given the performance, volume, and power constraints of edge computing, a single-chip Field Programmable Gate Array (FPGA), with its parallel execution, flexible configuration, and power efficiency, is well suited to accelerating Convolutional Neural Networks (CNNs). However, implementing a lightweight CNN with limited on-chip resources while maintaining high computing efficiency and utilization remains challenging. To achieve efficient single-chip acceleration, we implement a Network-on-Chip (NoC) built from Processing Elements (PEs), each consisting of multiple node arrays. The computing and memory efficiency of each PE is further optimized with a sharing function and a hybrid memory. To maximize resource utilization, a theoretical model is constructed to explore the parallel parameters and running cycles of each PE. In experiments on LeNet and MobileNet, resource utilization reaches 83.61% and 95.28%, with throughputs of 53.3 Giga Operations Per Second (GOPS) and 41.9 GOPS, respectively. Power measurements show power efficiencies of 77.25 GOPS/W and 85.51 GOPS/W on our platform, sufficient for efficient inference in edge computing.
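For context, since power efficiency is throughput divided by power, the board power implied by the abstract's figures can be back-computed (these wattages are derived here, not stated in the source):

$$
P_{\text{LeNet}} = \frac{53.3\ \text{GOPS}}{77.25\ \text{GOPS/W}} \approx 0.69\ \text{W}, \qquad
P_{\text{MobileNet}} = \frac{41.9\ \text{GOPS}}{85.51\ \text{GOPS/W}} \approx 0.49\ \text{W}
$$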
| Original language | English |
|---|---|
| Pages (from-to) | 13867-13881 |
| Number of pages | 15 |
| Journal | Applied Intelligence |
| Volume | 53 |
| Issue number | 11 |
| DOIs | |
| State | Published - Jun 2023 |
| Externally published | Yes |
Keywords
- Acceleration architecture
- CNN
- Efficient inference
- FPGA