Abstract
Convolutional neural networks (CNNs) continue to grow deeper. As depth increases, the invalid calculations caused by zero-padding operations, zero-filling operations, and stride lengths greater than 1 account for a growing proportion of all calculations. To adapt to different CNNs and to eliminate the influence of zero padding, zero filling, and stride length on the computational efficiency of the accelerator, we draw on the computation pattern of CPUs to design an efficient and versatile CNN accelerator, LACS (Loading-Addressing-Computing-Storing). A bypass buffer mechanism reduces the amount of data movement between the registers and the on-chip buffer from O(k × k) to O(k). Finally, we deploy LACS on a field-programmable gate array (FPGA) chip and analyze the factors that affect its computational efficiency. We also run popular CNNs on LACS. The results show that LACS achieves an extremely high computational efficiency, 98.51% when executing AlexNet and 99.66% when executing VGG-16, significantly exceeding that of state-of-the-art accelerators.
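To make the two quantitative claims above concrete, here is a minimal Python sketch. It is not taken from the paper: the function names `padded_mac_fraction` and `row_loads` are hypothetical, k is assumed to be the side length of a square kernel, and the column-reuse count is a generic sliding-window argument rather than the LACS bypass buffer, whose details the abstract does not give.

```python
def padded_mac_fraction(height, width, k, pad, stride=1):
    """Fraction of multiply-accumulates in a naive sliding-window
    convolution whose input operand is a padded zero (assumed model)."""
    out_h = (height + 2 * pad - k) // stride + 1
    out_w = (width + 2 * pad - k) // stride + 1
    total = out_h * out_w * k * k
    valid = 0
    for oy in range(out_h):
        for ox in range(out_w):
            for ky in range(k):
                for kx in range(k):
                    # Map the kernel tap back to unpadded input coordinates.
                    iy = oy * stride + ky - pad
                    ix = ox * stride + kx - pad
                    if 0 <= iy < height and 0 <= ix < width:
                        valid += 1
    return (total - valid) / total


def row_loads(width, k):
    """Values fetched while sliding a k x k window along one row at stride 1.

    Returns (naive, reuse): naive reloads all k*k values per window;
    reuse keeps the overlapping columns and fetches only k new values
    per step after the first window (generic reuse, not the paper's design).
    """
    windows = width - k + 1
    naive = windows * k * k
    reuse = k * k + (windows - 1) * k
    return naive, reuse


if __name__ == "__main__":
    print(padded_mac_fraction(13, 13, k=3, pad=1))
    print(row_loads(8, 3))
```

Under this model, each window step costs k × k fetches when nothing is reused and k fetches when the overlapping columns are kept, which is the kind of reduction the abstract describes.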
| Original language | English |
|---|---|
| Article number | 8944026 |
| Pages (from-to) | 6045-6059 |
| Number of pages | 15 |
| Journal | IEEE Access |
| Volume | 8 |
| State | Published - 2020 |
| Externally published | Yes |
Keywords
- Accelerator
- buffer mechanism
- convolutional neural networks (CNNs)
- field-programmable gate array (FPGA)
Title
LACS: A High-Computational-Efficiency Accelerator for CNNs