Quant-PIM: An Energy-efficient Processing-in-memory Accelerator for Layer-wise Quantized Neural Networks

Young Seo Lee, Eui Young Chung, Young Ho Gong, Sung Woo Chung

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)


Layer-wise quantized neural networks (QNNs), which adopt different precisions for weights or activations in a layer-wise manner, have emerged as a promising approach for embedded systems. Layer-wise QNNs deploy only the required number of data bits for computation (e.g., convolution of weights and activations), which in turn reduces computation energy compared to conventional QNNs. However, layer-wise QNNs still consume a large amount of energy in conventional memory systems, since memory accesses are not optimized for the required precision of each layer. To address this problem, we propose Quant-PIM, an energy-efficient processing-in-memory (PIM) accelerator for layer-wise QNNs. Quant-PIM selectively reads only the required data bits within a data word, depending on the precision, by deploying modified I/O gating logic in a 3D stacked memory. Thus, Quant-PIM significantly reduces energy consumption for memory accesses. In addition, Quant-PIM improves the performance of layer-wise QNNs. When the required precision is half of the weight (or activation) size or less, Quant-PIM reads two data blocks in a single read operation by exploiting the memory bandwidth saved by the selective memory access, thus providing higher compute throughput. Our simulation results show that Quant-PIM reduces system energy by 39.1% to 50.4% compared to a PIM system with 16-bit quantized precision, without accuracy loss.
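The bandwidth-doubling idea from the abstract can be sketched in a few lines. This is an illustrative model only, not the paper's hardware implementation: the 16-bit word width matches the baseline precision mentioned in the abstract, while the function names and packing layout are assumptions for illustration.

```python
WORD_BITS = 16  # baseline weight/activation width from the abstract

def blocks_per_read(precision_bits: int) -> int:
    """How many data blocks one memory read can serve: when the layer's
    precision is at most half the word size, the saved bandwidth lets a
    single read return two blocks (as Quant-PIM does)."""
    return 2 if precision_bits <= WORD_BITS // 2 else 1

def pack(a: int, b: int, precision_bits: int) -> int:
    """Pack two precision_bits-wide values into one word (illustrative layout)."""
    mask = (1 << precision_bits) - 1
    return ((b & mask) << precision_bits) | (a & mask)

def unpack(word: int, precision_bits: int) -> tuple[int, int]:
    """Recover the two packed values from a word."""
    mask = (1 << precision_bits) - 1
    return word & mask, (word >> precision_bits) & mask
```

For example, an 8-bit layer gets two blocks per read (`blocks_per_read(8) == 2`), while a 12-bit layer still needs a full read per block.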

Original language: English
Journal: IEEE Embedded Systems Letters
Publication status: Accepted/In press - 2021


  • Bandwidth
  • Decoding
  • Energy consumption
  • Energy efficiency
  • Memory management
  • Processing-in-memory
  • Quantization (signal)
  • Through-silicon vias
  • accelerator
  • energy efficiency
  • layer-wise quantization
  • quantized neural network

ASJC Scopus subject areas

  • Control and Systems Engineering
  • General Computer Science


