TY - JOUR
T1 - A system-level exploration of binary neural network accelerators with monolithic 3D based compute-in-memory SRAM
AU - Choi, Jeong Hwan
AU - Gong, Young Ho
AU - Chung, Sung Woo
N1 - Funding Information:
Funding: This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2020R1A2C2003500) and by the Ministry of Science and ICT Original Technology Program (No. 2020M3F3A2A01082329). This work was also conducted under the Research Grant of Kwangwoon University in 2020.
Publisher Copyright:
© 2021 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2021/3/1
Y1 - 2021/3/1
AB - Binary neural networks (BNNs) are well suited to energy-constrained embedded systems thanks to their binarized parameters. Several researchers have proposed compute-in-memory (CiM) SRAMs for the XNOR-and-accumulation computations (XACs) in BNNs by adding transistors to the conventional 6T SRAM cell, which reduces the latency and energy of data movement. However, due to the additional transistors, CiM SRAMs suffer from larger area and longer wires than conventional 6T SRAMs. Meanwhile, monolithic 3D (M3D) integration enables fine-grained 3D integration, reducing 2D wire length in small functional units. In this paper, we propose a BNN accelerator (BNN_Accel), composed of a 9T CiM SRAM (CiM_SRAM), an input buffer, and global periphery logic, to execute the computations in the binarized convolution layers of BNNs. We also propose CiM_SRAM with subarray-level M3D integration (as well as transistor-level M3D integration), which reduces wire latency and energy compared to the 2D planar CiM_SRAM. Across the binarized convolution layers, our simulation results show that BNN_Accel with the 4-layer CiM_SRAM reduces average execution time and energy by 39.9% and 23.2%, respectively, compared to BNN_Accel with the 2D planar CiM_SRAM.
KW - Binary neural network
KW - Compute-in-memory
KW - Energy efficiency
KW - Monolithic 3D integration
UR - http://www.scopus.com/inward/record.url?scp=85102126788&partnerID=8YFLogxK
U2 - 10.3390/electronics10050623
DO - 10.3390/electronics10050623
M3 - Article
AN - SCOPUS:85102126788
SN - 2079-9292
VL - 10
SP - 1
EP - 11
JO - Electronics (Switzerland)
JF - Electronics (Switzerland)
IS - 5
M1 - 623
ER -