Abstract
Recent cutting-edge deep neural network (DNN) models, such as large language models (LLMs), demonstrate remarkable capabilities, but their inherently dense data limits the performance and energy gains achievable through sparse acceleration. In this paper, we introduce the iSPADE architecture, which sparsifies the end-to-end execution of dense DNNs so that they can directly exploit the advantages of sparse acceleration without accuracy-sensitive techniques such as pruning. First, we propose an inverted-bit representation that eliminates the repetitive sign bits of the 2's complement representation. Because the inverted-bit representation generates a significant number of zero bits, we then propose data packing and computation skipping techniques that reduce both redundant data movement and redundant computation. Finally, we present the iSPADE bit-slice hardware architecture, which efficiently supports and accelerates the proposed sparse dataflow. We evaluate iSPADE on general DNN workloads using 8 popular DNNs. iSPADE achieves 4.1X and 4.5X improvements in energy efficiency and speedup, respectively, over the previous state-of-the-art bit-slice accelerators, and it realizes a 1.7X reduction in memory footprint.
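The abstract does not spell out the inverted-bit encoding, so the following is only an illustrative sketch of the general idea, under the assumption that the sign bit is kept and the remaining bits of negative values are flipped. The function names `invert_bits`, `restore`, and `zero_bits` and the 8-bit width are hypothetical and are not taken from the paper. It shows how a small negative value, whose 2's complement pattern is dominated by repeated sign bits, can turn into a pattern that is mostly zeros.

```python
def invert_bits(x: int, width: int = 8) -> int:
    """Assumed encoding: keep the sign bit, flip the other bits of negative values."""
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    if not lo <= x <= hi:
        raise ValueError(f"{x} does not fit in {width} signed bits")
    pattern = x & ((1 << width) - 1)       # 2's complement bit pattern
    low_mask = (1 << (width - 1)) - 1      # every bit except the sign bit
    if pattern >> (width - 1):             # negative value
        pattern ^= low_mask                # repeated 1s become 0s
    return pattern


def restore(code: int, width: int = 8) -> int:
    """Inverse of invert_bits: recover the signed integer."""
    low_mask = (1 << (width - 1)) - 1
    if code >> (width - 1):                # sign bit set: undo the flip
        return (code ^ low_mask) - (1 << width)
    return code


def zero_bits(pattern: int, width: int = 8) -> int:
    """Count the zero bits in a width-bit pattern."""
    return width - bin(pattern).count("1")


if __name__ == "__main__":
    value = -3
    plain = value & 0xFF
    coded = invert_bits(value)
    print(f"2's complement: {plain:08b} ({zero_bits(plain)} zero bits)")
    print(f"inverted form:  {coded:08b} ({zero_bits(coded)} zero bits)")
```

Under this assumed encoding, non-negative values are left unchanged and the mapping is lossless, so the extra zero bits come purely from the change of representation rather than from pruning or approximation.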
Original language | English |
---|---|
Title of host publication | Proceedings of the 29th International Symposium on Low Power Electronics and Design, ISLPED 2024 |
Publisher | Association for Computing Machinery, Inc |
ISBN (Electronic) | 9798400706882 |
Publication status | Published - 2024 Aug 5 |
Event | 29th ACM/IEEE International Symposium on Low Power Electronics and Design, ISLPED 2024 - Newport Beach, United States; Duration: 2024 Aug 5 → 2024 Aug 7 |
Publication series
Name | Proceedings of the 29th International Symposium on Low Power Electronics and Design, ISLPED 2024 |
---|---|
Conference
Conference | 29th ACM/IEEE International Symposium on Low Power Electronics and Design, ISLPED 2024 |
---|---|
Country/Territory | United States |
City | Newport Beach |
Period | 2024 Aug 5 → 2024 Aug 7 |
Bibliographical note
Publisher Copyright: © 2024 Copyright is held by the owner/author(s). Publication rights licensed to ACM.
Keywords
- binary representation
- deep neural network
- sparse acceleration
ASJC Scopus subject areas
- Electrical and Electronic Engineering