In this paper, we show that parameters of a neural network can have redundancy in their ranks, both theoretically and empirically. When viewed as a function from one space to another, neural networks can exhibit feature correlation and slower training due to this redundancy. Motivated by this, we propose a novel regularization method to reduce the redundancy in the rank of parameters. It is a combination of an objective function that makes the parameter rank-deficient and a dynamic low-rank factorization algorithm that gradually reduces the size of this parameter by fusing linearly dependent vectors together. This regularization-by-pruning approach leads to a neural network with better training dynamics and fewer trainable parameters. We also present experimental results that verify our claims. When applied to a neural network trained to classify images, this method provides statistically significant improvement in accuracy and 7.1 times speedup in terms of number of steps required for training. Furthermore, this approach has the side benefit of reducing the network size, which led to a model with 30.65% fewer trainable parameters.
Bibliographical noteFunding Information:
This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2020-2018-0-01405) supervised by the IITP (Institute for Information and Communications Technology Planning and Evaluation). Additionally, it was also supported by the MSIT (Ministry of Science and ICT), Korea, under the ICT Creative Consilience program(IITP-2020-0-01819) supervised by the IITP (Institute for Information and communications Technology Planning and Evaluation).
© 2021 by the authors. Li-censee MDPI, Basel, Switzerland.
- Matrix rank
- Neural network
ASJC Scopus subject areas
- General Materials Science
- General Engineering
- Process Chemistry and Technology
- Computer Science Applications
- Fluid Flow and Transfer Processes