Entropy-Constrained Training of Deep Neural Networks

Simon Wiedemann, Arturo Marban, Klaus-Robert Müller, Wojciech Samek

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

9 Citations (Scopus)


Motivated by the Minimum Description Length (MDL) principle, we first derive an expression for the entropy of a neural network which measures its complexity explicitly in terms of its bit-size. Then, we formalize the problem of neural network compression as an entropy-constrained optimization objective. This objective generalizes many of the compression techniques currently proposed in the literature, in that pruning or reducing the cardinality of the weight elements can be seen as special cases of entropy reduction methods. Furthermore, we derive a continuous relaxation of the objective, which allows us to minimize it using gradient-based optimization techniques. Finally, we show that we reach compression results competitive with those obtained using state-of-the-art techniques on different network architectures and data sets, e.g. achieving ×71 compression gains on a VGG-like architecture.

Original language: English
Title of host publication: 2019 International Joint Conference on Neural Networks, IJCNN 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728119854
Publication status: Published - 2019 Jul
Event: 2019 International Joint Conference on Neural Networks, IJCNN 2019 - Budapest, Hungary
Duration: 2019 Jul 14 - 2019 Jul 19

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks


Conference: 2019 International Joint Conference on Neural Networks, IJCNN 2019

Bibliographical note

Publisher Copyright:
© 2019 IEEE.


Keywords

  • Neural network compression
  • cardinality reduction
  • entropy minimization
  • pruning

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence


