Optimizing TensorFlow performance by reconstructing the convolution routine

Minseong Kim, Kyu Hyun Choi, Yoonah Paik, Seon Wook Kim

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Deep learning lets us build computational models composed of multiple processing layers that learn representations of data. Convolutional neural networks (CNNs) have been widely adopted and achieve strong performance in image recognition and classification. TensorFlow, an open-source deep learning framework from Google, uses profiling to select one convolution algorithm, from among several available, as the core of a CNN, aiming for the best performance in terms of execution time and memory usage. However, the profiling overhead is considerable, because TensorFlow executes and profiles all the available algorithms to make this selection every time an application is launched. We observe that memory usage overshoots during profiling, which limits data parallelism and therefore prevents the framework from delivering maximum performance. In this paper, we present a novel profiling method that reduces this overhead by storing the profile result from the first run and reusing it in every subsequent run. Using Inception-V3, we achieved up to 1.12 times and 1.11 times higher throughput compared to vanilla TensorFlow and TensorFlow with XLA JIT compilation, respectively, without losing accuracy.
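The store-and-reuse idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and it does not use any TensorFlow or cuDNN API: the two pure-Python 1-D convolution routines, the JSON cache file, and every name below (`conv_direct`, `conv_accumulate`, `profile_all`, `select_algorithm`, `run_demo`) are hypothetical stand-ins chosen for illustration. The sketch times every candidate on the first call, writes the winner to disk, and skips profiling on the next call with the same problem size.

```python
import json
import os
import tempfile
import time


def conv_direct(signal, kernel):
    """Direct 1-D 'valid' convolution: sliding dot product with the flipped kernel."""
    k = len(kernel)
    flipped = kernel[::-1]
    return [
        sum(signal[i + j] * flipped[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def conv_accumulate(signal, kernel):
    """Same result, computed by accumulating one kernel tap at a time."""
    k = len(kernel)
    n_out = len(signal) - k + 1
    out = [0.0] * n_out
    for j, w in enumerate(kernel[::-1]):
        for i in range(n_out):
            out[i] += signal[i + j] * w
    return out


# Hypothetical pool of candidate algorithms (stand-ins for a framework's real ones).
ALGORITHMS = {"direct": conv_direct, "accumulate": conv_accumulate}


def profile_all(signal, kernel):
    """Run every candidate once and return the name of the fastest."""
    timings = {}
    for name, fn in ALGORITHMS.items():
        start = time.perf_counter()
        fn(signal, kernel)
        timings[name] = time.perf_counter() - start
    return min(timings, key=timings.get)


def select_algorithm(signal, kernel, cache_path):
    """Return (algorithm_name, was_profiled), consulting the on-disk cache first."""
    key = f"{len(signal)}x{len(kernel)}"  # assumed cache key: problem size only
    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            cache = json.load(f)
    if cache.get(key) in ALGORITHMS:
        return cache[key], False  # reuse the stored result, no profiling
    best = profile_all(signal, kernel)
    cache[key] = best
    with open(cache_path, "w") as f:
        json.dump(cache, f)
    return best, True


def run_demo():
    """Simulate a first launch (profiles) and a second launch (reuses the cache)."""
    signal = [float(i % 7) for i in range(2000)]
    kernel = [1.0, 0.0, -1.0]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "conv_profile.json")
        first = select_algorithm(signal, kernel, path)
        second = select_algorithm(signal, kernel, path)
    return first, second
```

Calling `run_demo()` returns two `(name, was_profiled)` pairs: the first reports that profiling ran, the second that the cached choice was reused. Keying the cache on problem size alone is a simplification; a real system would presumably also need to account for hardware and library versions.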

    Original language: English
    Pages (from-to): 128-135
    Number of pages: 8
    Journal: IEIE Transactions on Smart Processing and Computing
    Volume: 10
    Issue number: 2
    Publication status: Published - 2021 Apr

    Bibliographical note

    Funding Information:
    This work was partially supported by SK Telecom Co., LTD.

    Publisher Copyright:
    © 2021 The Institute of Electronics and Information Engineers

    Keywords

    • Batch
    • Optimization
    • Profiling
    • TensorFlow

    ASJC Scopus subject areas

    • Signal Processing
    • Electrical and Electronic Engineering
