Efficient BackProp

Yann A. LeCun, Léon Bottou, Genevieve B. Orr, Klaus-Robert Müller

    Research output: Chapter in Book/Report/Conference proceeding › Chapter

    1664 Citations (Scopus)

    Abstract

    The convergence of back-propagation learning is analyzed so as to explain common phenomena observed by practitioners. Many undesirable behaviors of backprop can be avoided with tricks that are rarely exposed in serious technical publications. This paper gives some of those tricks, and offers explanations of why they work. Many authors have suggested that second-order optimization methods are advantageous for neural net training. It is shown that most "classical" second-order methods are impractical for large neural networks. A few methods are proposed that do not have these limitations.
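
    Among the chapter's best-known tricks are normalizing each input variable to zero mean and unit variance, using the sigmoid f(x) = 1.7159 tanh(2x/3) (chosen so that f(±1) = ±1), and drawing initial weights with standard deviation 1/sqrt(fan-in). A minimal NumPy sketch of these three recipes follows; it is an illustration, not the authors' code, and the layer sizes and random seed are arbitrary assumptions.

        import numpy as np

        def normalize_inputs(X):
            # Shift each input variable to zero mean and scale to unit
            # variance so all weights learn at comparable speeds.
            mu, sigma = X.mean(axis=0), X.std(axis=0)
            sigma[sigma == 0] = 1.0  # guard against constant features
            return (X - mu) / sigma

        def scaled_tanh(x):
            # Recommended sigmoid f(x) = 1.7159 * tanh(2x/3); it satisfies
            # f(1) = 1 and f(-1) = -1 and has gain close to 1 near the origin.
            return 1.7159 * np.tanh(2.0 * x / 3.0)

        def init_weights(fan_in, fan_out, rng):
            # Standard deviation 1/sqrt(fan_in) keeps each unit's total input
            # in the sigmoid's linear region at the start of training.
            return rng.normal(0.0, 1.0 / np.sqrt(fan_in), size=(fan_in, fan_out))

        # One forward pass through a single hidden layer (sizes are arbitrary).
        rng = np.random.default_rng(0)
        X = normalize_inputs(rng.normal(3.0, 5.0, size=(64, 10)))
        W = init_weights(10, 4, rng)
        H = scaled_tanh(X @ W)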

    Original language: English
    Title of host publication: Neural Networks
    Subtitle of host publication: Tricks of the Trade
    Publisher: Springer Verlag
    Pages: 9-48
    Number of pages: 40
    ISBN (Print): 9783642352881
    Publication status: Published - 2012

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 7700
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    ASJC Scopus subject areas

    • Theoretical Computer Science
    • General Computer Science
