Local Gain Adaptation in Stochastic Gradient Descent

N. N. Schraudolph. Local Gain Adaptation in Stochastic Gradient Descent. In Proc. Intl. Conf. Artificial Neural Networks (ICANN), Edinburgh, Scotland, pp. 569–574. IEE, London, 1999.

Download

pdf (354.2kB) · djvu (95.8kB) · ps.gz (209.4kB)

Abstract

Gain adaptation algorithms for neural networks typically adjust learning rates by monitoring the correlation between successive gradients. Here we discuss the limitations of this approach, and develop an alternative by extending Sutton's work on linear systems to the general, nonlinear case. The resulting online algorithms are computationally little more expensive than other acceleration techniques, do not assume statistical independence between successive training patterns, and do not require an arbitrary smoothing parameter. In our benchmark experiments, they consistently outperform other acceleration methods, and show remarkable robustness when faced with non-i.i.d. sampling of the input space.
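
The abstract contrasts two ideas: adapting each weight's gain from the correlation of successive gradients, and the extension of Sutton's linear-system work developed in the paper. The Python sketch below illustrates only the first, simpler mechanism, to make it concrete. It is not the paper's algorithm; the objective, constants, and names (sgd_with_local_gains, eta0, mu) are all illustrative assumptions.

import numpy as np

def sgd_with_local_gains(grad_fn, w, steps=1000, eta0=0.005, mu=0.01):
    """Plain SGD with one adaptive gain (learning rate) per parameter.

    Each gain is scaled up when successive gradient components agree
    in sign and scaled down when they disagree; the exponential keeps
    every gain strictly positive.  (Illustrative sketch only.)
    """
    gains = np.full_like(w, eta0)   # per-parameter learning rates
    g_prev = np.zeros_like(w)       # previous gradient
    for _ in range(steps):
        g = grad_fn(w)
        # Correlation-driven update: grow gains where successive
        # gradients point the same way, shrink them where they clash.
        gains *= np.exp(mu * np.sign(g * g_prev))
        w = w - gains * g
        g_prev = g
    return w

if __name__ == "__main__":
    # Illustrative ill-conditioned quadratic f(w) = 0.5 * w.T @ A @ w,
    # chosen only to exercise the per-parameter gains.
    A = np.diag([1.0, 100.0])
    w_final = sgd_with_local_gains(lambda w: A @ w, np.array([1.0, 1.0]))
    print("final w:", w_final)

The multiplicative (exponentiated-gradient style) update is one common way to keep gains positive. Note that a rule of this kind implicitly treats successive gradients as independent samples, which is precisely the assumption the abstract says the paper's algorithms dispense with.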

BibTeX Entry

@inproceedings{Schraudolph99b,
     author = {Nicol N. Schraudolph},
      title = {\href{http://nic.schraudolph.org/pubs/Schraudolph99b.pdf}{
               Local Gain Adaptation in Stochastic Gradient Descent}},
      pages = {569--574},
  booktitle =  icann,
    address = {Edinburgh, Scotland},
  publisher = {IEE, London},
       year =  1999,
   b2h_type = {Top Conferences},
  b2h_topic = {>Stochastic Meta-Descent},
   abstract = {
    Gain adaptation algorithms for neural networks typically adjust learning
    rates by monitoring the correlation between successive gradients.  Here
    we discuss the limitations of this approach, and develop an alternative 
    by extending Sutton's work on linear systems to the general, nonlinear
    case.  The resulting online algorithms are computationally little more
    expensive than other acceleration techniques, do not assume statistical
    independence between successive training patterns, and do not require
    an arbitrary smoothing parameter.  In our benchmark experiments, they
    consistently outperform other acceleration methods, and show remarkable
    robustness when faced with non-i.i.d. sampling of the input space.
}}
