Science of Machine Learning

Exhibition Program 5

Stable deep learning for time-series data

Preventing gradient explosions in gated recurrent units

Abstract

We propose a method to stabilize the training of recurrent neural networks (RNNs). The RNN is one of the most successful models for handling time-series data in applications such as speech recognition and machine translation. However, training RNNs is difficult due to the exploding gradient problem, and therefore requires trial and error as well as expertise. In this study, we focus on the Gated Recurrent Unit (GRU), one of the modern RNN models. We identify the point in parameter space at which training of GRUs is disrupted by exploding gradients, and we propose an algorithm that prevents the gradient from exploding. Our method reduces the time spent on trial and error and does not require in-depth expertise to tune the hyper-parameters for GRU training.
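For context, below is a minimal sketch of the standard workaround that the abstract alludes to: training a GRU with gradient-norm clipping. This is our illustration in PyTorch, not the proposed algorithm; the model sizes and hyper-parameters are arbitrary. Note that clipping itself introduces a threshold hyper-parameter (max_norm) that must be tuned by trial and error, which is exactly the cost the proposed method aims to remove.

    import torch
    import torch.nn as nn

    # Illustrative baseline (assumed setup, not the authors' method):
    # a GRU regressor trained with gradient-norm clipping.
    torch.manual_seed(0)
    gru = nn.GRU(input_size=8, hidden_size=16, batch_first=True)
    head = nn.Linear(16, 1)
    params = list(gru.parameters()) + list(head.parameters())
    opt = torch.optim.SGD(params, lr=0.1)

    x = torch.randn(4, 50, 8)   # toy batch: (batch, time, features)
    y = torch.randn(4, 1)       # toy regression targets

    for step in range(100):
        opt.zero_grad()
        _, h_n = gru(x)         # h_n: final hidden state, shape (1, 4, 16)
        loss = nn.functional.mse_loss(head(h_n[-1]), y)
        loss.backward()
        # Rescale gradients whose global norm exceeds max_norm; over long
        # sequences, backpropagated gradients can otherwise grow rapidly.
        torch.nn.utils.clip_grad_norm_(params, max_norm=1.0)
        opt.step()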

Poster

The poster is available as a full-size PDF file.

Presenters

Sekitoshi Kanai
Software Innovation Center
Yasutoshi Ida
Software Innovation Center
Yu Oya
Software Innovation Center
Yasuhiro Iida
Software Innovation Center