GRU and LSTM

Compared with a plain RNN, GRU and LSTM add gates (such as the update gate and the forget gate) that control how the hidden state is modified at each timestep.

LSTM

An LSTM cell has three gates:

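The forget gate controls how much of the previous cell state is passed on to the next cell state
The update gate controls how much of the candidate value is added to the next cell state
The output gate controls how much of the cell state is exposed as the output used for prediction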
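
The three gates above can be sketched as a single LSTM step. This is a minimal illustration, not a reference implementation: it assumes a scalar input and a scalar state (hidden size 1) so that it needs only the standard library, and the function name lstm_step, the parameter dictionary p, and its keys "f", "u", "o", "c" are hypothetical names chosen for this example.

```python
import math


def sigmoid(x):
    """Logistic function, squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step with scalar input x and scalar states h_prev, c_prev.

    p maps each of "f", "u", "o", "c" to a (w_x, w_h, b) tuple.
    These parameter names are assumptions made for this sketch.
    """
    def pre(name):
        # Pre-activation: weighted input plus weighted previous hidden state plus bias.
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("f"))            # forget gate: share of c_prev to keep
    u = sigmoid(pre("u"))            # update gate: share of the candidate to add
    o = sigmoid(pre("o"))            # output gate: share of the cell state to expose
    c_tilde = math.tanh(pre("c"))    # candidate value
    c = f * c_prev + u * c_tilde     # next cell state
    h = o * math.tanh(c)             # next hidden state (used for prediction)
    return h, c


# Forget gate pushed towards 1 and update gate towards 0:
# the cell state is carried over almost unchanged, whatever the input is.
params = {"f": (0.0, 0.0, 20.0), "u": (0.0, 0.0, -20.0),
          "o": (0.0, 0.0, 0.0), "c": (1.0, 0.0, 0.0)}
h, c = lstm_step(x=5.0, h_prev=0.0, c_prev=0.7, p=params)
```

With the forget gate near 1 and the update gate near 0, c stays close to the previous cell state. This is the mechanism that lets an LSTM carry information across many timesteps. A practical implementation would use weight matrices and vector states instead of scalars.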

Difference between GRU and LSTM:

LSTM is generally better at addressing vanishing gradients and at carrying information across many timesteps
GRU tends to focus on local input (near the current timestep)
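
For comparison, here is a GRU step written in the same scalar style. It is a sketch under stated assumptions: the names gru_step, p, and the keys "u", "r", "h" are hypothetical, and it uses the convention in which the update gate weights the candidate and (1 - update gate) weights the previous state. Some references swap these two roles. The structural difference it illustrates is that a GRU has no separate cell state and no independent forget gate: a single update gate decides both how much to keep and how much to overwrite.

```python
import math


def sigmoid(x):
    """Logistic function, squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h_prev, p):
    """One GRU step with scalar input x and scalar hidden state h_prev.

    p maps each of "u", "r", "h" to a (w_x, w_h, b) tuple.
    These parameter names are assumptions made for this sketch.
    """
    w_x, w_h, b = p["u"]
    u = sigmoid(w_x * x + w_h * h_prev + b)              # update gate

    w_x, w_h, b = p["r"]
    r = sigmoid(w_x * x + w_h * h_prev + b)              # reset (relevance) gate

    w_x, w_h, b = p["h"]
    h_tilde = math.tanh(w_x * x + w_h * (r * h_prev) + b)  # candidate value

    # One gate plays both roles: u adds the candidate, (1 - u) keeps the old state.
    return u * h_tilde + (1.0 - u) * h_prev


# Update gate pushed towards 0: the previous state is kept almost unchanged.
keep = {"u": (0.0, 0.0, -20.0), "r": (0.0, 0.0, 0.0), "h": (1.0, 0.0, 0.0)}
h_keep = gru_step(x=5.0, h_prev=0.7, p=keep)

# Update gate pushed towards 1: the state is overwritten by the candidate.
overwrite = {"u": (0.0, 0.0, 20.0), "r": (0.0, 0.0, 0.0), "h": (1.0, 0.0, 0.0)}
h_new = gru_step(x=5.0, h_prev=0.7, p=overwrite)
```

Because keeping and overwriting are tied to one gate here, the GRU has fewer parameters than the LSTM, whose forget and update gates are learned independently.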
