Machine Learning: LSTM

Posted on 2022-01-07

LSTM: Long Short Term Memory networks

Cite: Understanding-LSTMs, Learning-Memory-Access-Patterns

Core Structure of LSTM

An LSTM is composed of a hidden state $h$ and a cell state $C$, along with input ($i$), forget ($f$), and output ($o$) gates that dictate what information gets stored and propagated to the next timestep.

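Note: $[\ ]$ denotes concatenation, so $\left[h_{t-1},x_t\right]$ is the previous hidden state and the current input joined into a single vector.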
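To make the concatenation concrete, here is a minimal sketch in plain Python. The sizes, values, and the names `hidden_size`, `input_size`, `h_prev`, `x_t`, `W_f`, and `b_f` are assumptions chosen only for illustration; they do not come from the cited sources.

```python
# Hypothetical sizes and values, chosen only for illustration.
hidden_size = 3              # length of h_{t-1}
input_size = 2               # length of x_t

h_prev = [0.1, -0.2, 0.3]    # h_{t-1}
x_t = [1.0, 0.5]             # x_t

# [h_{t-1}, x_t]: join the two vectors end to end.
z = h_prev + x_t
print(len(z))  # 5, i.e. hidden_size + input_size

# Each gate's weight matrix maps the concatenated vector back to
# hidden_size values, so it has hidden_size rows and
# (hidden_size + input_size) columns.
W_f = [[0.0] * len(z) for _ in range(hidden_size)]
b_f = [0.0] * hidden_size

# W_f . [h_{t-1}, x_t] + b_f: one pre-activation value per hidden unit.
pre_activation = [
    sum(w * v for w, v in zip(row, z)) + b
    for row, b in zip(W_f, b_f)
]
print(len(pre_activation))  # 3, i.e. hidden_size
```

The point of the sketch is the shapes: every gate sees the same concatenated vector, and each gate produces one value per hidden unit.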

Forget or inherit: the forget gate decides how much of the previous cell state is carried over.
$$f_t=\sigma\left(W_f\cdot\left[h_{t-1},x_t\right]+b_f\right)$$
Input new information: the input gate decides how much of the new candidate values influences the cell state.
$$i_t=\sigma\left(W_i\cdot\left[h_{t-1},x_t\right]+b_i\right)$$
$$\widetilde{C_t}=\tanh\left(W_C\cdot\left[h_{t-1},x_t\right]+b_C\right)$$
Calculate the new cell state (here $*$ is element-wise multiplication):
$$C_t=f_t*C_{t-1}+i_t * \widetilde{C_t}$$
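A small worked example may help show how the two gates act on each component of the cell state independently. The gate and state values below are made up for illustration; in a real network they would come from the sigmoid and tanh expressions in the formulas.

```python
# Made-up values, one entry per cell-state component.
f_t = [1.0, 0.0, 0.5]           # forget gate: keep all, keep none, keep half
C_prev = [2.0, 2.0, 2.0]        # previous cell state C_{t-1}
i_t = [0.0, 1.0, 0.5]           # input gate: write none, write all, write half
C_tilde = [-1.0, -1.0, -1.0]    # candidate values

# C_t = f_t * C_{t-1} + i_t * C_tilde, applied element by element.
C_t = [f * c + i * g for f, c, i, g in zip(f_t, C_prev, i_t, C_tilde)]
print(C_t)  # [2.0, -1.0, 0.5]
```

The first component is inherited unchanged, the second is fully overwritten by the candidate, and the third is a blend of old and new.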
Decide the output, i.e. the new hidden state: the output gate is computed from the previous hidden state and the current input, and it filters the current cell state.
$$o_t=\sigma\left(W_o\cdot\left[h_{t-1},x_t\right]+b_o\right)$$
$$h_t=o_t*\tanh(C_t)$$
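Putting the equations together, one timestep can be sketched as below. This is a minimal illustration using only the Python standard library, not an efficient or reference implementation. The helper names `sigmoid`, `affine`, and `lstm_step`, the parameter dictionary layout, and the sizes in the usage part are all assumptions made for this example.

```python
import math


def sigmoid(v):
    """Logistic function, the sigma in the gate equations."""
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, z, b):
    """Return W . z + b, with W stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, z)) + b_k for row, b_k in zip(W, b)]


def lstm_step(x_t, h_prev, C_prev, p):
    """One LSTM timestep; p is a hypothetical dict of weights and biases."""
    z = h_prev + x_t  # [h_{t-1}, x_t] as list concatenation

    f_t = [sigmoid(v) for v in affine(p["W_f"], z, p["b_f"])]        # forget gate
    i_t = [sigmoid(v) for v in affine(p["W_i"], z, p["b_i"])]        # input gate
    C_tilde = [math.tanh(v) for v in affine(p["W_C"], z, p["b_C"])]  # candidate
    o_t = [sigmoid(v) for v in affine(p["W_o"], z, p["b_o"])]        # output gate

    # New cell state: element-wise mix of old state and candidate.
    C_t = [f * c + i * g for f, c, i, g in zip(f_t, C_prev, i_t, C_tilde)]
    # New hidden state: output gate applied to the squashed cell state.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, C_t)]
    return h_t, C_t


# Usage with made-up sizes and all-zero parameters (illustration only).
hidden_size, input_size = 2, 1
zero_W = [[0.0] * (hidden_size + input_size) for _ in range(hidden_size)]
zero_b = [0.0] * hidden_size
params = {}
for gate in ("f", "i", "C", "o"):
    params["W_" + gate] = zero_W
    params["b_" + gate] = zero_b

h_t, C_t = lstm_step([1.0], [0.0, 0.0], [1.0, -2.0], params)
print(C_t)  # [0.5, -1.0]
```

With all-zero parameters every gate evaluates to $\sigma(0)=0.5$ and the candidate to $\tanh(0)=0$, so the cell state is simply halved, which makes the result easy to check by hand. To process a sequence, one would call `lstm_step` once per input and feed the returned `h_t` and `C_t` into the next call.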