Implementing an RNN with torch

(This post reproduces the results of a reference article.)

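I got stuck on a project in the lab because I could not understand how LSTM works. Searching online, I found that LSTM is a variant of the RNN, so I decided to start by learning the RNN.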

An RNN with a hidden state can be described by the following two formulas:

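As the formulas show, an RNN involves the weights and biases W_xh, W_hh, b_h, W_hq and b_q, together with the hidden state h(t). The hidden state is not a fixed value: h(t) is a function of the time step.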
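The two formulas themselves are not reproduced in the text. Judging from the parameter names listed above, they are presumably the standard hidden-state update and output equations, sketched below. The symbols x(t) for the input, o(t) for the output and \phi for the activation function are assumptions made for this sketch (torch.nn.RNN uses tanh as \phi by default).

```latex
\begin{aligned}
h(t) &= \phi\bigl(x(t)\,W_{xh} + h(t-1)\,W_{hh} + b_h\bigr) \\
o(t) &= h(t)\,W_{hq} + b_q
\end{aligned}
```

The first equation is where the dependence on the step comes from: h(t) is computed from h(t-1), so it carries information about all earlier inputs.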

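The reference article considers the following problem. For a sequence of points on the x-axis we are given the sin values and want the corresponding cos values. The same sin value can correspond to different cos values, so the output depends not only on the current input sin x but also on the sin values that came before it. This is the kind of problem an RNN can solve.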
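To make the ambiguity concrete, here is a small sketch using only the standard library. The two sample points, pi/6 and 5*pi/6, are chosen for illustration and do not come from the reference article.

```python
import math

# Two different points on the x-axis, chosen for illustration
a = math.pi / 6
b = 5 * math.pi / 6

sin_a, cos_a = math.sin(a), math.cos(a)
sin_b, cos_b = math.sin(b), math.cos(b)

# The sin values agree, but the cos values have opposite signs
print(f"x={a:.4f}  sin={sin_a:.4f}  cos={cos_a:+.4f}")
print(f"x={b:.4f}  sin={sin_b:.4f}  cos={cos_b:+.4f}")
```

A model that sees only the current sin value cannot tell these two points apart. Whether the sin values were rising or falling just before is what decides the sign of cos, and that history is what the hidden state can carry.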

The core class used is torch.nn.RNN. Its parameters are as follows:
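The parameter list is not reproduced in the text. As a rough sketch, the constructor arguments that the code in this post relies on, and the tensor shapes they produce, look like this. The random input and the seed are only there to make the sketch runnable.

```python
import torch
from torch import nn

torch.manual_seed(0)  # arbitrary seed, only for repeatability

rnn = nn.RNN(
    input_size=1,         # number of features per time step
    hidden_size=32,       # number of units in the hidden state
    num_layers=1,         # number of stacked recurrent layers
    nonlinearity="tanh",  # activation; "tanh" is the default
    batch_first=True,     # input/output are [batch_size, time_step, feature]
)

x = torch.randn(1, 10, 1)  # one sequence of 10 steps, 1 feature each
output, h_n = rnn(x)       # the initial hidden state defaults to zeros

print(output.shape)  # hidden state at every step: [1, 10, 32]
print(h_n.shape)     # final hidden state: [num_layers, batch_size, hidden_size]
```

Note that h_n keeps the layout [num_layers, batch_size, hidden_size] even when batch_first=True. For a single-layer, unidirectional RNN, the last time step of output is the same tensor as h_n[0].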

Here is the code:

# encoding: utf-8
import torch
import numpy as np
import matplotlib.pyplot as plt  # plotting
from torch import nn


# Define the RNN model
class RNN(nn.Module):
    def __init__(self, input_size):
        super(RNN, self).__init__()
        # Define the RNN layer. Each input element is a single number;
        # the input-to-hidden mapping has size [feature, hidden_size].
        self.rnn = nn.RNN(
            input_size=input_size,
            hidden_size=32,
            num_layers=1,
            batch_first=True,  # input shape is therefore [batch_size, time_step, feature]
        )
        # Fully connected layer that turns each hidden state into an output value
        self.out = nn.Linear(32, 1)

    # Forward pass
    def forward(self, x, h_state):
        # x is a sequence of shape [batch_size, time_step, feature] and h_state is
        # the initial hidden state. The RNN returns the hidden output at every
        # step together with the updated hidden state.
        r_out, h_state = self.rnn(x, h_state)
        outs = []
        for time in range(r_out.size(1)):
            # r_out.size() = [1, 10, 32]: each of the 10 sequence elements has been
            # mapped to the hidden layer. Take the elements one at a time and pass
            # each through the linear layer: r_out[:, time, :].size() = [1, 32] -> [1, 1]
            outs.append(self.out(r_out[:, time, :]))
        # Stack along dim=1: 10 * [1, 1] -> [1, 10, 1]. h_state has been updated.
        return torch.stack(outs, dim=1), h_state


time_step = 10
input_size = 1
lr = 0.02

model = RNN(input_size)
print(model)

loss_func = nn.MSELoss()  # mean squared error loss
# Adam optimizes all parameters of the model: the nn.RNN layer and the nn.Linear layer
optimizer = torch.optim.Adam(model.parameters(), lr=lr)

h_state = None  # initialize h_state as None

for step in range(300):
    # Generate the input and the target: x.size() = [1, 10, 1], y.size() = [1, 10, 1]
    start, end = step * np.pi, (step + 1) * np.pi

    steps = np.linspace(start, end, time_step, dtype=np.float32)
    x_np = np.sin(steps)
    y_np = np.cos(steps)

    x = torch.from_numpy(x_np[np.newaxis, :, np.newaxis])
    y = torch.from_numpy(y_np[np.newaxis, :, np.newaxis])

    # Pass the length-10 sequence through the network to get the final hidden
    # state h_state and the length-10 output prediction: [1, 10, 1]
    prediction, h_state = model(x, h_state)
    # Keep only the value of h_state and drop its gradient history, so that
    # backpropagation does not reach into earlier iterations
    h_state = h_state.detach()

    # Backpropagation
    loss = loss_func(prediction, y)
    optimizer.zero_grad()
    loss.backward()

    # Update the network parameters, i.e. W_xh, W_hh, b_h as well as W_hq, b_q
    optimizer.step()

# Plot the result of the last iteration to check the network's predictions
plt.plot(steps, y_np.flatten(), 'r-')
plt.plot(steps, prediction.detach().numpy().flatten(), 'b-')
plt.show()

The prediction from the final step and the actual y are plotted below:

As the plot shows, once the RNN has been trained, feeding it a sequence of sin x values produces the corresponding sequence of cos x values.
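One note on the forward pass: the per-step loop is not the only way to apply the output layer. nn.Linear acts on the last dimension of its input, so applying it to the whole r_out tensor at once should give the same result as looping and stacking. The sketch below checks this on a random stand-in for r_out; the stand-in tensor and the seed are assumptions made for illustration, not part of the original code.

```python
import torch
from torch import nn

torch.manual_seed(0)  # arbitrary seed, only for repeatability

out = nn.Linear(32, 1)
r_out = torch.randn(1, 10, 32)  # stand-in for the RNN output [batch, time_step, hidden]

# Per-step version, as in the forward pass of the model
looped = torch.stack(
    [out(r_out[:, t, :]) for t in range(r_out.size(1))], dim=1
)

# Whole-sequence version: Linear is applied along the last dimension
batched = out(r_out)

print(looped.shape, batched.shape)
print(torch.allclose(looped, batched, atol=1e-6))
```

The loop makes the per-step shapes explicit, which is useful when learning; the single call is shorter and avoids the Python loop.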
