A Deep Dive into LSTM Neural Networks, with a Comparison to Classical Time-Series Models

Author | Feng Taitao (馮太濤)
Affiliation | University of Shanghai for Science and Technology
Research area | Probability theory and mathematical statistics

Why Gradients Vanish (or Explode)
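In brief: for a vanilla RNN with hidden state $h_t = \tanh(W_x x_t + W_h h_{t-1} + b)$, backpropagating a loss $\mathcal{L}$ through $T$ time steps chains the Jacobians of successive hidden states. A standard sketch of the argument (the notation here is illustrative):

$$
\frac{\partial \mathcal{L}}{\partial h_k}
= \frac{\partial \mathcal{L}}{\partial h_T}\prod_{t=k+1}^{T}\frac{\partial h_t}{\partial h_{t-1}}
= \frac{\partial \mathcal{L}}{\partial h_T}\prod_{t=k+1}^{T}\operatorname{diag}\!\left(1-h_t^{2}\right)W_h .
$$

Since the derivative of $\tanh$ is at most 1, this product shrinks geometrically when the recurrent weights $W_h$ keep its norm below 1 (vanishing gradients) and blows up geometrically when it stays above 1 (exploding gradients), so distant time steps contribute either almost nothing or far too much to the weight update.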



An Introduction to the Theory Behind the LSTM
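For reference, the standard LSTM cell replaces that single recurrence with a gated cell state (here $\sigma$ is the logistic sigmoid, $\odot$ the element-wise product, and $[h_{t-1}, x_t]$ the concatenation of the previous hidden state with the current input):

$$
\begin{aligned}
f_t &= \sigma\!\left(W_f\,[h_{t-1}, x_t] + b_f\right), &
i_t &= \sigma\!\left(W_i\,[h_{t-1}, x_t] + b_i\right), &
o_t &= \sigma\!\left(W_o\,[h_{t-1}, x_t] + b_o\right),\\
\tilde{c}_t &= \tanh\!\left(W_c\,[h_{t-1}, x_t] + b_c\right), &
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, &
h_t &= o_t \odot \tanh(c_t).
\end{aligned}
$$

Because the cell state $c_t$ is updated additively, scaled by the forget gate rather than repeatedly multiplied by a weight matrix, gradients can flow along it over long spans, which is what lets the LSTM sidestep the vanishing-gradient problem sketched above.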







PS: Beginners may find this many symbols daunting, but the logic builds from simple to complex, and a thorough understanding of the RNN makes the deeper models that follow much easier to grasp. Many details are omitted here; the outline above is the general model framework, and it is entirely sufficient for understanding how the model works. For how these ideas were arrived at, and for the more detailed derivations, the author's treatment here is limited, so please refer to the relevant RNN papers, and feel free to discuss and learn together!

What to Do When the Predictions Show a "Right Shift"
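The "right shift" (predictions lagging one step behind the truth) often appears when the network learns to simply copy the most recent observation of a trending, level-valued series. A common remedy, and the one used in the final code below, is to model first-order differences and then restore the level by a cumulative sum. A minimal sketch of that round trip (the variable names here are illustrative only):

import numpy as np
import pandas as pd

series = pd.Series([2800, 2811, 2832, 2850, 2880])  # a short slice of the sales series, for illustration
diffed = series.diff(1).dropna()                    # model the step-to-step changes instead of the levels
scale = diffed.max()
normalized = diffed / scale                         # scaled differences are what the network is trained on

predicted_diffs = np.array([0.3, 0.5])              # pretend these are the model's next two scaled differences
restored = series.iloc[-1] + (predicted_diffs * scale).cumsum()  # undo the scaling, then undo the differencing
print(restored)                                     # forecasts back on the original level scale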




Improving the Model Output





The Full Code
from keras.callbacks import LearningRateScheduler
from sklearn.metrics import mean_squared_error
from keras.models import Sequential
import matplotlib.pyplot as plt
from keras.layers import Dense
from keras.layers import LSTM
from keras import optimizers
import keras.backend as K
import tensorflow as tf
import pandas as pd
import numpy as np

plt.rcParams['font.sans-serif'] = ['SimHei']  # SimHei avoids garbled Chinese characters in plot text
plt.rcParams['axes.unicode_minus'] = False    # render minus signs on the axes correctly
### Initialise parameters
my_seed = 369                # an arbitrary random seed
tf.random.set_seed(my_seed)  # fixing the seed at the TensorFlow level makes runs reproducible
sell_data = np.array([2800,2811,2832,2850,2880,2910,2960,3023,3039,3056,3138,3150,3198,3100,3029,2950,2989,3012,3050,3142,3252,3342,3365,3385,3340,3410,3443,3428,3554,3615,3646,3614,3574,3635,3738,3764,3788,3820,3840,3875,3900,3942,4000,4021,4055])
num_steps = 3   # window length of each input sequence
test_len = 10   # number of observations held out as the test set
S_sell_data = pd.Series(sell_data).diff(1).dropna()  # first-order differencing
revisedata = S_sell_data.max()
sell_datanormalization = S_sell_data / revisedata    # scale the differenced series
## Reshape the data -- this step matters!
def data_format(data, num_steps=3, test_len=5):
    # build sliding windows of length num_steps; the value right after each window is its label
    X = np.array([data[i: i + num_steps]
                  for i in range(len(data) - num_steps)])
    y = np.array([data[i + num_steps]
                  for i in range(len(data) - num_steps)])
    # the last test_len samples form the test set; everything before them is the training set
    train_size = test_len
    train_X, test_X = X[:-train_size], X[-train_size:]
    train_y, test_y = y[:-train_size], y[-train_size:]
    return train_X, train_y, test_X, test_y

transformer_selldata = np.reshape(pd.Series(sell_datanormalization).values, (-1, 1))
train_X, train_y, test_X, test_y = data_format(transformer_selldata, num_steps, test_len)
print('\033[1;38mOriginal series shape: %s; training X shape: %s, training y shape: %s; test X shape: %s, test y shape: %s\033[0m'
      % (transformer_selldata.shape, train_X.shape, train_y.shape, test_X.shape, test_y.shape))
def buildmylstm(initactivation='relu', ininlr=0.001):
    nb_lstm_outputs1 = 128              # units in the first LSTM layer
    nb_lstm_outputs2 = 128              # units in the second LSTM layer
    nb_time_steps = train_X.shape[1]    # length of each input window
    nb_input_vector = train_X.shape[2]  # number of features per time step
    model = Sequential()
    model.add(LSTM(units=nb_lstm_outputs1, input_shape=(nb_time_steps, nb_input_vector), return_sequences=True))
    model.add(LSTM(units=nb_lstm_outputs2))
    model.add(Dense(64, activation=initactivation))
    model.add(Dense(32, activation='relu'))
    model.add(Dense(test_y.shape[1], activation='tanh'))
    lr = ininlr
    adam = optimizers.Adam(learning_rate=lr)
    def scheduler(epoch):  # learning-rate schedule
        # every 100 epochs, shrink the learning rate to one tenth of its current value
        if epoch % 100 == 0 and epoch != 0:
            lr = K.get_value(model.optimizer.lr)
            K.set_value(model.optimizer.lr, lr * 0.1)
            print('lr changed to {}'.format(lr * 0.1))
        return K.get_value(model.optimizer.lr)
    # for regression a distance-based loss such as MSE is the usual choice; cross-entropy is for classification
    model.compile(loss='mse', optimizer=adam, metrics=['mse'])
    reduce_lr = LearningRateScheduler(scheduler)
    # the data set is small, so every sample takes part in each epoch; epochs scale with batch_size
    # callbacks: pass reduce_lr so the schedule above is applied
    # verbose=0: suppress the per-epoch training log
    batchsize = int(len(sell_data) / 5)
    epochs = max(128, batchsize * 4)  # train for at least 128 epochs
    model.fit(train_X, train_y, batch_size=batchsize, epochs=epochs, verbose=0, callbacks=[reduce_lr])
    return model
def prediction(lstmmodel):
    # in-sample (training set) predictions
    predsinner = lstmmodel.predict(train_X)
    predsinner_true = predsinner * revisedata     # undo the scaling
    init_value1 = sell_data[num_steps - 1]        # because of the window length, predictions start at index num_steps
    predsinner_true = predsinner_true.cumsum()    # undo the differencing by cumulative summation
    predsinner_true = init_value1 + predsinner_true
    # out-of-sample (test set) predictions
    predsouter = lstmmodel.predict(test_X)
    predsouter_true = predsouter * revisedata
    init_value2 = predsinner_true[-1]
    predsouter_true = predsouter_true.cumsum()    # undo the differencing
    predsouter_true = init_value2 + predsouter_true
    # plot the original series against both sets of predictions
    plt.plot(sell_data, label='original series')
    Xinner = [i for i in range(num_steps + 1, len(sell_data) - test_len)]
    plt.plot(Xinner, list(predsinner_true), label='in-sample predictions')
    Xouter = [i for i in range(len(sell_data) - test_len - 1, len(sell_data))]
    plt.plot(Xouter, [init_value2] + list(predsouter_true), label='out-of-sample predictions')
    allpredata = list(predsinner_true) + list(predsouter_true)
    plt.legend()
    plt.show()
    return allpredata
mymlstmmodel = buildmylstm()
presult = prediction(mymlstmmodel)

def evaluate_model(allpredata):
    # MSE of all predicted points (in-sample plus out-of-sample) against the original series
    allmse = mean_squared_error(sell_data[num_steps + 1:], allpredata)
    print('ALLMSE:', allmse)

evaluate_model(presult)
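For the comparison with a classical time-series model, a minimal ARIMA baseline on the same train/test split might look like the sketch below (this assumes statsmodels is installed; the order (1, 1, 1) is an illustrative choice rather than a tuned one):

from statsmodels.tsa.arima.model import ARIMA
from sklearn.metrics import mean_squared_error

train, test = sell_data[:-test_len], sell_data[-test_len:]
arima_fit = ARIMA(train, order=(1, 1, 1)).fit()   # d=1 mirrors the first-order differencing used above
arima_preds = arima_fit.forecast(steps=test_len)  # 10-step-ahead out-of-sample forecast
print('ARIMA out-of-sample MSE:', mean_squared_error(test, arima_preds))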

Summary

