Stochastic Gradient Descent

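# Build a synthetic dataset
import numpy as np
import matplotlib.pyplot as plt

m = 100000  # number of samples
x = np.random.normal(size=m)
# Compute y while x is still 1-D; adding an (m, 1) array to an (m,) array
# would broadcast to an (m, m) matrix
y = 4. * x + 3. + np.random.normal(0, 3, size=m)
x = x.reshape(-1, 1)

def j(theta, x_b, y):
    # Mean squared error; return inf if the computation fails
    try:
        return np.sum((y - x_b.dot(theta)) ** 2) / len(x_b)
    except Exception:
        return float('inf')

def dj_sgd(theta, x_b_i, y_i):
    # Gradient estimated from a single sample (x_b_i, y_i)
    return x_b_i.T.dot(x_b_i.dot(theta) - y_i) * 2.

def sgd(x_b, y, initial_theta, n_iters):
    t0 = 5
    t1 = 50

    def learning_rate(t):
        # Step size that decays as the iteration count grows
        return t0 / (t + t1)

    theta = initial_theta
    for cur_iter in range(n_iters):
        rand_i = np.random.randint(len(x_b))
        gradient = dj_sgd(theta, x_b[rand_i], y[rand_i])
        theta = theta - learning_rate(cur_iter) * gradient
    return theta

# Run in its own Jupyter cell: %%time must be the first line of a cell
%%time
x_b = np.hstack([np.ones((len(x), 1)), x])
initial_theta = np.zeros(x_b.shape[1])
theta = sgd(x_b, y, initial_theta, n_iters=len(x_b)//3)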

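Result:

Stochastic gradient descent reaches roughly the same accuracy as batch gradient descent, but in far less time. In fact, with n_iters=len(x_b)//3 the run performs only m/3 single-sample updates, so it touches at most one third of the samples, which is less than one full pass over the data.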
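The comparison can be checked with a small side-by-side run. The following is a minimal sketch, not the article's original benchmark: the batch_gd function, its eta and n_iters values, the fixed random seed, and the smaller sample count m = 10000 are all assumptions chosen to keep the demo quick. The sgd function follows the same update rule and the same t0 = 5, t1 = 50 schedule as the code above.

```python
import time
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (assumption) for repeatability
m = 10000  # smaller than the article's 100000 to keep the demo quick
x = rng.normal(size=m)
y = 4.0 * x + 3.0 + rng.normal(0, 3, size=m)
x_b = np.column_stack([np.ones(m), x])  # prepend the intercept column

def batch_gd(x_b, y, theta, eta=0.01, n_iters=2000):
    # Every step uses the MSE gradient averaged over all m samples
    for _ in range(n_iters):
        gradient = x_b.T.dot(x_b.dot(theta) - y) * 2.0 / len(x_b)
        theta = theta - eta * gradient
    return theta

def sgd(x_b, y, theta, n_iters, t0=5, t1=50):
    # Every step uses the gradient at one randomly chosen sample
    for t in range(n_iters):
        i = rng.integers(len(x_b))
        gradient = x_b[i] * (x_b[i].dot(theta) - y[i]) * 2.0
        theta = theta - t0 / (t + t1) * gradient
    return theta

start = time.perf_counter()
theta_batch = batch_gd(x_b, y, np.zeros(2))
batch_seconds = time.perf_counter() - start

start = time.perf_counter()
theta_sgd = sgd(x_b, y, np.zeros(2), n_iters=m // 3)
sgd_seconds = time.perf_counter() - start

print("batch:", theta_batch, "seconds:", batch_seconds)
print("sgd:  ", theta_sgd, "seconds:", sgd_seconds)
```

Both estimates should land near the true parameters (intercept 3, slope 4), with the SGD estimate somewhat noisier. Wall-clock timings depend on the machine, and on data this small the vectorized batch loop may not be slower; the more reliable comparison is the amount of work, since batch_gd here evaluates 2000 × m sample gradients while sgd evaluates only m/3. The step size t0 / (t + t1) starts at 0.1 and shrinks toward zero, which lets theta settle instead of bouncing around the minimum.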

Using Stochastic Gradient Descent in sklearn

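from sklearn.linear_model import SGDRegressor

# x_train_standard / x_test_standard are assumed to be features that were
# already standardized (for example with StandardScaler)
sgd_reg = SGDRegressor()
%time sgd_reg.fit(x_train_standard, y_train)
sgd_reg.score(x_test_standard, y_test)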
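The snippet above uses x_train_standard and x_test_standard without showing where they come from. The following is one possible self-contained setup, assuming a synthetic dataset like the one earlier in this post, a default train_test_split, and StandardScaler for the standardization; these choices and the fixed seeds are illustrative assumptions, not the article's original data pipeline.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)  # fixed seed (assumption) for repeatability
m = 10000
x = rng.normal(size=(m, 1))
y = 4.0 * x[:, 0] + 3.0 + rng.normal(0, 3, size=m)

x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=0)

# Fit the scaler on the training split only, then apply it to both splits
scaler = StandardScaler().fit(x_train)
x_train_standard = scaler.transform(x_train)
x_test_standard = scaler.transform(x_test)

sgd_reg = SGDRegressor(random_state=0)
sgd_reg.fit(x_train_standard, y_train)
r2 = sgd_reg.score(x_test_standard, y_test)  # R^2 on the held-out split
print(r2)
```

Standardizing matters here because SGD uses one step size for every feature, so features on very different scales make the updates unstable or slow. With this synthetic data the noise variance (9) is large relative to the signal variance (16), so the R^2 score cannot get much above roughly 0.64 no matter how well the model is fit.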
