$$\theta_j = \theta_j + \alpha \frac{\partial \ell(\theta)}{\partial \theta_j}$$
$$\frac{\partial \ell(\theta)}{\partial \theta_j} = \sum_{i=1}^{m} \left( y^{(i)} - h_\theta(x^{(i)}) \right) x_j^{(i)}$$
So the iterative weight-update rule is: $$\theta_j = \theta_j + \alpha \sum_{i=1}^{m} \left( y^{(i)} - h_\theta(x^{(i)}) \right) x_j^{(i)}$$ Batch gradient ascent computes over all the samples on every single iteration update, so the resulting model's accuracy is relatively high, but the computational complexity is also high and the algorithm is time-consuming. The computation proceeds as follows (the objective $\ell(\theta)$ being maximized is written out after the steps):
1. First compute the estimated values from the weights and the training samples; 2. compute the error; 3. update the weights iteratively.
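For reference, the log-likelihood whose partial derivatives appear above is the standard logistic-regression objective (not written out in this excerpt):

$$\ell(\theta) = \sum_{i=1}^{m} \left[\, y^{(i)} \ln h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right) \ln\!\left(1 - h_\theta(x^{(i)})\right) \right], \qquad h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}$$

Differentiating with respect to $\theta_j$ and using $\sigma'(z) = \sigma(z)\left(1 - \sigma(z)\right)$ gives exactly the gradient $\sum_{i=1}^{m} ( y^{(i)} - h_\theta(x^{(i)}) )\, x_j^{(i)}$ used in the update rule.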
Stochastic gradient ascent instead iterates over the samples one at a time: every sample processed triggers one weight update. The process is as follows (the three steps below are repeated m times):
1. Compute the estimate for sample $x^{(i)}$: $$h = \begin{pmatrix} x_0^{(i)} & x_1^{(i)} & x_2^{(i)} \end{pmatrix} \begin{pmatrix} w_0 \\ w_1 \\ w_2 \end{pmatrix}$$
2. Compute the error. Note that the error here is a single number, no longer a vector: $$\mathrm{error} = y^{(i)} - h$$
3. Update the weights: $$w = w + \alpha \begin{pmatrix} x_0^{(i)} \\ x_1^{(i)} \\ x_2^{(i)} \end{pmatrix} \mathrm{error}$$
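The three implementations below rely on NumPy, the standard time module, and a sigmoid helper, none of which appear in this excerpt; a minimal version of the assumed setup:

import time
import numpy as np

def sigmoid(z):
    # logistic function: squashes any real-valued input into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))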
def stocgradascent(datamat, labelmat, alpha=0.01):  # stochastic gradient ascent
    start_time = time.time()  # record the program start time
    m, n = datamat.shape
    weights = np.ones((n, 1))  # initialize every weight to 1
    for i in range(m):
        # the dot product can come back with dtype object here, so cast it
        # to a numeric dtype before applying the sigmoid
        h = sigmoid(np.dot(datamat[i], weights).astype('float64'))
        error = labelmat[i] - h  # scalar error for this one sample
        weights = weights + alpha * datamat[i].reshape((n, 1)) * error  # update the weights
    duration = time.time() - start_time
    print('time:', duration)
    return weights
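A quick smoke test on synthetic data, assuming the three-column layout (a constant $x_0 = 1.0$ plus two feature columns) that the reshape above implies; the dataset and names here are hypothetical:

rng = np.random.default_rng(0)
features = rng.normal(size=(100, 2))
datamat = np.hstack([np.ones((100, 1)), features])  # prepend the bias column x0 = 1.0
labelmat = (features[:, 0] + features[:, 1] > 0).astype('float64')  # linearly separable labels
w = stocgradascent(datamat, labelmat)  # returns one weight per column of datamat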
def gradascent(datamat, labelmat, alpha=0.01, maxstep=1000):  # batch gradient ascent
    start_time = time.time()
    m, n = datamat.shape
    weights = np.ones((n, 1))
    labelmat = labelmat.reshape((m, 1))  # labelmat is 1-D; reshape it into a column vector
    for i in range(maxstep):
        # compute every estimate at once as a single matrix operation
        h = sigmoid(np.dot(datamat, weights).astype('float64'))
        error = labelmat - h  # error for the whole batch
        weights = weights + alpha * np.dot(datamat.T, error)  # update the weights
    duration = time.time() - start_time
    print('time:', duration)
    return weights
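To classify a new point with the learned weights, threshold the sigmoid output at 0.5; a minimal helper (hypothetical, not part of the original code):

def classify(x, weights):
    # x is laid out as (1.0, x1, x2) to match the columns of datamat
    prob = sigmoid(np.dot(x, weights)).item()  # scalar probability of class 1
    return 1 if prob > 0.5 else 0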
def betterstogradascent(datamat, labelmat, alpha=0.01, maxstep=150):  # improved stochastic gradient ascent
    start_time = time.time()
    m, n = datamat.shape
    weights = np.ones((n, 1))
    for j in range(maxstep):
        for i in range(m):
            alpha = 4 / (1 + i + j) + 0.01  # let the learning rate shrink as the iterations proceed
            h = sigmoid(np.dot(datamat[i], weights).astype('float64'))
            error = labelmat[i] - h
            weights = weights + alpha * datamat[i].reshape((n, 1)) * error
    duration = time.time() - start_time
    print('time:', duration)
    return weights
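The learning-rate schedule is what makes this version better behaved: alpha = 4/(1 + i + j) + 0.01 shrinks the step size as both the outer pass index j and the sample index i grow, which damps the oscillation of plain stochastic updates, while the 0.01 constant keeps the rate from ever reaching zero so that later samples still move the weights. Each of the maxstep outer passes performs m single-sample updates, for m * maxstep updates in total.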