Machine Learning: Logistic Regression

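Logic: the word "logic" comes from the classical Greek logos, which originally meant "word" or "speech" and by extension "thought" or "reasoning". In 1902, the educator Yan Fu rendered it into Chinese semantically as 名學 (mingxue) and phonetically as 邏輯 (luoji).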

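Regression: regression is an important concept in statistics. Its basic meaning is to predict an accurate output value from previously observed data.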

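Logistic regression is one of the most widely used learning algorithms and is used to solve classification problems. Like linear regression, it is a supervised learning algorithm.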

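In a binary classification problem, y takes the value 0 or 1. If we used linear regression instead, the model's output could be far greater than 1 or far less than 0, which makes the cost function very large.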
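The problem with linear regression on 0/1 labels can be seen in a small sketch. The toy data and the helper `linear_predict` below are hypothetical and only illustrate the point; they are not part of the original article.

```python
import numpy as np

# Hypothetical toy data: one feature, binary labels (not from the article).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

# Ordinary least-squares line: y is approximated by slope * x + intercept.
slope, intercept = np.polyfit(x, y, deg=1)

def linear_predict(x_new):
    """Raw output of the fitted line; nothing constrains it to [0, 1]."""
    return slope * np.asarray(x_new, dtype=float) + intercept

# Inputs far from the training range give values that cannot be probabilities.
predictions = linear_predict([-10.0, 20.0])
print(predictions)
```

The first prediction falls below 0 and the second rises above 1, so neither can be read as a class probability.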

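A logic function (logical function) is a tool for describing the behavior of digital circuits. Its output is a high or low level, which can be represented by the binary constants 0 and 1. Logistic regression similarly maps the linear output into the interval (0, 1), using the sigmoid function: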

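$$h_\theta(x) = \frac{1}{1 + e^{-x\theta}}$$

To simplify later calculations with the sigmoid function, we also need the derivative of this model. Writing $g(z) = \frac{1}{1 + e^{-z}}$ with $z = x\theta$, the derivative is $g'(z) = g(z)\,(1 - g(z))$.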
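The derivative of the sigmoid function can be worked out step by step. The following is the standard derivation; the notation $g(z)$ with $z = x\theta$ is an assumption made here for readability.

```latex
\begin{aligned}
g(z)  &= \frac{1}{1 + e^{-z}} = \left(1 + e^{-z}\right)^{-1} \\
g'(z) &= -\left(1 + e^{-z}\right)^{-2} \cdot \left(-e^{-z}\right)
       = \frac{e^{-z}}{\left(1 + e^{-z}\right)^{2}} \\
      &= \frac{1}{1 + e^{-z}} \cdot \frac{e^{-z}}{1 + e^{-z}} \\
      &= g(z)\,\bigl(1 - g(z)\bigr)
\end{aligned}
```

The last step uses $1 - g(z) = \frac{e^{-z}}{1 + e^{-z}}$.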

The cost function used here differs from the one we used before: we need the cross-entropy cost function.

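$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\ln\left(h_\theta(x^{(i)})\right) + \left(1 - y^{(i)}\right)\ln\left(1 - h_\theta(x^{(i)})\right)\right]$$

Because y is always either 0 or 1, one of the two terms vanishes for every sample and only the other is evaluated: when y = 1 the cost is $-\ln(h_\theta(x))$, and when y = 0 it is $-\ln(1 - h_\theta(x))$.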
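A minimal sketch of this cost function shows how one term drops out for each label. The function name `cross_entropy` and the sample values are illustrative assumptions, not taken from the article.

```python
import numpy as np

def cross_entropy(h, y):
    """Mean cross-entropy cost for predicted probabilities h and 0/1 labels y."""
    h = np.asarray(h, dtype=float)
    y = np.asarray(y, dtype=float)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

# Single-sample checks: with y = 1 only ln(h) contributes,
# with y = 0 only ln(1 - h) contributes.
cost_pos = cross_entropy([0.9], [1])  # equals -ln(0.9)
cost_neg = cross_entropy([0.9], [0])  # equals -ln(0.1)
print(cost_pos, cost_neg)
```

A confident prediction of 0.9 is cheap when the label is 1 and expensive when the label is 0. Note that this sketch assumes h lies strictly between 0 and 1; a value of exactly 0 or 1 would make the logarithm undefined.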

The cost function J(θ) is convex, so it has no local optima other than the global minimum.

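Because the cost function is convex, gradient descent with a suitable learning rate eventually reaches the minimum of this convex function no matter where the parameters are initialized.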

Once we have the cost function, we can use gradient descent to find the parameters that minimize it.
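The gradient of the unregularized cross-entropy cost is $\frac{1}{m}X^{T}(h - y)$, so each update is $\theta \leftarrow \theta - \alpha \cdot \frac{1}{m}X^{T}(h - y)$. The sketch below runs this update from two different starting points on hypothetical toy data to illustrate the convexity argument; the names `sigmoid` and `gradient_descent`, the data, and the step size are assumptions chosen for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(x, y, theta0, alpha=0.5, iter_num=5000):
    """Batch gradient descent on the unregularized cross-entropy cost."""
    m = len(y)
    theta = np.array(theta0, dtype=float)
    for _ in range(iter_num):
        h = sigmoid(x @ theta)
        theta -= alpha * (x.T @ (h - y)) / m
    return theta

# Hypothetical toy data: a bias column plus one feature. The classes overlap,
# so the cost has a finite minimizer.
x = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 0.0],
              [1.0, 1.0], [1.0, 2.0], [1.0, 0.5]])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])

# Two different initializations should end at the same parameters.
theta_a = gradient_descent(x, y, theta0=[0.0, 0.0])
theta_b = gradient_descent(x, y, theta0=[3.0, -3.0])
print(theta_a, theta_b)
```

Both runs print essentially the same parameter vector, which is what convexity predicts for this small problem.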

import numpy as np
import matplotlib.pyplot as plt

# Font settings; the CJK font is only needed when labels contain Chinese text
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

# Load the data
data_train = np.loadtxt(r'c:\users\shy\pycharmprojects\untitled\week4\mushroomtrain.txt', delimiter=',')
data_test = np.loadtxt(r'c:\users\shy\pycharmprojects\untitled\week4\mushroomtest.txt', delimiter=',')

# Split into features (all columns but the last) and labels (last column)
train_x, train_y = data_train[:, :-1], data_train[:, -1]
test_x, test_y = data_test[:, :-1], data_test[:, -1]


# Data preprocessing
def propeross(x, y):
    # Feature scaling: standardize each column (modifies x in place)
    x -= np.mean(x, axis=0)
    x /= np.std(x, axis=0, ddof=1)
    # Prepend a column of ones for the bias term; make y a column vector
    x = np.c_[np.ones(len(x)), x]
    y = np.c_[y]
    return x, y


# Note: each set is standardized with its own mean and standard deviation
train_x, train_y = propeross(train_x, train_y)
test_x, test_y = propeross(test_x, test_y)


# Activation function (sigmoid)
def g(z):
    h = 1.0 / (1 + np.exp(-z))
    return h


# Model: sigmoid applied to the linear combination x . theta
def model(x, theta):
    z = np.dot(x, theta)
    h = g(z)
    return h


# Cost: cross-entropy plus the regularization term r
def costfunc(h, y, r):
    m = len(h)
    j = -1.0 / m * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h)) + r
    return j


# Gradient descent with L2 regularization (the bias term is not regularized)
def draedesc(x, y, alpha=0.01, iter_num=100, lamba=60):
    m, n = x.shape
    theta = np.zeros((n, 1))
    j_history = np.zeros(iter_num)
    for i in range(iter_num):
        h = model(x, theta)
        theta_r = theta.copy()
        theta_r[0] = 0
        r = lamba / (2 * m) * np.sum(np.square(theta_r))
        j_history[i] = costfunc(h, y, r)
        deltatheta = 1.0 / m * (np.dot(x.T, h - y) + lamba * theta_r)
        theta -= alpha * deltatheta
    return j_history, theta


j_history, theta = draedesc(train_x, train_y)


# Accuracy: fraction of samples whose thresholded prediction matches the label
def score(h, y):
    count = 0
    for i in range(len(h)):
        if np.where(h[i] >= 0.5, 1, 0) == y[i]:
            count += 1
    return count / len(h)


# Get the predicted values
train_h = model(train_x, theta)
test_h = model(test_x, theta)
print('Training set accuracy:', score(train_h, train_y))
print('Test set accuracy:', score(test_h, test_y))


# Plot columns 2 and 3 of x, one scatter series per class
def draw(x, y, title):
    plt.title(title)
    plt.scatter(x[y[:, 0] == 0, 2], x[y[:, 0] == 0, 3], label='Negative class')
    plt.scatter(x[y[:, 0] == 1, 2], x[y[:, 0] == 1, 3], label='Positive class')
    plt.legend()
    plt.show()


draw(train_x, train_y, 'Training set')