Machine Learning: Decision Trees

Decision Trees
0. Import the required modules
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
import seaborn as sns
from sklearn.metrics import accuracy_score
# Render plots inline in Jupyter
%matplotlib inline
1. Load the data
iris = load_iris()
1.1 Data preview
print('Feature names:', iris.feature_names)
Feature names: ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
print('Classes:', iris.target_names)
Classes: ['setosa' 'versicolor' 'virginica']
1.2 Data processing
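x = iris.data
y = iris.target
# Hold out one quarter of the samples for testing; random_state fixes the shuffle
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=1/4, random_state=0)
print('Total samples: {}, training samples: {}, test samples: {}'.format(
    len(x), len(x_train), len(x_test)))
Total samples: 150, training samples: 112, test samples: 38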
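A random split does not guarantee that the three classes are equally represented in the test set. The sketch below counts the test labels per class for the split used above, and compares them with a stratified split. The `stratify=y` argument is an addition for illustration and is not used in the original tutorial; the variable names `plain_counts` and `strat_counts` are likewise hypothetical.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
x, y = iris.data, iris.target

# Same split as in the tutorial: purely random, fixed seed
_, _, y_train, y_test = train_test_split(
    x, y, test_size=1/4, random_state=0)
plain_counts = np.bincount(y_test, minlength=3)

# Illustrative variant (not in the original): keep class proportions equal
_, _, y_train_s, y_test_s = train_test_split(
    x, y, test_size=1/4, random_state=0, stratify=y)
strat_counts = np.bincount(y_test_s, minlength=3)

print('Test samples per class (random split):    ', plain_counts)
print('Test samples per class (stratified split):', strat_counts)
```

On a balanced dataset as small as iris, the difference is usually minor, but stratifying can make accuracy figures a little less dependent on the particular shuffle.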
2. Build the model
dt_model = DecisionTreeClassifier(max_depth=3)
3. Train the model
dt_model.fit(x_train, y_train)
DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=3,
                       max_features=None, max_leaf_nodes=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, presort=False,
                       random_state=None, splitter='best')
4. Test the model
y_pred = dt_model.predict(x_test)
acc = accuracy_score(y_test, y_pred)
print('Accuracy:', acc)
Accuracy: 0.9736842105263158
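Accuracy is just the fraction of test samples whose predicted label matches the true label. A minimal sketch of that calculation follows, together with a confusion matrix that shows which classes get mixed up. The `random_state=0` passed to the classifier and the names `manual_acc` and `cm` are assumptions added here for reproducibility and illustration; the original code does not set a seed on the model.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=1/4, random_state=0)

# random_state on the model is an assumption, added so reruns give the same tree
dt_model = DecisionTreeClassifier(max_depth=3, random_state=0)
dt_model.fit(x_train, y_train)
y_pred = dt_model.predict(x_test)

# Accuracy computed by hand: share of predictions equal to the true labels
manual_acc = (y_pred == y_test).mean()

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)

print('Manual accuracy:', manual_acc)
print('accuracy_score: ', accuracy_score(y_test, y_pred))
print(cm)
```

The diagonal of the confusion matrix counts correct predictions, so its sum divided by the number of test samples equals the accuracy.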
5. Examine the effect of a hyperparameter
max_depth_values = [2, 3, 4]
for max_depth_val in max_depth_values:
    dt_model = DecisionTreeClassifier(max_depth=max_depth_val)
    dt_model.fit(x_train, y_train)
    print('max_depth=', max_depth_val)
    # The format strings need a placeholder, otherwise the score is never printed
    print('Training set accuracy: {:.3f}'.format(dt_model.score(x_train, y_train)))
    print('Test set accuracy: {:.3f}'.format(dt_model.score(x_test, y_test)))
    print()
max_depth= 2
Training set accuracy: 0.964
Test set accuracy: 0.895

max_depth= 3
Training set accuracy: 0.982
Test set accuracy: 0.974

max_depth= 4
Training set accuracy: 1.000
Test set accuracy: 0.974
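With only 38 test samples, a single misclassified flower moves the test accuracy by about 0.026, so comparing depths on one split is noisy. One possible refinement, sketched below, is to compare the same `max_depth` values with 5-fold cross-validation. This is not part of the original tutorial; `cv=5`, `random_state=0`, and the name `cv_means` are choices made here for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()

cv_means = {}
for max_depth_val in [2, 3, 4]:
    # random_state is an assumption, added so reruns build the same trees
    model = DecisionTreeClassifier(max_depth=max_depth_val, random_state=0)
    # Each fold trains on 4/5 of the data and scores on the remaining 1/5
    scores = cross_val_score(model, iris.data, iris.target, cv=5)
    cv_means[max_depth_val] = scores.mean()
    print('max_depth={}: mean CV accuracy {:.3f} (std {:.3f})'.format(
        max_depth_val, scores.mean(), scores.std()))
```

Averaging over folds gives a steadier estimate than a single held-out set, at the cost of training each model five times.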
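6. Visualizing the decision tree
Required installs:
The Graphviz program (provided in the ** directory). Add the bin directory of the installation to the PATH environment variable, then restart Jupyter or the system for the change to take effect. For example, add C:\Program Files (x86)\Graphviz2.38\bin to the system PATH.
The graphviz Python module: pip install graphviz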
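If installing Graphviz is inconvenient, the learned rules can also be inspected as plain text. The sketch below uses `export_text` from `sklearn.tree`, which is available in scikit-learn 0.21 and later; it is an alternative offered here, not the method used in this tutorial, and `random_state=0` is an added assumption.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=1/4, random_state=0)

# random_state on the model is an assumption, added for reproducibility
dt_model = DecisionTreeClassifier(max_depth=4, random_state=0)
dt_model.fit(x_train, y_train)

# Render the tree as indented if/else style rules, one line per node
rules = export_text(dt_model, feature_names=list(iris.feature_names))
print(rules)
```

Each indented line is a threshold test on one feature, and each leaf line reports the predicted class index, which makes the "set of if-then rules" view of a decision tree easy to read.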
# ml_visualization is not part of scikit-learn; it is assumed to be a local helper module
from ml_visualization import plot_decision_tree
dt_model = DecisionTreeClassifier(max_depth=4)
dt_model.fit(x_train, y_train)
plot_decision_tree(dt_model, iris.feature_names, iris.target_names)
print(iris.feature_names)
print(dt_model.feature_importances_)
from ml_visualization import plot_feature_importances
plot_feature_importances(dt_model, iris.feature_names)
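The `feature_importances_` array lists one score per feature in the same order as `iris.feature_names`, and the scores sum to 1. As a rough stand-in for the plot, the sketch below pairs each score with its feature name and prints them from most to least important. It does not reproduce `plot_feature_importances`, whose implementation is not shown in this tutorial; the text bar, `random_state=0`, and the name `ranked` are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=1/4, random_state=0)

# random_state on the model is an assumption, added for reproducibility
dt_model = DecisionTreeClassifier(max_depth=4, random_state=0)
dt_model.fit(x_train, y_train)

importances = dt_model.feature_importances_
# Indices of features sorted from highest to lowest importance
order = np.argsort(importances)[::-1]
ranked = [(iris.feature_names[i], float(importances[i])) for i in order]

for name, score in ranked:
    bar = '#' * int(round(score * 40))  # crude text bar, 40 characters = 1.0
    print('{:<20} {:.3f} {}'.format(name, score, bar))
```

A score of 0 means the feature was never used for a split in this particular tree, not that it carries no information.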
