Implementing a Decision Tree Classifier in Python

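This article shows how to implement a decision tree classifier in Python. It covers two parts: a brief look at how a decision tree works, and a code walkthrough that trains and visualizes one with sklearn.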

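Simply put, a decision tree algorithm organizes the data into a series of decision nodes arranged as a tree. Each decision node poses a question, and the answer to that question splits the data into two or more child nodes. The tree keeps growing downward until all the data in a node belongs to a single class. The criterion for choosing the best split is information gain. The figure below shows a simple decision tree: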

Training a machine learning model with a decision tree classifier amounts to finding the tree's decision boundaries.
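To make the information gain criterion mentioned above more concrete, here is a minimal sketch in plain Python. The helper names `entropy` and `information_gain` and the toy labels are assumptions made for illustration; they are not part of sklearn. Note also that the example code in this article passes criterion='gini', which measures impurity with the Gini index rather than with entropy.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical toy labels: 4 samples of class "a" and 4 of class "b".
parent = ["a"] * 4 + ["b"] * 4

# A split that separates the two classes perfectly.
perfect_split = [["a"] * 4, ["b"] * 4]
# A split that leaves both children as mixed as the parent.
useless_split = [["a", "a", "b", "b"], ["a", "a", "b", "b"]]

gain_perfect = information_gain(parent, perfect_split)
gain_useless = information_gain(parent, useless_split)
print(gain_perfect, gain_useless)
```

A tree builder would evaluate many candidate questions at each node and keep the one with the largest gain; the perfect split above removes all uncertainty, while the useless one removes none.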

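By partitioning the feature space into rectangles, a decision tree can build fairly complex decision boundaries. Below is the decision boundary plot obtained by training a decision tree classifier on the sklearn iris data. The feature space consists of petal length and petal width; the detailed code is given later: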

Here is the example code:

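import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
# Load the iris data and keep only petal length and petal width (columns 2 and 3)
iris = datasets.load_iris()
x = iris.data[:, 2:]
y = iris.target
# Hold out 30% of the samples for testing, preserving class proportions via stratify
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=1, stratify=y)
# Train a tree of depth at most 4, using Gini impurity as the split criterion
clf_tree = DecisionTreeClassifier(criterion='gini', max_depth=4, random_state=1)
clf_tree.fit(x_train, y_train)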
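The article holds out a test set but does not show how to use it. As a hedged sketch, one way to check the trained tree is to predict the held-out samples and compute accuracy with sklearn's accuracy_score. The block repeats the data preparation so that it runs on its own; the variable names y_pred and test_accuracy are illustrative choices, not from the original.

```python
from sklearn import datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Same data preparation as in the article: petal length and petal width only.
iris = datasets.load_iris()
x = iris.data[:, 2:]
y = iris.target
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=1, stratify=y)

clf_tree = DecisionTreeClassifier(criterion='gini', max_depth=4, random_state=1)
clf_tree.fit(x_train, y_train)

# Predict the held-out samples and compare against the true labels.
y_pred = clf_tree.predict(x_test)
test_accuracy = accuracy_score(y_test, y_pred)
print(f"test accuracy: {test_accuracy:.3f}")
```

Comparing this value with clf_tree.score(x_train, y_train) gives a rough sense of whether the chosen max_depth overfits the training data.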

The following code produces the decision boundary plot shown above. It requires installing the mlxtend package:

from mlxtend.plotting import plot_decision_regions
# Stack training and test samples so that both appear in the plot
x_combined = np.vstack((x_train, x_test))
y_combined = np.hstack((y_train, y_test))
fig, ax = plt.subplots(figsize=(7, 7))
# Shade the region assigned to each class and overlay the samples
plot_decision_regions(x_combined, y_combined, clf=clf_tree)
plt.xlabel('petal length [cm]')
plt.ylabel('petal width [cm]')
plt.legend(loc='upper left')
plt.tight_layout()
plt.show()

The result can also be displayed as a tree, using the plot_tree function from sklearn's tree module. The code is as follows:

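from sklearn import tree
# Use a large figure so that the node labels stay readable
fig, ax = plt.subplots(figsize=(10, 10))
# Draw one box per node of the trained tree
tree.plot_tree(clf_tree, fontsize=10)
plt.show()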

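The figure below shows the output of the plot_tree code above. Note that we call plt.subplots(figsize=(10, 10)) to enlarge the figure; otherwise it renders very small: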
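As a text-only complement to the plot, sklearn also provides export_text, which prints the learned split rules. This is a sketch rather than part of the original walkthrough; the feature name strings passed below are labels chosen here for readability. Each rule compares a single feature against a threshold, which is why the decision regions are rectangles.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Same data preparation and model settings as in the article.
iris = datasets.load_iris()
x = iris.data[:, 2:]
y = iris.target
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=1, stratify=y)

clf_tree = DecisionTreeClassifier(criterion='gini', max_depth=4, random_state=1)
clf_tree.fit(x_train, y_train)

# Render the tree as indented if/else style rules, one line per node.
rules = export_text(
    clf_tree, feature_names=['petal length (cm)', 'petal width (cm)'])
print(rules)
```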

That wraps up this short introduction to implementing a decision tree classifier in Python.
