One-Week Algorithm Practice, Day 2: Building Ensemble Models

This post uses the same dataset as before, data_all.csv.

Running the finished program produces several warnings: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use `array.size > 0` to check that an array is not empty. if diff:

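The warning means that the truth value of an empty array is ambiguous: the check currently returns False, but in a future version it will raise an error. Use array.size > 0 to check that an array is not empty.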

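Fix: ignore the warning. It appears because NumPy has deprecated truth-value checks on empty arrays, so it can simply be ignored here. Add the following code: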

import warnings
warnings.filterwarnings("ignore")

With this in place, the warning above no longer appears.
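Note that `warnings.filterwarnings("ignore")` silences every warning for the rest of the process, not just this one. If that is broader than you want, a narrower option is to ignore only `DeprecationWarning` inside a `with warnings.catch_warnings():` block. The sketch below shows the idea using only the standard library; `noisy_truth_check` is a hypothetical stand-in for the library call that emits the warning, not something from the original script.

```python
import warnings


def noisy_truth_check():
    """Hypothetical stand-in for library code that emits the warning."""
    warnings.warn(
        "The truth value of an empty array is ambiguous.",
        DeprecationWarning,
    )
    return False


# Suppress only DeprecationWarning, and only inside this block.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=DeprecationWarning)
    result = noisy_truth_check()  # the warning is not shown here

# On leaving the block, the previous warning filters are restored.
```

In the script, the `fit` and `score` calls would take the place of `noisy_truth_check()` inside the `with` block.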

# Import packages
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier
import pandas as pd
import warnings

# Ignore warnings: NumPy has deprecated truth-value checks on empty arrays,
# which triggers the warning below; it can simply be ignored here.
# Warning details: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use `array.size > 0` to check that an array is not empty. if diff:
warnings.filterwarnings("ignore")

# Read the data
data_all = pd.read_csv('data_all.csv')
print("Data shape (rows, columns):", data_all.shape)

# Split the dataset
x = data_all.drop(['status'], axis=1)
# The 'status' column is the label
y = data_all['status']
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=2018)

# Build the models
# 1. Random forest
rfc = RandomForestClassifier()
rfc.fit(x_train, y_train)
rfc_score = rfc.score(x_test, y_test)

# 2. GBDT
gbc = GradientBoostingClassifier()
gbc.fit(x_train, y_train)
gbc_score = gbc.score(x_test, y_test)

# 3. XGBoost
xgbc = XGBClassifier()
xgbc.fit(x_train, y_train)
xgbc_score = xgbc.score(x_test, y_test)

# 4. LightGBM
lgbc = LGBMClassifier()
lgbc.fit(x_train, y_train)
lgbc_score = lgbc.score(x_test, y_test)

print("RandomForestClassifier acc: %f, GradientBoostingClassifier acc: %f" % (rfc_score, gbc_score))
print("XGBClassifier acc: %f, LGBMClassifier acc: %f" % (xgbc_score, lgbc_score))

Output:

RandomForestClassifier acc: 0.763139, GradientBoostingClassifier acc: 0.779958
XGBClassifier acc: 0.785564, LGBMClassifier acc: 0.770147
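To compare the four results at a glance, the scores can be sorted. A minimal sketch, with the accuracies copied by hand from the run above (the `scores` dict and the variable names are my own, not part of the original script):

```python
# Test-set accuracies reported by the run above.
scores = {
    "RandomForestClassifier": 0.763139,
    "GradientBoostingClassifier": 0.779958,
    "XGBClassifier": 0.785564,
    "LGBMClassifier": 0.770147,
}

# Sort from highest to lowest accuracy.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

for name, acc in ranked:
    print(f"{name:<28} {acc:.6f}")

best_name, best_acc = ranked[0]
```

On this single 70/30 split with default hyperparameters, XGBClassifier comes out on top, though the four scores differ by only a little over two percentage points.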
