Decision tree in MATLAB for binary or multi-class classification

The maketree function builds the tree recursively.

Each node is created with tree=struct('isnode',1,'a',0.0,'mark',0.0,'child',{}), and all branches are stored recursively in the child field.

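The ent(d) function computes the information entropy of d (the last column of d holds the class labels); maketree uses it to evaluate the information gain of each candidate split.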

The code adapts automatically to different class labels and to any number of classes.
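The quantities the code evaluates can be written out as follows. This is a sketch inferred from the ent and maketree code; the symbols D, K, p_k, v_j and t_j are notation introduced here and do not appear in the original post. D is the sample set, p_k is the fraction of samples in class k, and v_1 <= v_2 <= ... are the sorted values of attribute i.

```latex
\mathrm{Ent}(D) = -\sum_{k=1}^{K} p_k \log_2 p_k

t_j = \frac{v_j + v_{j+1}}{2}, \qquad
\mathrm{Gain}(D, i, t_j) = \mathrm{Ent}(D)
  - \frac{|D_{\le t_j}|}{|D|}\,\mathrm{Ent}(D_{\le t_j})
  - \frac{|D_{> t_j}|}{|D|}\,\mathrm{Ent}(D_{> t_j})
```

For each attribute still available, the threshold t_j with the largest gain is kept, and the node then splits on the attribute whose best threshold gives the largest gain overall.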

function tree=maketree(d,a)
% d: samples, one per row, class label in the last column
% a: attribute flags, 1 means the attribute has not been used for a split yet
% isnode: 1 for a branch, 0 for a leaf. For a branch, a is the split threshold
% and mark is the index of the split attribute; for a leaf, a is the class label.
tree=struct('isnode',1,'a',0.0,'mark',0.0,'child',{});
tree(1).a=1; % struct(...,'child',{}) is an empty struct array; this allocates one real element
if length(unique(d(:,end)))==1 % all samples in d belong to the same class
    tree.isnode=0; % mark tree as a leaf
    tree.a=d(1,end); % label the leaf with that class
    return
end
if sum(a)==0 || isempty(d) % no attributes left to split on
    tree.isnode=0; % mark tree as a leaf
    tree.a=mode(d(:,end)); % label the leaf with the most frequent class in d
    return
end
for i=1:length(a)
    if a(i)==1
        if length(unique(d(:,i)))==1 % an available attribute takes a single value
            tree.isnode=0; % mark tree as a leaf
            tree.a=mode(d(:,end)); % label the leaf with the most frequent class in d
            return
        end
    end
end
% -inf rather than 0, so an attribute that was already used can never be selected
gain=-inf(length(a),1); % best information gain of each attribute
best=zeros(length(a),1); % best split threshold of each attribute
n=size(d,1);
for i=1:length(a)
    if a(i)==1
        t=sort(d(:,i));
        gain1=zeros(length(t)-1,1);
        for j=1:length(t)-1 % binary split at each midpoint
            ta=(t(j)+t(j+1))/2;
            df=d(d(:,i)<=ta,:);
            dz=d(d(:,i)>ta,:);
            gain1(j)=ent(d)-(ent(df)*size(df,1)/n+ent(dz)*size(dz,1)/n);
        end
        [gain(i),j]=max(gain1);
        best(i)=(t(j)+t(j+1))/2;
    end
end
[g,m]=max(gain); % pick the attribute with the largest information gain
d1=d(d(:,m)<=best(m),:);
d2=d(d(:,m)>best(m),:);
a(m)=0;
tree.a=best(m); % create the branch
tree.mark=m;
tree.isnode=1;
tree.child(1)=maketree(d1,a);
tree.child(2)=maketree(d2,a);
end

function f=ent(d) % information entropy of the labels in the last column of d
l=unique(d(:,end));
if isempty(d)
    f=0;
    return
end
f=0;
t=zeros(length(l),1); % number of samples in each class
for i=1:size(d,1)
    for j=1:length(l)
        if d(i,end)==l(j)
            t(j)=t(j)+1;
            break;
        end
    end
end
n=size(d,1);
for i=1:length(l)
    f=f+(t(i)/n)*log2(t(i)/n);
end
f=-f;
end
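
The class counting in ent uses two nested loops. A vectorized variant is possible; the following is a minimal sketch, not part of the original post. The name labelentropy is hypothetical, and unlike ent it takes the label column directly rather than the full data matrix. Local functions inside a script need MATLAB R2016b or later; otherwise save the function in its own file.

```matlab
% Toy labels: two samples per class, so the entropy is exactly 1 bit.
labels = [0; 0; 1; 1];
h = labelentropy(labels);
disp(h)

% Hypothetical helper (an assumption, not from the original post).
function h = labelentropy(labels)
    if isempty(labels)
        h = 0;                               % same convention as ent: empty set has zero entropy
        return
    end
    [~, ~, idx] = unique(labels);            % map each label to a class index 1..K
    p = accumarray(idx, 1) / numel(labels);  % proportion of samples in each class
    h = -sum(p .* log2(p));                  % Shannon entropy in bits
end
```

Only classes that actually occur get an index from unique, so every entry of p is positive and log2 never sees a zero.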

Accuracy measured on the training set is 0.91.

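The test function below was written for testing a random forest, but it works just as well for testing a single decision tree.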

function randomforest()
clc
clear all
T=1; % number of bagging rounds (T=1 trains a single decision tree)
m = importdata('d:\畢業設計\資料集1\australian.txt');  % load the data
[sm,sn]=size(m);
% for i=1:sm             % normalization (disabled)
%     mins=min(m(i,1:sn-1));
%     maxs=max(m(i,1:sn-1));
%     for j=1:sn-1
%         m(i,j)=2*(m(i,j)-mins)/(maxs-mins)-1;
%     end
% end
indices=crossvalind('Kfold',m(1:sm,sn),10); % 10-fold partition into training and test sets
testindices=(indices==1); % test set index
trainindices=~testindices; % training set index
trainset=m(trainindices,:); % training set
testset=m(testindices,:); % test set (extracted but not used below)
[testm,~]=size(testset);
[trainm,trainn]=size(trainset);
predict=zeros(trainm,T);
for t=1:T % bagging rounds
    d=[]; % bootstrap sample of the training set
    for i=1:trainm % sample with replacement
        k=randperm(trainm,1);
        d=[d;trainset(k,:)];
    end
    [~,sn]=size(d);
    a=ones(sn-1,1); % attribute set a, 1 means the attribute has not been split on yet
    tree=maketree(d,a); % build a simple decision tree recursively
    for i=1:trainm % predict each training sample by walking down the tree
        treet=tree;
        while 1
            if treet.isnode==0
                predict(i,t)=treet.a;
                break;
            end
            if trainset(i,treet.mark)<=treet.a
                treet=treet.child(1);
            else
                treet=treet.child(2);
            end
        end
    end
end
acc=0;
for i=1:trainm % majority vote over the T trees, evaluated on the training set
    if trainset(i,end)==mode(predict(i,:))
        acc=acc+1;
    end
end
acc=acc/trainm
end
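
The tree walk inside the test function can be pulled out into a reusable helper. The following is a minimal sketch under stated assumptions: predicttree is a hypothetical name that does not appear in the original post, and the small tree is built by hand here (with the same fields maketree produces) so that the example runs on its own. Local functions inside a script need MATLAB R2016b or later.

```matlab
% Hand-built tree with the same fields as maketree's output:
% split on attribute 1 at threshold 2.5, left leaf is class 0, right leaf is class 1.
leaf0 = struct('isnode', 0, 'a', 0, 'mark', 0, 'child', []);
leaf1 = struct('isnode', 0, 'a', 1, 'mark', 0, 'child', []);
root  = struct('isnode', 1, 'a', 2.5, 'mark', 1, 'child', [leaf0 leaf1]);

disp(predicttree(root, 1.2))   % 1.2 <= 2.5, goes left:  prints 0
disp(predicttree(root, 3.7))   % 3.7 >  2.5, goes right: prints 1

% Hypothetical helper (an assumption, not from the original post).
% x is one sample's attribute vector; the class label column is not needed.
function label = predicttree(tree, x)
    node = tree;
    while node.isnode == 1
        if x(node.mark) <= node.a   % branch: a is the threshold, mark the attribute index
            node = node.child(1);
        else
            node = node.child(2);
        end
    end
    label = node.a;                 % leaf: a holds the class label
end
```

With a helper like this, the prediction loop in the test function reduces to one call per sample, and the same call can be applied to the rows of testset to measure accuracy on held-out data.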
