Implementing the ID3 Algorithm in Java

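ID3 is a classic classification algorithm. To understand it, you first need a few basic concepts from information theory: information content, entropy, posterior entropy, and conditional entropy. The core idea of ID3 is to choose, at each node, the attribute that has the largest mutual information with the target attribute (that is, the largest information gain) as the split attribute. This greedy choice tends to produce shallow trees, although it does not guarantee that the resulting decision tree has minimum height.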
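For reference, the quantities named above can be written out as follows. The notation is an assumption made here for illustration and is not taken from the code: Y is the target attribute and X is a candidate split attribute, with all logarithms in base 2 so that the unit is the bit.

```latex
\begin{align*}
I(y) &= -\log_2 p(y) && \text{(information content of outcome } y\text{)} \\
H(Y) &= -\sum_{y} p(y)\,\log_2 p(y) && \text{(entropy)} \\
H(Y \mid X = x) &= -\sum_{y} p(y \mid x)\,\log_2 p(y \mid x) && \text{(posterior entropy)} \\
H(Y \mid X) &= \sum_{x} p(x)\,H(Y \mid X = x) && \text{(conditional entropy)} \\
I(X;Y) &= H(Y) - H(Y \mid X) && \text{(mutual information, i.e. information gain)}
\end{align*}
```

ID3 evaluates the last quantity for every remaining attribute X and splits on the one with the largest value.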

Tree structure:

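import java.util.ArrayList;
import java.util.List;

/**
 * C4.5 decision tree data structure
 * @author zhenhua.chen
 * @description: todo
 * @date 2013-3-1 10:47:37 AM
 */
public class TreeNode {
    private String nodeName;                      // name of this node
    private List<String> splitAttributes;         // values of the split attribute
    private ArrayList<TreeNode> childrenNodes;    // child nodes
    private ArrayList<ArrayList<String>> dataSet; // data rows at this node
    private ArrayList<String> arrributeSet;       // attribute names at this node

    public String getNodeName() { return nodeName; }
    public void setNodeName(String nodeName) { this.nodeName = nodeName; }
    public List<String> getSplitAttributes() { return splitAttributes; }
    public void setSplitAttributes(List<String> splitAttributes) { this.splitAttributes = splitAttributes; }
    public ArrayList<TreeNode> getChildrenNodes() { return childrenNodes; }
    public void setChildrenNodes(ArrayList<TreeNode> childrenNodes) { this.childrenNodes = childrenNodes; }
    public ArrayList<ArrayList<String>> getDataSet() { return dataSet; }
    public void setDataSet(ArrayList<ArrayList<String>> dataSet) { this.dataSet = dataSet; }
    public ArrayList<String> getArrributeSet() { return arrributeSet; }
    public void setArrributeSet(ArrayList<String> arrributeSet) { this.arrributeSet = arrributeSet; }
}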

Decision tree algorithm:

/**
 * Class that builds the decision tree
 * @author zhenhua.chen
 * @description: todo
 * @date 2013-3-1 4:42:07 PM
 */
public class DecisionTree
}
ArrayList<String> splitAttributes = ComputeUtil.getTypes(dataSet, index); // get the split attribute values under this node
node.setSplitAttributes(splitAttributes);
node.setNodeName(attributeSet.get(index));
// decide whether each attribute column needs further splitting
for (int i = 0; i < splitAttributes.size(); i++) else
}
ArrayList<ArrayList<String>> newDataSet = new ArrayList<ArrayList<String>>();
for (ArrayList<String> data : splitDataSet)
}
newDataSet.add(tmp);
}
childNode = buildTree(newDataSet, newAttributeSet); // build the subtree recursively
}
node.getChildrenNodes().add(childNode);
}
return node;
}

/**
 * Print the built tree
 * @param root
 */
public void printTree(TreeNode root)
} else
if (null != root.getChildrenNodes())
}
}

/**
 * @title: searchTree
 * @description: level-order traversal of the tree
 * @return void
 * @throws
 */
public void searchTree(TreeNode root)
} else
if (null != node.getChildrenNodes())
}
}
}
}
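Because the listing above survives only in fragments, here is a minimal, self-contained sketch of the same idea: pick the attribute with the largest information gain, split the rows by its values, recurse, and then walk the tree level by level. It is not the author's code. The class `Id3Sketch`, its `Node` type, the helper names, and the five-row sample data are all hypothetical. It also marks used columns with a boolean mask, whereas the original appears to build a reduced `newDataSet` and `newAttributeSet` for each child.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Illustrative ID3 sketch; every name here is hypothetical, not the article's API. */
final class Id3Sketch {

    /** Internal nodes set attribute and children; leaves set label. */
    static final class Node {
        String attribute;
        String label;
        final Map<String, Node> children = new LinkedHashMap<>();
    }

    /** Counts each distinct value in one column, in order of first appearance. */
    static Map<String, Integer> counts(List<List<String>> rows, int col) {
        Map<String, Integer> m = new LinkedHashMap<>();
        for (List<String> row : rows) {
            m.merge(row.get(col), 1, Integer::sum);
        }
        return m;
    }

    /** Rows whose value in column col equals value. */
    static List<List<String>> subset(List<List<String>> rows, int col, String value) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> row : rows) {
            if (row.get(col).equals(value)) out.add(row);
        }
        return out;
    }

    /** Shannon entropy of one column, in bits. */
    static double entropy(List<List<String>> rows, int col) {
        double h = 0.0;
        for (int count : counts(rows, col).values()) {
            double p = (double) count / rows.size();
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    /** Information gain of splitting on col; the target is assumed to be the last column. */
    static double gain(List<List<String>> rows, int col) {
        int target = rows.get(0).size() - 1;
        double conditional = 0.0;
        for (Map.Entry<String, Integer> e : counts(rows, col).entrySet()) {
            double prob = (double) e.getValue() / rows.size();
            conditional += prob * entropy(subset(rows, col, e.getKey()), target);
        }
        return entropy(rows, target) - conditional;
    }

    /** Recursively builds the tree; used[c] marks columns already split on along this path. */
    static Node build(List<List<String>> rows, List<String> names, boolean[] used) {
        int target = names.size() - 1;
        Node node = new Node();
        Map<String, Integer> labels = counts(rows, target);
        String majority = null;
        int majorityCount = -1;
        for (Map.Entry<String, Integer> e : labels.entrySet()) {
            if (e.getValue() > majorityCount) { majority = e.getKey(); majorityCount = e.getValue(); }
        }
        int best = -1;
        double bestGain = 1e-12; // ignore gains that are zero up to rounding error
        for (int col = 0; col < target; col++) {
            if (used[col]) continue;
            double g = gain(rows, col);
            if (g > bestGain) { bestGain = g; best = col; }
        }
        if (labels.size() == 1 || best < 0) { // pure node, or nothing useful left to split on
            node.label = majority;
            return node;
        }
        node.attribute = names.get(best);
        boolean[] nextUsed = used.clone();
        nextUsed[best] = true;
        for (String value : counts(rows, best).keySet()) {
            node.children.put(value, build(subset(rows, best, value), names, nextUsed));
        }
        return node;
    }

    /** Level-order (breadth-first) traversal using a FIFO queue. */
    static List<String> levelOrder(Node root) {
        List<String> out = new ArrayList<>();
        Queue<Node> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            Node n = queue.remove();
            out.add(n.label != null ? "leaf:" + n.label : n.attribute);
            queue.addAll(n.children.values());
        }
        return out;
    }

    /** Made-up sample data: columns are outlook, windy, play (the target). */
    static List<List<String>> sampleRows() {
        return Arrays.asList(
            Arrays.asList("sunny", "false", "no"),
            Arrays.asList("sunny", "true", "no"),
            Arrays.asList("overcast", "false", "yes"),
            Arrays.asList("rain", "false", "yes"),
            Arrays.asList("rain", "true", "no"));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("outlook", "windy", "play");
        Node root = build(sampleRows(), names, new boolean[names.size()]);
        System.out.println(levelOrder(root));
    }
}
```

Keeping the children in a map keyed by attribute value is a design choice of this sketch. The original instead keeps two parallel lists, `splitAttributes` and `childrenNodes`, on each `TreeNode`.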

Utility class:

/**
 * Calculation methods needed by the C4.5 algorithm
 * @author zhenhua.chen
 * @description: todo
 * @date 2013-3-1 10:48:47 AM
 */
public class ComputeUtil
}
return list;
}

/**
 * Get each category in the given attribute column of the data set, together with its count
 * @title: getClassCounts
 * @description: todo
 * @return Map<String, Integer>
 * @throws
 */
public static Map<String, Integer> getTypeCounts(ArrayList<ArrayList<String>> dataSet, int columnIndex) else
}
return map;
}

/**
 * Get the rows that have the given category in the given column (the data subset after the split)
 * @title: getDataSet
 * @description: todo
 * @return ArrayList<ArrayList<String>>
 * @throws
 */
public static ArrayList<ArrayList<String>> getDataSet(ArrayList<ArrayList<String>> dataSet, int columnIndex, String attribueClass)
}
return splitDataSet;
}

/**
 * Compute the information entropy of the given column (attribute)
 * @title: computeEntropy
 * @description: todo
 * @return double
 * @throws
 */
public static double computeEntropy(ArrayList<ArrayList<String>> dataSet, int columnIndex)
return entropy;
}

/**
 * Compute the conditional entropy of the target attribute given the specified attribute column
 */
public static double computeConditinalEntropy(ArrayList<ArrayList<String>> dataSet, int columnIndex)
double probY = (double) splitDataSet.size() / (double) dataSet.size();
Map<String, Integer> map1 = getTypeCounts(splitDataSet, desColumn); // compute the posterior entropy from the subset after the split
Iterator<String> iter1 = map1.keySet().iterator();
double proteriorEntropy = 0;
while (iter1.hasNext())
conditionalEntropy += probY * proteriorEntropy; // accumulate the conditional entropy for this split attribute
}
return conditionalEntropy;
}
}
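To make the accumulation `conditionalEntropy += probY * proteriorEntropy` concrete, here is a small worked example on made-up numbers that are not from the article. Assume five rows whose target column Y holds 2 "yes" and 3 "no", and a split attribute X with three values whose subsets have target labels {no, no}, {yes}, and {yes, no}. The first two subsets are pure, so their posterior entropy is 0. The third is an even mix, so its posterior entropy is 1 bit.

```latex
\begin{align*}
H(Y) &= -\tfrac{2}{5}\log_2\tfrac{2}{5} - \tfrac{3}{5}\log_2\tfrac{3}{5} \approx 0.971 \\
H(Y \mid X) &= \tfrac{2}{5}\cdot 0 + \tfrac{1}{5}\cdot 0 + \tfrac{2}{5}\cdot 1 = 0.4 \\
H(Y) - H(Y \mid X) &\approx 0.571
\end{align*}
```

Each term of the middle line is one pass through the loop: the subset's share of the rows (`probY`) multiplied by the entropy of the target within that subset (`proteriorEntropy`).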

Test class:

public class Test
ArrayList<ArrayList<String>> dataSet = new ArrayList<ArrayList<String>>();
while ((str = reader.readLine()) != null)
dataSet.add(tmpList);
}
DecisionTree dt = new DecisionTree();
TreeNode root = dt.buildTree(dataSet, attributeList);
// dt.printTree(root);
dt.searchTree(root);
} catch (IOException e)
} catch (FileNotFoundException e)
}
}
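The part of the test that fills `dataSet` did not survive, so the following is only a sketch of how such a loader could look. The comma delimiter, the header row holding the attribute names, and the names `DataSetLoaderSketch` and `readRows` are all assumptions, and the input is read from an in-memory string rather than the author's data file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;

/** Hypothetical loader sketch; the delimiter and file layout are assumptions. */
final class DataSetLoaderSketch {

    /** Reads delimiter-separated lines into rows of trimmed cells, skipping blank lines. */
    static ArrayList<ArrayList<String>> readRows(Reader source, String delimiterRegex) {
        ArrayList<ArrayList<String>> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String str;
            while ((str = reader.readLine()) != null) {
                if (str.trim().isEmpty()) {
                    continue; // ignore empty lines
                }
                ArrayList<String> tmpList = new ArrayList<>();
                for (String cell : str.trim().split(delimiterRegex)) {
                    tmpList.add(cell.trim());
                }
                rows.add(tmpList);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        String text = "outlook,windy,play\nsunny,false,no\novercast,true,yes\n";
        ArrayList<ArrayList<String>> dataSet = readRows(new StringReader(text), ",");
        // Assumption: the first line carries the attribute names.
        ArrayList<String> attributeList = dataSet.remove(0);
        System.out.println(attributeList + " " + dataSet);
    }
}
```

To read from disk instead, pass a `java.io.FileReader` for the data file as the `Reader` argument.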
