Java Implementation of the ID3 Algorithm

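ID3 is a classic algorithm for learning decision trees. Its core idea is to apply the information gain criterion at every node of the tree to select a feature, building the tree recursively. Concretely: starting from the root node, compute the information gain of every candidate feature at the node, choose the feature with the largest information gain as the node's feature, and create one child node for each distinct value of that feature. Then apply the same procedure recursively to each child node. The recursion stops when the information gain of every remaining feature is very small or when no features are left to choose from, which yields the final decision tree. To understand ID3, you first need a few basic concepts from information theory: self-information (information content), entropy, posterior entropy, and conditional entropy.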
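As a quick reference, the standard textbook definitions of these quantities can be written as below. The symbols are notation chosen here for illustration (X is a candidate splitting feature, Y is the class label); they do not appear in the original code.

```latex
\begin{aligned}
I(x) &= -\log_2 p(x) && \text{(self-information)} \\
H(Y) &= -\sum_{y} p(y)\,\log_2 p(y) && \text{(entropy)} \\
H(Y \mid X = x) &= -\sum_{y} p(y \mid x)\,\log_2 p(y \mid x) && \text{(posterior entropy)} \\
H(Y \mid X) &= \sum_{x} p(x)\, H(Y \mid X = x) && \text{(conditional entropy)} \\
g(Y, X) &= H(Y) - H(Y \mid X) && \text{(information gain)}
\end{aligned}
```

At each node, ID3 selects the feature X with the largest g(Y, X).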

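import java.util.ArrayList;
import java.util.List;

/**
 * C4.5 decision tree data structure
 * @author zhenhua.chen
 * @description: todo
 * @date 2013-3-1 10:47:37 AM
 */
public class TreeNode {
    // Fields are inferred from the accessor signatures.
    private String nodeName;
    private List<String> splitAttributes;
    private ArrayList<TreeNode> childrenNodes;
    private ArrayList<ArrayList<String>> dataSet;
    private ArrayList<String> arrributeSet;

    public String getNodeName() { return nodeName; }
    public void setNodeName(String nodeName) { this.nodeName = nodeName; }

    public List<String> getSplitAttributes() { return splitAttributes; }
    public void setSplitAttributes(List<String> splitAttributes) { this.splitAttributes = splitAttributes; }

    public ArrayList<TreeNode> getChildrenNodes() { return childrenNodes; }
    public void setChildrenNodes(ArrayList<TreeNode> childrenNodes) { this.childrenNodes = childrenNodes; }

    public ArrayList<ArrayList<String>> getDataSet() { return dataSet; }
    public void setDataSet(ArrayList<ArrayList<String>> dataSet) { this.dataSet = dataSet; }

    public ArrayList<String> getArrributeSet() { return arrributeSet; }
    public void setArrributeSet(ArrayList<String> arrributeSet) { this.arrributeSet = arrributeSet; }
}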

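import java.util.ArrayList;

/**
 * Class that builds the decision tree
 * @author zhenhua.chen
 * @description: todo
 * @date 2013-3-1 4:42:07 PM
 */
public class DecisionTree {

    public TreeNode buildTree(ArrayList<ArrayList<String>> dataSet, ArrayList<String> attributeSet) {
        // ...
        ArrayList<String> splitAttributes = ComputeUtil.getTypes(dataSet, index); // get the split attribute values under this node
        node.setSplitAttributes(splitAttributes);
        node.setNodeName(attributeSet.get(index));
        // decide whether each attribute value needs to be split further
        for (int i = 0; i < splitAttributes.size(); i++) {
            // ... (if branch)
            // else {
                // ...
                ArrayList<ArrayList<String>> newDataSet = new ArrayList<ArrayList<String>>();
                for (ArrayList<String> data : splitDataSet) {
                    // ...
                    newDataSet.add(tmp);
                }
                childNode = buildTree(newDataSet, newAttributeSet); // build the subtree recursively
            // }
            node.getChildrenNodes().add(childNode);
        }
        return node;
    }

    /**
     * Print the finished tree
     * @param root
     */
    public void printTree(TreeNode root) {
        // ... (if / else branch)
        if (null != root.getChildrenNodes()) {
            // ...
        }
    }

    /**
     * @title: searchTree
     * @description: level-order traversal of the tree
     * @return void
     * @throws
     */
    public void searchTree(TreeNode root) {
        // ... (loop over the nodes, with an if / else branch)
        if (null != node.getChildrenNodes()) {
            // ...
        }
    }
}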
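Because most of the tree-building body is not visible in the listing above, the following is a minimal, self-contained sketch of the ID3 recursion described in the introduction. It is not the author's DecisionTree class: the names Id3Sketch, Node, build and levelOrder are hypothetical, the class label is assumed to be the last column of every row, and the recursion simply stops when a node is pure or no attributes remain (rather than using a minimum-gain threshold).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch of the ID3 recursion; not the author's DecisionTree class.
class Id3Sketch {
    static class Node {
        String name; // attribute name for an inner node, class label for a leaf
        Map<String, Node> children = new LinkedHashMap<>(); // attribute value -> subtree
    }

    // Occurrences of each value in a column, in first-seen order.
    static Map<String, Integer> counts(List<List<String>> rows, int col) {
        Map<String, Integer> c = new LinkedHashMap<>();
        for (List<String> row : rows) c.merge(row.get(col), 1, Integer::sum);
        return c;
    }

    // Shannon entropy (base 2) of a column.
    static double entropy(List<List<String>> rows, int col) {
        double h = 0.0;
        for (int n : counts(rows, col).values()) {
            double p = (double) n / rows.size();
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    // Rows whose value in column col equals value.
    static List<List<String>> subset(List<List<String>> rows, int col, String value) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> row : rows) if (row.get(col).equals(value)) out.add(row);
        return out;
    }

    // Assumes the class label is the LAST column and attrs names the other columns.
    static Node build(List<List<String>> rows, List<String> attrs) {
        int target = attrs.size();
        Node node = new Node();
        Map<String, Integer> labels = counts(rows, target);
        if (labels.size() == 1 || attrs.isEmpty()) { // pure node, or no attributes left
            int bestCount = -1;
            for (Map.Entry<String, Integer> e : labels.entrySet()) {
                if (e.getValue() > bestCount) { bestCount = e.getValue(); node.name = e.getKey(); }
            }
            return node; // leaf labelled with the majority class
        }
        int best = 0;
        double bestGain = Double.NEGATIVE_INFINITY;
        double base = entropy(rows, target);
        for (int a = 0; a < attrs.size(); a++) {
            double cond = 0.0; // conditional entropy H(label | attribute a)
            for (String v : counts(rows, a).keySet()) {
                List<List<String>> s = subset(rows, a, v);
                cond += (double) s.size() / rows.size() * entropy(s, target);
            }
            if (base - cond > bestGain) { bestGain = base - cond; best = a; }
        }
        node.name = attrs.get(best);
        List<String> restAttrs = new ArrayList<>(attrs);
        restAttrs.remove(best); // remove(int index): drop the chosen attribute
        for (String v : counts(rows, best).keySet()) {
            List<List<String>> childRows = new ArrayList<>();
            for (List<String> row : subset(rows, best, v)) {
                List<String> r = new ArrayList<>(row);
                r.remove(best); // drop the used column so indices match restAttrs
                childRows.add(r);
            }
            node.children.put(v, build(childRows, restAttrs)); // recurse on each value
        }
        return node;
    }

    // Level-order (breadth-first) listing of node names.
    static List<String> levelOrder(Node root) {
        List<String> names = new ArrayList<>();
        Queue<Node> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            Node n = queue.poll();
            names.add(n.name);
            queue.addAll(n.children.values());
        }
        return names;
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
            Arrays.asList("sunny", "false", "no"), Arrays.asList("sunny", "true", "no"),
            Arrays.asList("rainy", "false", "yes"), Arrays.asList("rainy", "true", "no"),
            Arrays.asList("overcast", "false", "yes"), Arrays.asList("overcast", "true", "yes"));
        Node root = build(rows, Arrays.asList("outlook", "windy"));
        System.out.println(levelOrder(root));
    }
}
```

On this made-up six-row data set, "outlook" has the larger information gain and becomes the root; only the "rainy" branch needs a second split on "windy". The program prints [outlook, no, windy, yes, yes, no].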

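import java.util.ArrayList;
import java.util.Iterator;
import java.util.Map;

/**
 * Computation helpers needed by the C4.5 algorithm
 * @author zhenhua.chen
 * @description: todo
 * @date 2013-3-1 10:48:47 AM
 */
public class ComputeUtil {

    public static ArrayList<String> getTypes(ArrayList<ArrayList<String>> dataSet, int columnIndex) {
        // ...
        return list;
    }

    /**
     * Get each category in the given attribute column of the data set, together with its count
     * @title: getClassCounts
     * @description: todo
     * @return Map<String, Integer>
     * @throws
     */
    public static Map<String, Integer> getTypeCounts(ArrayList<ArrayList<String>> dataSet, int columnIndex) {
        // ... (loop with an if / else branch)
        return map;
    }

    /**
     * Get the rows that have the given category in the given column (the data subset after a split)
     * @title: getDataSet
     * @description: todo
     * @return ArrayList<ArrayList<String>>
     * @throws
     */
    public static ArrayList<ArrayList<String>> getDataSet(ArrayList<ArrayList<String>> dataSet, int columnIndex, String attribueClass) {
        // ...
        return splitDataSet;
    }

    /**
     * Compute the information entropy of the given column (attribute)
     * @title: computeEntropy
     * @description: todo
     * @return double
     * @throws
     */
    public static double computeEntropy(ArrayList<ArrayList<String>> dataSet, int columnIndex) {
        // ...
        return entropy;
    }

    /**
     * Compute the conditional entropy of the target attribute given the specified attribute column
     */
    public static double computeConditinalEntropy(ArrayList<ArrayList<String>> dataSet, int columnIndex) {
        // ... (loop over the values of the attribute column)
            double probY = (double) splitDataSet.size() / (double) dataSet.size();
            Map<String, Integer> map1 = getTypeCounts(splitDataSet, desColumn); // posterior entropy is computed on the subset produced by the split
            Iterator<String> iter1 = map1.keySet().iterator();
            double proteriorEntropy = 0;
            while (iter1.hasNext()) {
                // ...
            }
            conditionalEntropy += probY * proteriorEntropy; // accumulate the conditional entropy for this split attribute
        // ... (end of loop)
        return conditionalEntropy;
    }
}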
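To make the entropy and conditional-entropy calculations concrete, here is a small worked sketch. It is an illustration under stated assumptions, not the author's ComputeUtil: the class InfoGainSketch and its method names are hypothetical, rows are lists of strings, and the four-row data set is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; names are hypothetical, not the author's ComputeUtil.
class InfoGainSketch {

    // Count how often each value occurs in the given column.
    static Map<String, Integer> valueCounts(List<List<String>> rows, int column) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> row : rows) {
            counts.merge(row.get(column), 1, Integer::sum);
        }
        return counts;
    }

    // Shannon entropy (base 2) of the values in the given column.
    static double entropy(List<List<String>> rows, int column) {
        double h = 0.0;
        for (int count : valueCounts(rows, column).values()) {
            double p = (double) count / rows.size();
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    // Conditional entropy H(target | attribute): posterior entropies weighted by subset size.
    static double conditionalEntropy(List<List<String>> rows, int attribute, int target) {
        double h = 0.0;
        for (String value : valueCounts(rows, attribute).keySet()) {
            List<List<String>> subset = new ArrayList<>();
            for (List<String> row : rows) {
                if (row.get(attribute).equals(value)) subset.add(row);
            }
            double weight = (double) subset.size() / rows.size();
            h += weight * entropy(subset, target); // weight * posterior entropy
        }
        return h;
    }

    // Information gain of splitting on attribute with respect to target.
    static double informationGain(List<List<String>> rows, int attribute, int target) {
        return entropy(rows, target) - conditionalEntropy(rows, attribute, target);
    }

    public static void main(String[] args) {
        // Column 0 is the attribute, column 1 is the class label (made-up data).
        List<List<String>> rows = Arrays.asList(
            Arrays.asList("sunny", "no"),
            Arrays.asList("sunny", "no"),
            Arrays.asList("rainy", "yes"),
            Arrays.asList("rainy", "no"));
        System.out.println("H(Y)    = " + entropy(rows, 1));
        System.out.println("H(Y|X)  = " + conditionalEntropy(rows, 0, 1));
        System.out.println("g(Y, X) = " + informationGain(rows, 0, 1));
    }
}
```

For this data the label column holds three "no" and one "yes", so H(Y) is about 0.811. The "sunny" subset is pure (entropy 0) and the "rainy" subset is evenly split (entropy 1), giving H(Y|X) = 0.5 and an information gain of about 0.311.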

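public class Test {
    public static void main(String[] args) {
        try {
            // ...
            ArrayList<ArrayList<String>> dataSet = new ArrayList<ArrayList<String>>();
            while ((str = reader.readLine()) != null) {
                // ...
                dataSet.add(tmpList);
            }
            DecisionTree dt = new DecisionTree();
            TreeNode root = dt.buildTree(dataSet, attributeList);
            // dt.printTree(root);
            dt.searchTree(root);
        } catch (FileNotFoundException e) { // must come before its superclass IOException
            // ...
        } catch (IOException e) {
            // ...
        }
    }
}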
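The input file format is not shown in the listing above. As a hedged illustration of the read loop, the sketch below assumes comma-separated rows whose first line holds the column names; DataLoaderSketch and readRows are hypothetical names, and a StringReader stands in for the file so the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical loader; the comma-separated format is an assumption.
class DataLoaderSketch {

    // Reads comma-separated rows, trimming fields and skipping blank lines.
    static List<List<String>> readRows(BufferedReader reader) throws IOException {
        List<List<String>> rows = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.trim().isEmpty()) continue;
            List<String> row = new ArrayList<>();
            for (String field : line.split(",")) {
                row.add(field.trim());
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        String text = "outlook,windy,play\nsunny,false,no\nrainy,true,yes\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            List<List<String>> rows = readRows(reader);
            List<String> header = rows.remove(0); // first line: column names
            System.out.println("columns: " + header);
            System.out.println("rows:    " + rows);
        }
    }
}
```

To read from disk instead, the StringReader can be swapped for a java.io.FileReader on the data file's path.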
