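Understand the basic principles of the naive Bayes algorithm;
Be able to use the naive Bayes algorithm to classify data;
Write functions that produce the classification result for the example data set.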
RStudio
Prepare the data set data; the tuple x to be classified is test
Main function: *****bayes = function(){}
Formula: p(ci|x) = p(x|ci) p(ci) / p(x)
Compute p(c1) and p(c2)
pxi_ci = function(data,test,class_result){}
This gives pxi_c1 and pxi_c2
p(x|c1) = p(x1|c1)*p(x2|c1)*p(x3|c1)…
p(x|c2) = p(x1|c2)*p(x2|c2)*p(x3|c2)…
Test: *****bayes(data,test)
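The steps above rely on two facts that a worked equation makes explicit: the denominator p(x) is the same for every class, so only the numerators need to be compared, and the naive independence assumption turns the likelihood into a product over attributes. A sketch in the notation of the outline, where n (the number of attributes) and the hat notation for the predicted class are symbols introduced here for illustration:

```latex
\begin{aligned}
p(c_i \mid x) &= \frac{p(x \mid c_i)\, p(c_i)}{p(x)} \;\propto\; p(x \mid c_i)\, p(c_i), \\
p(x \mid c_i) &= \prod_{k=1}^{n} p(x_k \mid c_i), \\
\hat{c} &= \arg\max_{c_i} \; p(c_i) \prod_{k=1}^{n} p(x_k \mid c_i).
\end{aligned}
```

This is why the final step of the code can compare p(x|c1) p(c1) with p(x|c2) p(c2) directly, without ever computing p(x).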
data = data.frame(
  age = c("youth","youth","middle_aged","senior","senior","senior","middle_aged","youth","youth","senior","youth","middle_aged","middle_aged","senior"),
  income = c("high","high","high","medium","low","low","low","medium","low","medium","medium","medium","high","medium"),
  student = c("no","no","no","no","yes","yes","yes","no","yes","yes","yes","no","yes","no"),
  credit_rating = c("fair","excellent","fair","fair","fair","excellent","excellent","fair","fair","fair","excellent","excellent","fair","excellent"),
  buys_computer = c("no","no","yes","yes","yes","no","yes","no","yes","yes","yes","yes","yes","no")
)
# Tuple x to classify (expected class: yes).
# The age and income values are cut off in the source; "youth" and "medium"
# are assumed here, following the standard textbook version of this example.
test = data.frame(age = "youth", income = "medium", student = "yes", credit_rating = "fair")
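# The function name is masked as *****bayes in the source and is kept as-is;
# backticks make that non-syntactic name usable in R.
# Loop bodies that are missing from the source are reconstructed below.
`*****bayes` = function(data, test) {
  # Class labels come from the last column; sorted, so "no" is first and "yes" second
  class_col = as.character(data[, ncol(data)])
  class_result = sort(unique(class_col))

  ##### Compute p(ci)
  pc = c()
  for (i in 1:length(class_result)) {
    pc[i] = sum(class_col == class_result[i]) / nrow(data)
  }

  ##### Compute p(xi|ci)
  pxi_ci = function(data, test, class_result) {
    # Keep only the rows that belong to the given class
    temp = subset(data, data[, ncol(data)] == class_result)
    xcount = c()
    for (j in 1:ncol(test)) {
      # Count the rows whose attribute value matches the test tuple
      xcount[j] = sum(as.character(temp[, names(test)[j]]) == as.character(test[1, j]))
    }
    pxi_ci = xcount / nrow(temp)
    return(pxi_ci)
  }
  pxi_c1 = pxi_ci(data, test, class_result[1])  # "no"
  pxi_c2 = pxi_ci(data, test, class_result[2])  # "yes"

  ##### Compute p(x|ci)
  px_ci = function(pxi_ci) {
    result = prod(pxi_ci)
    return(result)
  }
  px_c1 = px_ci(pxi_c1)
  px_c2 = px_ci(pxi_c2)

  ####### p(ci|x), up to the common factor 1/p(x)
  ci_x = data.frame(c1_x = px_c1 * pc[1], c2_x = px_c2 * pc[2])
  select = function(data) {
    if (data$c1_x > data$c2_x) {
      data = class_result[1]
    } else {
      data = class_result[2]
    }
    return(data)
  }
  final = select(ci_x)
  return(final)
}

# Test
`*****bayes`(data, test)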
Therefore, for tuple x, the naive Bayes classifier assigns x to the class yes.
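The result can be cross-checked by counting directly, without the helper functions. The snippet below is a minimal sketch, not the original implementation: the names d, x and score are hypothetical, and the age and income values of the test tuple ("youth", "medium") are assumed, since they are cut off in the source.

```r
# Self-contained cross-check: recompute p(x|c) * p(c) for each class by counting.
d <- data.frame(
  age = c("youth","youth","middle_aged","senior","senior","senior","middle_aged",
          "youth","youth","senior","youth","middle_aged","middle_aged","senior"),
  income = c("high","high","high","medium","low","low","low",
             "medium","low","medium","medium","medium","high","medium"),
  student = c("no","no","no","no","yes","yes","yes",
              "no","yes","yes","yes","no","yes","no"),
  credit_rating = c("fair","excellent","fair","fair","fair","excellent","excellent",
                    "fair","fair","fair","excellent","excellent","fair","excellent"),
  buys_computer = c("no","no","yes","yes","yes","no","yes",
                    "no","yes","yes","yes","yes","yes","no"),
  stringsAsFactors = FALSE
)

# Test tuple; age and income are assumed values (missing in the source)
x <- list(age = "youth", income = "medium", student = "yes", credit_rating = "fair")

# Unnormalised posterior p(x|c) * p(c) for one class label
score <- function(label) {
  rows <- d[d$buys_computer == label, ]
  prior <- nrow(rows) / nrow(d)
  # mean() of a logical vector is the fraction of matching rows, i.e. p(x_k|c)
  likelihood <- prod(sapply(names(x), function(a) mean(rows[[a]] == x[[a]])))
  prior * likelihood
}

scores <- sapply(c(no = "no", yes = "yes"), score)
print(scores)
print(names(which.max(scores)))
```

Under these assumptions the counts are 9/14 · (2/9)(4/9)(6/9)(6/9) ≈ 0.028 for yes and 5/14 · (3/5)(2/5)(1/5)(2/5) ≈ 0.007 for no, so the larger score again selects yes.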