Handwritten Digit Recognition with kNN

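[kNN summary] The idea is to compare each test sample with the training samples, find the k nearest ones, and pick the label that occurs most often among those k. That label is the predicted digit. The steps are:
1) Load the data. Note that the samples are drawn at random and come in four groups: training data, training labels, test data, and test labels.
2) Compute the distance between each test sample and every training sample.
3) Find the k nearest training samples (in practice, this is a sort).
4) Convert the nearest samples to their labels and take a majority vote over those labels to get the final label.
5) Measure the accuracy by comparing the predicted labels with the actual test labels.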
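The steps above can be sketched in plain Python before bringing in TensorFlow. This is a minimal illustration, not the code used in this post: the helper names `l1_distance` and `knn_predict` are hypothetical, and the 2-D toy points are made-up stand-ins for 784-pixel images.

```python
from collections import Counter

def l1_distance(a, b):
    """Sum of absolute coordinate differences (Manhattan distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_predict(train_points, train_labels, query, k):
    """Return the majority label among the k training points nearest to query."""
    # Step 2: distance from the query to every training point.
    distances = [l1_distance(p, query) for p in train_points]
    # Step 3: indices of the k smallest distances (a sort, then a slice).
    nearest = sorted(range(len(distances)), key=distances.__getitem__)[:k]
    # Step 4: majority vote over the labels of those neighbours.
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Step 1: toy data (illustrative values only).
train_points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_labels = [0, 0, 0, 1, 1, 1]
test_points = [(1, 1), (6, 6)]
test_labels = [0, 1]

predictions = [knn_predict(train_points, train_labels, q, k=3) for q in test_points]
# Step 5: fraction of predictions that match the true labels.
accuracy = sum(p == t for p, t in zip(predictions, test_labels)) / len(test_labels)
```

The TensorFlow version below performs the same five steps, but on whole batches of images at once.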

import tensorflow as tf
import numpy as np
import random
from tensorflow.examples.tutorials.mnist import input_data

# Load the MNIST dataset (TensorFlow 1.x tutorial loader)
mnist = input_data.read_data_sets(r"path", one_hot=True)

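# Settings
trainnum = 55000   # images in the MNIST training set
testnum = 10000    # images in the MNIST test set
trainsize = 1000   # training images to sample
testsize = 500     # test images to sample
k = 20             # number of nearest neighbours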

# Next, randomly sample the training and test subsets (no index is drawn twice)
trainindex = np.random.choice(trainnum, trainsize, replace=False)
testindex = np.random.choice(testnum, testsize, replace=False)
traindata = mnist.train.images[trainindex]    # training images
trainlabel = mnist.train.labels[trainindex]   # training labels (one-hot)
testdata = mnist.test.images[testindex]       # test images
testlabel = mnist.test.labels[testindex]      # test labels (one-hot)
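The sampling step can be tried without downloading MNIST. The sketch below uses randomly generated stand-in arrays, so every name and value in it is an assumption for illustration. It shows that `replace=False` never draws the same index twice, and that indexing the image array and the label array with the same index array keeps each image paired with its label.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable

pool_size, sample_size = 20, 5             # stand-ins for trainnum and trainsize
images = rng.random((pool_size, 784))      # fake "images", one row each
digits = rng.integers(0, 10, pool_size)    # fake digit labels
labels = np.eye(10)[digits]                # one-hot rows, as with one_hot=True

# replace=False guarantees that no index is drawn twice.
index = rng.choice(pool_size, sample_size, replace=False)
sample_images = images[index]   # shape (5, 784)
sample_labels = labels[index]   # shape (5, 10)
```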

# With the data ready, define the TensorFlow inputs.
# Training images: one flattened 28x28 image (784 values) per row
traindatainput = tf.placeholder(shape=[None, 784], dtype=tf.float32)
# Training labels: one-hot over the 10 digits
trainlabelinput = tf.placeholder(shape=[None, 10], dtype=tf.float32)
# Test images and test labels, in the same layout
testdatainput = tf.placeholder(shape=[None, 784], dtype=tf.float32)
testlabelinput = tf.placeholder(shape=[None, 10], dtype=tf.float32)

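# With the inputs defined, build the kNN computation.
# Compute the kNN distances
# Add an axis so each test image broadcasts against every training image: (tests, 1, 784)
f1 = tf.expand_dims(testdatainput, 1)
# Pixel-wise differences, a 3-D tensor of shape (tests, trains, 784)
f2 = tf.subtract(traindatainput, f1)
# Sum of absolute differences over the pixels: L1 distances, shape (tests, trains)
f3 = tf.reduce_sum(tf.abs(f2), axis=2)
# f4 negates f3 so that the smallest distances become the largest values
f4 = tf.negative(f3)
# The k largest values of f4 are the k smallest distances in f3;
# f5 holds the negated distances, f6 the indices of the nearest training images
f5, f6 = tf.nn.top_k(f4, k=k)
# f6 holds the indices of the nearest points; use them to look up the labels: (tests, k, 10)
f7 = tf.gather(trainlabelinput, f6)
# The last step turns the neighbour labels into a digit.
# Summing over the k neighbours counts the votes per digit, shape (tests, 10)
f8 = tf.reduce_sum(f7, axis=1)
# tf.argmax returns the index of the largest vote count, which is the predicted digit
f9 = tf.argmax(f8, axis=1)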
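The shapes in the graph are easier to follow on a tiny example. The NumPy sketch below mirrors the f1 to f9 steps with made-up data (4 training points, 2 test points, 3 features, 2 classes). It is an illustrative equivalent rather than the TensorFlow code itself, and it uses `np.argsort` where the graph uses negation plus `tf.nn.top_k`.

```python
import numpy as np

# Tiny made-up data: 4 training points and 2 test points in 3 dimensions.
train = np.array([[0., 0., 0.], [0., 1., 0.], [9., 9., 9.], [9., 8., 9.]])
train_onehot = np.eye(2)[[0, 0, 1, 1]]     # classes 0, 0, 1, 1 as one-hot rows
test = np.array([[0., 0., 1.], [9., 9., 8.]])
k = 3

diff = train - test[:, None, :]            # (2, 4, 3), like f1 and f2
dist = np.abs(diff).sum(axis=2)            # (2, 4) L1 distances, like f3
nearest = np.argsort(dist, axis=1)[:, :k]  # (2, k) neighbour indices, like f6
neighbour_labels = train_onehot[nearest]   # (2, k, 2), like f7
votes = neighbour_labels.sum(axis=1)       # (2, 2) votes per class, like f8
predicted = votes.argmax(axis=1)           # (2,) predicted class, like f9
```

Because the labels are one-hot, summing them over the k neighbours is exactly a vote count, which is why a plain sum followed by an argmax implements the majority rule.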

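with tf.Session() as sess:
    # Assumed feeds: the first 5 test images and all sampled training images,
    # chosen to match the 5-image shapes noted in the comments below.
    testfeed = {testdatainput: testdata[0:5]}
    feed = {traindatainput: traindata, testdatainput: testdata[0:5]}
    p1 = sess.run(f1, feed_dict=testfeed)
    print('p1=', p1.shape)  # (5, 1, 784)
    p2 = sess.run(f2, feed_dict=feed)
    print('p2=', p2.shape)  # (5, 1000, 784)
    p3 = sess.run(f3, feed_dict=feed)
    print('p3=', p3.shape)  # (5, 1000)
    print('p3[0,0]=', p3[0, 0])
    p4 = sess.run(f4, feed_dict=feed)
    print('p4=', p4.shape)
    print('p4[0,0]', p4[0, 0])
    p5, p6 = sess.run((f5, f6), feed_dict=feed)
    # For each of the 5 test images, the k nearest training images
    print('p5=', p5.shape)  # (5, 20)
    print('p6=', p6.shape)  # (5, 20)
    print('p5', p5[0])
    # The distances and indices are now known, but not which digits they
    # describe, so the labels of these k nearest points must be looked up next.
    print('p6', p6[0])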
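The session above stops at the neighbour indices. Step 5 of the summary, comparing the predicted labels with the actual ones, can be sketched in NumPy as follows. The `accuracy` helper and the sample values are hypothetical. The sketch assumes the predictions are an array of digits, as evaluating f9 would return, and that the true labels are one-hot rows like `testlabel`.

```python
import numpy as np

def accuracy(predicted_digits, onehot_labels):
    """Fraction of predictions equal to the digit encoded by each one-hot row."""
    true_digits = np.argmax(onehot_labels, axis=1)
    return float(np.mean(predicted_digits == true_digits))

# Made-up values: 5 predicted digits and 5 one-hot labels.
predicted = np.array([7, 2, 1, 0, 4])
labels = np.eye(10)[[7, 2, 1, 0, 9]]
score = accuracy(predicted, labels)   # 4 of the 5 predictions match
```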
