手寫數字識別KNN 記錄

2021-08-30 19:50:03 字數 3656 閱讀 7467

pycharm中引入tensorflow庫

1. expand_dims()函式作用

2. tf.reduce_sum()函式作用

3. tf.argmax()函式作用

4. 附上**

# 1 load data    隨機數載入

# 2 knn test train distance

# 3 knn 中k個最近的from 500 tarindata pictures according to 計算出來的distance

# 4 parse content解析k個最近中的標籤數字內容

# 5 數字<=label

# 6 識別正確率統計

import tensorflow as tf

import numpy as np

import random

# is used to read data

from tensorflow.examples.tutorials.mnist import input_data

# load data 1.filename 2.one_hot 1 0000000

mnist = input_data.read_data_sets('c:\\users\\administrator\\desktop\\mnist', one_hot=true)

trainnum = 55000

testnum = 10000

trainsize = 500

testsize = 5

k = 4

# data分解 replace=false 為不可重複,這樣隨機取train、test樣本

trainindex = np.random.choice(trainnum, trainsize, replace=false)

testindex = np.random.choice(testnum, testsize, replace=false)

traindata = mnist.train.images[trainindex]

trainlabel = mnist.train.labels[trainindex]

testdata = mnist.test.images[testindex]

testlabel = mnist.test.labels[testindex]

print('traindata.shape=', traindata.shape)

print('testlabel=', testlabel)

# tf input

traindatainput = tf.placeholder(shape=[none, 784], dtype=tf.float32)

trainlabelinput = tf.placeholder(shape=[none, 10], dtype=tf.float32)

testdatainput = tf.placeholder(shape=[none, 784], dtype=tf.float32)

testlabelinput = tf.placeholder(shape=[none, 10], dtype=tf.float32)

# knn distance

# 5 500 784 (3d)=2500*784

f1 = tf.expand_dims(testdatainput, 1) # 維度擴充套件5*784=>5*1*784

f2 = tf.subtract(traindatainput, f1) # f2(5*500*784)=traindatainput(500*784)-f1(5*1*784)

f3 = tf.reduce_sum(tf.abs(f2), reduction_indices=2) # 完成資料累加 f3(5*500)<=f2(5*500*784)第二維累加

# f3:5*500 測試和訓練的差值計算結果

f4 = tf.negative(f3) # 取反

f5, f6 = tf.nn.top_k(f4, k=10) # 選取f4中最大的四個值即f3中最小的4個值

# f6 4個最近的下標

f7 = tf.gather(trainlabelinput, f6)

f8 = tf.reduce_sum(f7, reduction_indices=1)

# tf.argmax取出相似label最集中的label

f9 = tf.argmax(f8, dimension=1)

with tf.session() as sess:

p1 = sess.run(f1, feed_dict=)

print('p1=', p1.shape) # p1=(5,1,784)

p2 = sess.run(f2, feed_dict=)

print('p2=', p2.shape) # p2=(5,500,784)

p3 = sess.run(f3, feed_dict=)

print('p3=', p3.shape)

print('p3[0,0]=', p3[0, 0]) # knn distance

p4 = sess.run(f4, feed_dict=)

print('p4=', p4.shape)

print('p4[0,0]=', p4[0, 0])

p5, p6 = sess.run((f5, f6), feed_dict=)

# p5=(5,4) 每一張測試跟距離最近的4張訓練的畫素差值

# p6=(5,4) 每一張測試距離最近的4張訓練的下標

print('p5=', p5.shape)

print('p6=', p6.shape)

print('p5[0,0]=', p5[0, 0])

print('p6[0]=', p6[0])

p7 = sess.run(f7, feed_dict=)

print('p7=', p7.shape) # p7=(5,4,10)

print('p7=', p7)

p8 = sess.run(f8, feed_dict=)

print('p8=', p8.shape) # p7=(5,4,10)

print('p8=', p8)

p9 = sess.run(f9, feed_dict=)

print('p9=', p9.shape) # p7=(5,4,10)

print('p9=', p9)

p10 = np.argmax(testlabel[0:5], axis=1)

print('p10=', p10)

# 測算識別正確率

j = 0

for i in range(0, 5):

if p10[i] == p9[i]:

j = j + 1

print('ac=', j / 5*100)

KNN手寫數字識別

以歐幾里得距離度量樣本間的相似程度。對於乙個測試樣本首先計算該樣本與每個訓練樣本間的距離,然後按距離值對訓練樣本進行公升序排序,排序後前k個樣本進行投票,即哪個標籤出現的次數多,就將測試樣例劃為該類。程式使用資料 預先將資料處理為,標籤資訊轉化為txt文件。from numpy import imp...

kNN 手寫數字識別

識別手寫的數字0 9,影象為32畫素 32畫素的黑白影象 1.將影象轉換為向量 將32 32的二進位制影象矩陣轉換為1 1024的向量。將影象轉化為向量 defimgvector filename returnvect zeros 1,1024 fr open filename 讀取檔案的前32行 ...

KNN手寫數字識別

import numpy as np import matplotlib pyplot as plt from sklearn.neighbors import kneighborsclassifier 讀取樣本資料,目標 0,1,2,3,4,5,6,7,8,9 feature target for...