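This article is a set of notes compiled after reading the source code.
I still do not fully understand some of the variables (marked with a question mark below), so corrections from more experienced readers are very welcome!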
Configuration parameters
train_input_size (input image size): 416
strides: [8, 16, 32]
Classes: 80
batch_size: 4
train_output_sizes (?):
Computation: train_input_size // self.strides
Values: 52 / 26 / 13
max_bbox_per_scale (maximum number of bboxes per scale): 150
anchor_per_scale (number of anchor boxes per scale): 3
anchors: 3 x 3 = 9 pairs in total (3 scales x 3 anchors), each pair holding 2 values (width, height)
[[1.25, 1.625, 2.0, 3.75, 4.125, 2.875],
[1.875, 3.8125, 3.875, 2.8125, 3.6875, 7.4375],
[3.625, 2.8125, 4.875, 6.1875, 11.65625, 10.1875]]
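To make the relationship between these settings concrete, here is a minimal NumPy sketch that recomputes train_output_sizes and groups the anchors into (width, height) pairs. The variable names mirror the notes; the reshape to (3, 3, 2) is an assumption about how the 9 pairs are grouped, not something confirmed from the source.

```python
import numpy as np

train_input_size = 416
strides = np.array([8, 16, 32])
anchor_per_scale = 3

# Integer division gives the side length of the output grid at each scale.
train_output_sizes = train_input_size // strides

anchors = np.array([
    [1.25, 1.625, 2.0, 3.75, 4.125, 2.875],
    [1.875, 3.8125, 3.875, 2.8125, 3.6875, 7.4375],
    [3.625, 2.8125, 4.875, 6.1875, 11.65625, 10.1875],
])
# Assumed grouping: (scale, anchor, [w, h]).
anchors = anchors.reshape(3, anchor_per_scale, 2)

print(train_output_sizes)  # [52 26 13]
print(anchors.shape)       # (3, 3, 2)
```

Read this way, train_output_sizes looks like the grid size of each detection scale, which matches the label shapes listed further down.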
Input data
Data for each batch (batch_image):
Shape: (4, 416, 416, 3)
(batch_size, train_input_size, train_input_size, 3)
Output labels:
batch_label_sbbox (labels for small detection boxes):
Shape: (4, 52, 52, 3, 85)
(batch_size, train_output_sizes[0], train_output_sizes[0], anchor_per_scale, 5 + num_classes)
batch_label_mbbox (labels for medium detection boxes):
Shape: (4, 26, 26, 3, 85)
(batch_size, train_output_sizes[1], train_output_sizes[1], anchor_per_scale, 5 + num_classes)
batch_label_lbbox (labels for large detection boxes):
Shape: (4, 13, 13, 3, 85)
(batch_size, train_output_sizes[2], train_output_sizes[2], anchor_per_scale, 5 + num_classes)
batch_sbboxes / batch_mbboxes / batch_lbboxes:
Shape: (4, 150, 4)
(batch_size, max_bbox_per_scale, 4)
label (per-image labels, one array per scale):
Shape: [(52, 52, 3, 85), (26, 26, 3, 85), (13, 13, 3, 85)]
label = [
(train_output_sizes[0], train_output_sizes[0], anchor_per_scale, 5 + self.num_classes),
(train_output_sizes[1], train_output_sizes[1], anchor_per_scale, 5 + self.num_classes),
(train_output_sizes[2], train_output_sizes[2], anchor_per_scale, 5 + self.num_classes),
]
bboxes_xywh (stores the bbox coordinates of each scale in order):
Shape: [3, 150, 4]
[number of scales, max_bbox_per_scale, 4 (xywh)]
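Label for each bbox:
Example: bbox = [33, 294, 55, 316, 6] (xmin, ymin, xmax, ymax, class index)
The class label is encoded with smooth_onehot
bbox_xywh (the bbox converted to xywh, where xy is the centre point):
Values: 44, 305, 22, 22
bbox_xywh_scaled (bbox_xywh divided by each stride):
Formula: bbox_xywh / strides (true division, not //, since the values below are fractional)
Values: [[5.5, 38.125, 2.75, 2.75], [2.75, 19.0625, 1.375, 1.375], [1.375, 9.53125, 0.6875, 0.6875]]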
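The conversion above can be reproduced with a few lines of NumPy. This is a minimal sketch using the example bbox from the notes; the smoothing strength delta = 0.01 is an assumed value for illustration, since the notes do not say how smooth_onehot is parameterised.

```python
import numpy as np

strides = np.array([8, 16, 32])
num_classes = 80

bbox = np.array([33, 294, 55, 316, 6])
bbox_coor, bbox_class_ind = bbox[:4], bbox[4]

# Corner format (xmin, ymin, xmax, ymax) -> centre format (x, y, w, h).
bbox_xywh = np.concatenate([
    (bbox_coor[2:] + bbox_coor[:2]) * 0.5,
    bbox_coor[2:] - bbox_coor[:2],
])

# True division by each stride: one row per scale, shape (3, 4).
bbox_xywh_scaled = bbox_xywh[np.newaxis, :] / strides[:, np.newaxis]

# Label smoothing: mix the one-hot vector with a uniform distribution.
delta = 0.01  # assumed value, not taken from the notes
onehot = np.zeros(num_classes)
onehot[bbox_class_ind] = 1.0
smooth_onehot = onehot * (1 - delta) + delta / num_classes

print(bbox_xywh)         # [ 44. 305.  22.  22.]
print(bbox_xywh_scaled)  # rows for strides 8, 16 and 32
```

Using `/` rather than `//` keeps the fractional part, which is what later locates the centre inside its grid cell.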
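Anchor box labels:
anchors_xywh:
Shape: (3, 4)
(anchor_per_scale, 4)
Values: [[5.5, 38.5, 1.25, 1.625], [5.5, 38.5, 2.0, 3.75], [5.5, 38.5, 4.125, 2.875]]
Computation: xy is the centre of the grid cell that contains the bbox_xywh_scaled centre, i.e. floor(xy) + 0.5 (which is why 38.125 becomes 38.5 above); wh comes from each pair in anchors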
iou_scale (IoU between the scaled bbox and each anchor box):
Pseudocode:
iou_scale = bbox_iou(bbox_xywh_scaled, anchors_xywh)
Values: [0.26859504, 0.5751634, 0.52702703]
iou_mask = iou_scale > 0.3
Values: [False, True, True]
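The IoU values listed above can be checked by hand. The sketch below is a NumPy version of bbox_iou for boxes in (x, y, w, h) centre format; it is written for illustration and is not necessarily identical to the function in the source. It uses the scale-0 (stride 8) row of bbox_xywh_scaled and the first row of anchors.

```python
import numpy as np

def bbox_iou(boxes1, boxes2):
    """IoU of boxes in (x, y, w, h) centre format; broadcasts over leading axes."""
    boxes1 = np.asarray(boxes1, dtype=np.float64)
    boxes2 = np.asarray(boxes2, dtype=np.float64)
    area1 = boxes1[..., 2] * boxes1[..., 3]
    area2 = boxes2[..., 2] * boxes2[..., 3]
    # Convert to corner format (xmin, ymin, xmax, ymax).
    c1 = np.concatenate([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                         boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    c2 = np.concatenate([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                         boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)
    left_up = np.maximum(c1[..., :2], c2[..., :2])
    right_down = np.minimum(c1[..., 2:], c2[..., 2:])
    inter_wh = np.maximum(right_down - left_up, 0.0)
    inter_area = inter_wh[..., 0] * inter_wh[..., 1]
    return inter_area / (area1 + area2 - inter_area)

bbox_xywh_scaled = np.array([[5.5, 38.125, 2.75, 2.75]])  # scale 0, shape (1, 4)
anchors_wh = np.array([[1.25, 1.625], [2.0, 3.75], [4.125, 2.875]])

# Anchors are centred on the grid cell that contains the bbox centre.
anchors_xywh = np.zeros((3, 4))
anchors_xywh[:, 0:2] = np.floor(bbox_xywh_scaled[0, 0:2]) + 0.5
anchors_xywh[:, 2:4] = anchors_wh

iou_scale = bbox_iou(bbox_xywh_scaled, anchors_xywh)
iou_mask = iou_scale > 0.3

print(iou_scale)  # approximately [0.2686 0.5752 0.5270]
print(iou_mask)   # [False  True  True]
```

For the first anchor, the anchor lies entirely inside the bbox, so the IoU is simply 2.03125 / 7.5625, which matches the first value in the notes.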
If at least one entry of iou_mask is True (if all are False, the anchor with the largest iou_scale is used instead):
label[i][yind, xind, iou_mask, data]:
i: index of the scale
yind, xind: the grid cell that the centre point falls in
iou_mask: the data is written only for anchors where this is True
data: 85 values in total = [bbox_xywh, 1, smooth_onehot]
Code:
xind, yind = np.floor(bbox_xywh_scaled[i, 0:2]).astype(np.int32)
label[i][yind, xind, iou_mask, :] = 0
label[i][yind, xind, iou_mask, 0:4] = bbox_xywh
label[i][yind, xind, iou_mask, 4:5] = 1.0
label[i][yind, xind, iou_mask, 5:] = smooth_onehot
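To see what this write actually does, here is a small self-contained run of the same assignments for the example bbox at scale 0. The label buffers and the smooth_onehot vector are rebuilt here with assumed values (smoothing strength 0.01) purely so that the snippet runs on its own.

```python
import numpy as np

num_classes = 80
label = [np.zeros((s, s, 3, 5 + num_classes)) for s in (52, 26, 13)]

i = 0  # scale index (stride 8)
bbox_xywh = np.array([44.0, 305.0, 22.0, 22.0])
bbox_xywh_scaled = np.array([[5.5, 38.125, 2.75, 2.75]])
iou_mask = np.array([False, True, True])

# Assumed smoothed one-hot vector for class 6.
smooth_onehot = np.full(num_classes, 0.01 / num_classes)
smooth_onehot[6] += 0.99

# Grid cell that contains the bbox centre: column 5, row 38.
xind, yind = np.floor(bbox_xywh_scaled[i, 0:2]).astype(np.int32)

# The boolean mask selects anchors 1 and 2; anchor 0 stays all zeros.
label[i][yind, xind, iou_mask, :] = 0
label[i][yind, xind, iou_mask, 0:4] = bbox_xywh
label[i][yind, xind, iou_mask, 4:5] = 1.0
label[i][yind, xind, iou_mask, 5:] = smooth_onehot

print(xind, yind)                 # 5 38
print(label[i][yind, xind, :, 4])  # [0. 1. 1.]
```

Note that the stored coordinates are the unscaled bbox_xywh (in input-image pixels), while the scaled values are only used to pick the cell.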
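label_xbbox is the label for each detection cell (x stands for s, m, or l):
label_sbbox, label_mbbox, label_lbbox = label
Shape, taking label_sbbox as an example: (52, 52, 3, 85)
52, 52 index the grid cells
3 is the number of anchor boxes per cell; the 85 values that follow are filled in only when the bbox matches that anchor, otherwise they are 0
xbboxes simply stores all bboxes in order:
sbboxes, mbboxes, lbboxes = bboxes_xywh
Shape, taking sbboxes as an example: (150, 4)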
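"Stored in order" can be pictured as appending each bbox to the next free row of its scale. The sketch below assumes a per-scale counter (bbox_count and store_bbox are hypothetical names) and wraps around when a scale already holds max_bbox_per_scale boxes; both details are assumptions made for illustration.

```python
import numpy as np

max_bbox_per_scale = 150
bboxes_xywh = [np.zeros((max_bbox_per_scale, 4)) for _ in range(3)]
bbox_count = np.zeros(3, dtype=np.int64)  # hypothetical per-scale counter

def store_bbox(i, bbox_xywh):
    """Write bbox_xywh into the next slot of scale i, wrapping when full."""
    slot = int(bbox_count[i] % max_bbox_per_scale)
    bboxes_xywh[i][slot, :4] = bbox_xywh
    bbox_count[i] += 1

# The example bbox matched anchors at scale 0, so it is stored there.
store_bbox(0, np.array([44.0, 305.0, 22.0, 22.0]))

sbboxes, mbboxes, lbboxes = bboxes_xywh
print(sbboxes.shape)  # (150, 4)
print(sbboxes[0])     # [ 44. 305.  22.  22.]
```

Unused rows stay at zero, so the array always has the fixed shape (150, 4) regardless of how many objects the image contains.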
Labels finally returned:
(batch_smaller_target, batch_medium_target, batch_larger_target)
Shape: (((4, 52, 52, 3, 85), (4, 150, 4)), ((4, 26, 26, 3, 85), (4, 150, 4)), ((4, 13, 13, 3, 85), (4, 150, 4)))
where each target is a pair, e.g. batch_smaller_target = batch_label_sbbox, batch_sbboxes
Shape: ((4, 52, 52, 3, 85), (4, 150, 4))
batch_label_xbbox = [num, *label_xbbox] (num indexes the images within the batch)
Shape: (4, 52, 52, 3, 85)
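Putting the shapes together, the batch targets can be assembled roughly as follows. This is only a sketch: the per-image label and bboxes_xywh are zero-filled stand-ins, and the list and loop names (batch_labels, batch_bboxes, scale) are illustrative rather than taken from the source.

```python
import numpy as np

batch_size, num_classes, max_bbox_per_scale = 4, 80, 150
sizes = (52, 26, 13)

# One label grid and one bbox list per scale for the whole batch.
batch_labels = [np.zeros((batch_size, s, s, 3, 5 + num_classes), dtype=np.float32)
                for s in sizes]
batch_bboxes = [np.zeros((batch_size, max_bbox_per_scale, 4), dtype=np.float32)
                for _ in sizes]

for num in range(batch_size):
    # Stand-ins for the per-image results described above (all zeros here).
    label = [np.zeros((s, s, 3, 5 + num_classes)) for s in sizes]
    bboxes_xywh = [np.zeros((max_bbox_per_scale, 4)) for _ in sizes]
    for scale in range(3):
        batch_labels[scale][num] = label[scale]
        batch_bboxes[scale][num] = bboxes_xywh[scale]

# Each target is a (label grid, bbox list) pair for one scale.
batch_smaller_target, batch_medium_target, batch_larger_target = zip(
    batch_labels, batch_bboxes)

print(batch_smaller_target[0].shape, batch_smaller_target[1].shape)
# (4, 52, 52, 3, 85) (4, 150, 4)
```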
Model output:
Shape: [
(4, 52, 52, 255),
(4, 52, 52, 3, 85),
(4, 26, 26, 255),
(4, 26, 26, 3, 85),
(4, 13, 13, 255),
(4, 13, 13, 3, 85),
]
Each scale contributes 2 outputs: one is conv (?), the other is pred (?)
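The two shapes per scale hold the same number of values, since 255 = 3 x (5 + 80). The sketch below only illustrates that relationship. It assumes that conv is the raw output of the detection head and that it is reshaped to five dimensions before the slicing used in the loss; that reading fits the shapes but is an assumption, not something confirmed here.

```python
import numpy as np

batch_size, grid, anchor_per_scale, num_classes = 4, 52, 3, 80

# Stand-in for the raw 4-D output of one scale (all zeros here).
conv_output = np.zeros(
    (batch_size, grid, grid, anchor_per_scale * (5 + num_classes)),
    dtype=np.float32)

# Split the last axis into (anchor, 5 + num_classes).
conv = conv_output.reshape(
    batch_size, grid, grid, anchor_per_scale, 5 + num_classes)

conv_raw_conf = conv[:, :, :, :, 4:5]  # one confidence value per anchor
conv_raw_prob = conv[:, :, :, :, 5:]   # 80 class scores per anchor

print(conv.shape)           # (4, 52, 52, 3, 85)
print(conv_raw_conf.shape)  # (4, 52, 52, 3, 1)
print(conv_raw_prob.shape)  # (4, 52, 52, 3, 80)
```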
Loss computation:
conv_raw_conf = conv[:, :, :, :, 4:5]  raw confidence
conv_raw_prob = conv[:, :, :, :, 5:]  raw class probabilities
label_prob = label[:, :, :, :, 5:]  ground-truth class probabilities
pred_xywh = pred[:, :, :, :, 0:4]  predicted box xywh
pred_conf = pred[:, :, :, :, 4:5]  predicted confidence
label_xywh = label[:, :, :, :, 0:4]  ground-truth box xywh
respond_bbox = label[:, :, :, :, 4:5]  ground-truth confidence (whether the grid cell contains an object)
GIoU loss:
Compute giou
giou = bbox_giou(pred_xywh, label_xywh)
Compute the weight of the GIoU term
bbox_loss_scale = 2 - (ground-truth w * h) / (input image area)
bbox_loss_scale = 2.0 - 1.0 * label_xywh[:, :, :, :, 2:3] * label_xywh[:, :, :, :, 3:4] / (input_size ** 2)
Final giou_loss:
giou_loss = respond_bbox * bbox_loss_scale * (1 - giou)
Sum over each sample, then take the mean over the batch
giou_loss = tf.reduce_mean(tf.reduce_sum(giou_loss, axis=[1,2,3,4]))
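The steps above can be traced on a tiny example. The sketch below uses NumPy as a stand-in for the TensorFlow ops (np.sum and np.mean in place of tf.reduce_sum and tf.reduce_mean), and its bbox_giou follows the standard GIoU definition, which may differ in detail from the function in the source. The grid is shrunk to 2 x 2 and the predicted boxes are made-up values.

```python
import numpy as np

def bbox_giou(boxes1, boxes2):
    """GIoU of (x, y, w, h) centre-format boxes; keeps a trailing axis of size 1."""
    b1 = np.concatenate([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                         boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    b2 = np.concatenate([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                         boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)
    area1 = boxes1[..., 2] * boxes1[..., 3]
    area2 = boxes2[..., 2] * boxes2[..., 3]
    inter_wh = np.maximum(
        np.minimum(b1[..., 2:], b2[..., 2:]) - np.maximum(b1[..., :2], b2[..., :2]),
        0.0)
    inter = inter_wh[..., 0] * inter_wh[..., 1]
    union = area1 + area2 - inter
    iou = inter / union
    # Smallest axis-aligned box that encloses both inputs.
    enclose_wh = (np.maximum(b1[..., 2:], b2[..., 2:])
                  - np.minimum(b1[..., :2], b2[..., :2]))
    enclose = enclose_wh[..., 0] * enclose_wh[..., 1]
    giou = iou - (enclose - union) / enclose
    return giou[..., np.newaxis]

input_size = 416
shape = (1, 2, 2, 3)  # (batch, grid, grid, anchors), shrunk for the example

# Made-up predictions: the same box everywhere, one cell overridden below.
pred_xywh = np.tile(np.array([100.0, 100.0, 50.0, 50.0]), shape + (1,))
label_xywh = np.zeros(shape + (4,))
respond_bbox = np.zeros(shape + (1,))

# One object: ground truth 22 x 22, prediction twice as wide, same centre.
label_xywh[0, 0, 0, 1] = [44.0, 305.0, 22.0, 22.0]
pred_xywh[0, 0, 0, 1] = [44.0, 305.0, 44.0, 22.0]
respond_bbox[0, 0, 0, 1] = 1.0

giou = bbox_giou(pred_xywh, label_xywh)
bbox_loss_scale = 2.0 - label_xywh[..., 2:3] * label_xywh[..., 3:4] / (input_size ** 2)
giou_loss = respond_bbox * bbox_loss_scale * (1 - giou)
giou_loss = np.mean(np.sum(giou_loss, axis=(1, 2, 3, 4)))

print(giou[0, 0, 0, 1, 0])  # 0.5
print(giou_loss)            # about 0.9986
```

Only the cell with respond_bbox = 1 contributes, and because small boxes get a bbox_loss_scale close to 2, their GIoU error is weighted almost twice as heavily as that of a box filling the whole image.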