ROI Align: Principles and a Reading of the CUDA Source Code

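For background, see:

"A detailed explanation of the basic principles and implementation details of ROI Align". This article gives an overall understanding of the principle and does not go into the concrete implementation of the algorithm. A quick read is enough.

"A detailed summary of the bilinear interpolation algorithm". This article covers the details of the algorithm. You need to understand how the value at a point (x, y) is computed; focus on its interpolation formula.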
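For reference, bilinear interpolation can be written with the same variable names the code in this post uses. This restatement is the standard form and is an assumption about which formula the referenced article highlights; the symbols l_x, l_y, v_1 to v_4 mirror `lx`, `ly`, `v1` to `v4` in the code.

```latex
\begin{aligned}
l_y &= y - y_{\text{low}}, \qquad l_x = x - x_{\text{low}}, \\
v_1 &= f(y_{\text{low}}, x_{\text{low}}), \quad
v_2 = f(y_{\text{low}}, x_{\text{high}}), \quad
v_3 = f(y_{\text{high}}, x_{\text{low}}), \quad
v_4 = f(y_{\text{high}}, x_{\text{high}}), \\
f(y, x) &\approx (1 - l_y)(1 - l_x)\, v_1 + (1 - l_y)\, l_x\, v_2
          + l_y (1 - l_x)\, v_3 + l_y\, l_x\, v_4 .
\end{aligned}
```

The four weights sum to 1, so the result is a weighted average of the four neighbouring values.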

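The overall flow is written out here, with matching comments at the corresponding places in the code. Work through it carefully and it is fairly easy to follow.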

Overall flow:
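As a rough illustration of that flow, the sketch below computes on the CPU where the sampling points of one output bin land on the feature map. It is not taken from the original post: the function name `roi_bin_sample_points` and the RoI corner layout (x1, y1, x2, y2 in input-image pixels) are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Hypothetical helper (not from the original post): returns the (y, x)
// sampling points, in feature-map coordinates, for output bin (ph, pw).
std::vector<std::pair<float, float>> roi_bin_sample_points(
    float x1, float y1, float x2, float y2, float spatial_scale,
    int pooled_height, int pooled_width, int sampling_ratio,
    int ph, int pw) {
  assert(pooled_height > 0 && pooled_width > 0);

  // Step 1: scale the RoI onto the feature map without rounding.
  const float roi_start_w = x1 * spatial_scale;
  const float roi_start_h = y1 * spatial_scale;
  const float roi_end_w = x2 * spatial_scale;
  const float roi_end_h = y2 * spatial_scale;
  // Degenerate RoIs are forced to be at least 1x1.
  const float roi_width = std::max(roi_end_w - roi_start_w, 1.0f);
  const float roi_height = std::max(roi_end_h - roi_start_h, 1.0f);

  // Step 2: split the RoI into pooled_height x pooled_width bins.
  const float bin_size_h = roi_height / pooled_height;
  const float bin_size_w = roi_width / pooled_width;

  // Step 3: choose how many samples each bin gets along each axis.
  const int grid_h = sampling_ratio > 0
      ? sampling_ratio
      : static_cast<int>(std::ceil(roi_height / pooled_height));
  const int grid_w = sampling_ratio > 0
      ? sampling_ratio
      : static_cast<int>(std::ceil(roi_width / pooled_width));

  // Step 4: place the samples at the centres of the grid cells in the bin.
  std::vector<std::pair<float, float>> points;
  for (int iy = 0; iy < grid_h; ++iy) {
    const float y = roi_start_h + ph * bin_size_h +
                    (iy + 0.5f) * bin_size_h / grid_h;
    for (int ix = 0; ix < grid_w; ++ix) {
      const float x = roi_start_w + pw * bin_size_w +
                      (ix + 0.5f) * bin_size_w / grid_w;
      points.emplace_back(y, x);
    }
  }
  return points;
}
```

For a 64x64 RoI with `spatial_scale = 1/16`, a 2x2 output and `sampling_ratio = 2`, the RoI covers 4x4 feature-map cells, each bin is 2x2, and bin (0, 0) is sampled at (0.5, 0.5), (0.5, 1.5), (1.5, 0.5) and (1.5, 1.5). Each of those points is then bilinearly interpolated and the results are averaged.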

The code

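template <typename T>
__device__ T bilinear_interpolate(const T* bottom_data,
                                  const int height, const int width,
                                  T y, T x,
                                  const int index /* index for debug only */) {
  // Clamp negative coordinates to the top/left edge.
  if (y <= 0) y = 0;
  if (x <= 0) x = 0;

  // 1. Find the four neighbouring grid points (the f(0,0), f(0,1), f(1,0),
  //    f(1,1) of the formula). The low corner is the floor of (y, x); the
  //    high corner is, in principle, the low corner plus 1.
  int y_low = (int) y;  // round down
  int x_low = (int) x;
  int y_high;
  int x_high;

  if (y_low >= height - 1) {
    // On the last row there is no row below, so both rows collapse onto it.
    y_high = y_low = height - 1;
    y = (T) y_low;
  } else {
    y_high = y_low + 1;
  }
  if (x_low >= width - 1) {
    x_high = x_low = width - 1;
    x = (T) x_low;
  } else {
    x_high = x_low + 1;
  }

  // 2. Apply the bilinear interpolation formula to obtain f(x, y).
  T ly = y - y_low;
  T lx = x - x_low;
  T hy = 1. - ly, hx = 1. - lx;
  // do bilinear interpolation
  T v1 = bottom_data[y_low * width + x_low];  // fetch the values at the 4 points
  T v2 = bottom_data[y_low * width + x_high];
  T v3 = bottom_data[y_high * width + x_low];
  T v4 = bottom_data[y_high * width + x_high];
  T w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;

  T val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);
  return val;
}

template <typename T>
__global__ void RoIAlignForward(const int nthreads, const T* bottom_data,
                                const T spatial_scale, const int channels,
                                const int height, const int width,
                                const int pooled_height, const int pooled_width,
                                const int sampling_ratio,
                                const T* bottom_rois, T* top_data) {
  // Grid-stride loop: each thread handles one or more output elements.
  // The loop body follows the widely used open-source RoIAlign kernel; the
  // RoI layout [batch_index, x1, y1, x2, y2] is an assumption.
  for (int index = blockIdx.x * blockDim.x + threadIdx.x; index < nthreads;
       index += blockDim.x * gridDim.x) {
    // (n, c, ph, pw) is one element of the pooled output.
    int pw = index % pooled_width;
    int ph = (index / pooled_width) % pooled_height;
    int c = (index / pooled_width / pooled_height) % channels;
    int n = index / pooled_width / pooled_height / channels;

    const T* offset_bottom_rois = bottom_rois + n * 5;
    int roi_batch_ind = offset_bottom_rois[0];

    // Scale the RoI onto the feature map without rounding.
    T roi_start_w = offset_bottom_rois[1] * spatial_scale;
    T roi_start_h = offset_bottom_rois[2] * spatial_scale;
    T roi_end_w = offset_bottom_rois[3] * spatial_scale;
    T roi_end_h = offset_bottom_rois[4] * spatial_scale;

    // Force malformed RoIs to be at least 1x1.
    T roi_width = max(roi_end_w - roi_start_w, (T)1.);
    T roi_height = max(roi_end_h - roi_start_h, (T)1.);
    T bin_size_h = roi_height / static_cast<T>(pooled_height);
    T bin_size_w = roi_width / static_cast<T>(pooled_width);

    const T* offset_bottom_data =
        bottom_data + (roi_batch_ind * channels + c) * height * width;

    // Number of sampling points per bin along each axis.
    int roi_bin_grid_h = (sampling_ratio > 0)
        ? sampling_ratio : ceil(roi_height / pooled_height);
    int roi_bin_grid_w = (sampling_ratio > 0)
        ? sampling_ratio : ceil(roi_width / pooled_width);

    // Average the bilinearly interpolated samples inside the bin.
    const T count = roi_bin_grid_h * roi_bin_grid_w;
    T output_val = 0.;
    for (int iy = 0; iy < roi_bin_grid_h; iy++) {
      const T y = roi_start_h + ph * bin_size_h +
          static_cast<T>(iy + .5f) * bin_size_h / static_cast<T>(roi_bin_grid_h);
      for (int ix = 0; ix < roi_bin_grid_w; ix++) {
        const T x = roi_start_w + pw * bin_size_w +
            static_cast<T>(ix + .5f) * bin_size_w / static_cast<T>(roi_bin_grid_w);
        output_val += bilinear_interpolate(offset_bottom_data, height, width,
                                           y, x, index);
      }
    }
    output_val /= count;
    top_data[index] = output_val;
  }
}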
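To check the interpolation arithmetic by hand without a GPU, the device function can be mirrored on the CPU. The sketch below is an illustration rather than part of the original code: the name `bilinear_interpolate_cpu` and the use of `std::vector<float>` for a single-channel, row-major feature map are assumptions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical CPU mirror of the device function above, for one channel
// stored row-major in `data` (height * width values).
float bilinear_interpolate_cpu(const std::vector<float>& data,
                               int height, int width, float y, float x) {
  assert(static_cast<int>(data.size()) == height * width);

  // Clamp negative coordinates to the top/left edge.
  if (y <= 0) y = 0;
  if (x <= 0) x = 0;

  // Low corner: floor of (y, x). High corner: low + 1, unless that would
  // leave the map, in which case both collapse onto the last row/column.
  int y_low = static_cast<int>(y);
  int x_low = static_cast<int>(x);
  int y_high, x_high;
  if (y_low >= height - 1) {
    y_high = y_low = height - 1;
    y = static_cast<float>(y_low);
  } else {
    y_high = y_low + 1;
  }
  if (x_low >= width - 1) {
    x_high = x_low = width - 1;
    x = static_cast<float>(x_low);
  } else {
    x_high = x_low + 1;
  }

  // Fractional offsets and their complements give the four weights.
  const float ly = y - y_low, lx = x - x_low;
  const float hy = 1.0f - ly, hx = 1.0f - lx;

  const float v1 = data[y_low * width + x_low];
  const float v2 = data[y_low * width + x_high];
  const float v3 = data[y_high * width + x_low];
  const float v4 = data[y_high * width + x_high];

  return hy * hx * v1 + hy * lx * v2 + ly * hx * v3 + ly * lx * v4;
}
```

On the 2x2 map {1, 2, 3, 4}, the centre point (y, x) = (0.5, 0.5) gives 2.5, the plain average of the four values, and (0.25, 0.75) gives 0.1875*1 + 0.5625*2 + 0.0625*3 + 0.1875*4 = 2.25. Coordinates beyond the last row or column, such as (5, 5), are clamped and return the corner value 4.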
