OpenCV Optical Character Recognition: Text Segmentation Algorithm (Part 11)

I. Preprocessing: threshold the image to eliminate all color information.

Mat binarize(Mat input);

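Otsu's method picks the threshold that maximizes the between-class variance of the foreground and background pixels.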
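To make the criterion concrete, here is a minimal sketch of the Otsu threshold search on plain 8-bit pixel values, written without OpenCV so it stands alone. The function name otsuThreshold is an assumption for illustration only; OpenCV offers the same criterion through cv::threshold with the THRESH_OTSU flag.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Returns the threshold t in [0, 255] that maximizes the between-class
// variance w0 * w1 * (mu0 - mu1)^2, where class 0 holds the pixels <= t.
// (Hypothetical helper, not part of OpenCV.)
int otsuThreshold(const std::vector<std::uint8_t>& pixels)
{
    std::array<double, 256> hist{};
    for (std::uint8_t p : pixels) hist[p] += 1.0;

    const double total = static_cast<double>(pixels.size());
    double sumAll = 0.0;
    for (int i = 0; i < 256; ++i) sumAll += i * hist[i];

    double w0 = 0.0, sum0 = 0.0, bestVar = -1.0;
    int bestT = 0;
    for (int t = 0; t < 256; ++t) {
        w0 += hist[t];
        sum0 += t * hist[t];
        if (w0 == 0.0) continue;      // class 0 is still empty
        const double w1 = total - w0;
        if (w1 == 0.0) break;         // class 1 stays empty from here on

        const double mu0 = sum0 / w0;
        const double mu1 = (sumAll - sum0) / w1;
        const double var = w0 * w1 * (mu0 - mu1) * (mu0 - mu1);
        if (var > bestVar) { bestVar = var; bestT = t; }
    }
    return bestT;
}
```

For a clearly bimodal image (dark text on a light page, or the reverse) the returned value falls between the two intensity clusters.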

II. Text segmentation

1. Using connected component analysis: search the image for groups of connected pixels.

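Step one: build connected regions with the dilation morphological operator. Dilation makes the elements of the image thicker.

First create a 3x3 cross-shaped morphological kernel, then dilate 5 times with that kernel so that all the letters are glued together.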
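The effect of repeated dilation with a 3x3 cross can be sketched on a plain 0/1 grid. This is only an illustration under assumed names (Grid, dilateCross and dilateCrossN are hypothetical); with OpenCV the same step would use dilate with a MORPH_CROSS structuring element.

```cpp
#include <vector>

using Grid = std::vector<std::vector<int>>;  // 1 = foreground, 0 = background

// One dilation pass with a 3x3 cross kernel: a pixel becomes foreground if
// it, or any of its 4-connected neighbours, is foreground in the source.
Grid dilateCross(const Grid& src)
{
    const int rows = static_cast<int>(src.size());
    const int cols = rows ? static_cast<int>(src[0].size()) : 0;
    const int dr[] = {0, -1, 1, 0, 0};
    const int dc[] = {0, 0, 0, -1, 1};

    Grid dst(rows, std::vector<int>(cols, 0));
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            for (int k = 0; k < 5; ++k) {
                const int nr = r + dr[k], nc = c + dc[k];
                if (nr >= 0 && nr < rows && nc >= 0 && nc < cols && src[nr][nc])
                    dst[r][c] = 1;
            }
    return dst;
}

// Applies the kernel several times in a row (the text uses 5 iterations).
Grid dilateCrossN(Grid img, int iterations)
{
    for (int i = 0; i < iterations; ++i) img = dilateCross(img);
    return img;
}
```

Each pass grows every foreground pixel by one step up, down, left and right, so after 5 passes gaps of up to about 10 pixels between neighbouring letters are closed.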

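Step two: identify paragraph blocks by running connected component analysis to find the blocks that correspond to paragraphs.

Retrieve only the external contours and use the simple approximation (in findContours, the RETR_EXTERNAL mode and the CHAIN_APPROX_SIMPLE method).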
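As a rough picture of what finding connected blocks means, the sketch below labels connected foreground regions with a flood fill and reports one bounding box per region. It is a stand-in, not OpenCV's contour tracing: Box and componentBoxes are hypothetical names, 4-connectivity is an assumption, and the boxes are axis-aligned rather than rotated.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

struct Box { int top, left, bottom, right; };  // inclusive pixel bounds

// Finds 4-connected regions of non-zero pixels with an iterative flood
// fill and returns one axis-aligned bounding box per region.
std::vector<Box> componentBoxes(std::vector<std::vector<int>> img)
{
    const int rows = static_cast<int>(img.size());
    const int cols = rows ? static_cast<int>(img[0].size()) : 0;
    const int dr[] = {-1, 1, 0, 0};
    const int dc[] = {0, 0, -1, 1};

    std::vector<Box> boxes;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            if (!img[r][c]) continue;

            Box box{r, c, r, c};
            std::vector<std::pair<int, int>> pending;
            pending.emplace_back(r, c);
            img[r][c] = 0;  // the local copy doubles as the visited map

            while (!pending.empty()) {
                const std::pair<int, int> cur = pending.back();
                pending.pop_back();
                box.top = std::min(box.top, cur.first);
                box.bottom = std::max(box.bottom, cur.first);
                box.left = std::min(box.left, cur.second);
                box.right = std::max(box.right, cur.second);

                for (int k = 0; k < 4; ++k) {
                    const int nr = cur.first + dr[k];
                    const int nc = cur.second + dc[k];
                    if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) continue;
                    if (!img[nr][nc]) continue;
                    img[nr][nc] = 0;
                    pending.emplace_back(nr, nc);
                }
            }
            boxes.push_back(box);
        }
    }
    return boxes;
}
```

Run on the dilated image, each returned box would correspond to one glued-together block of text.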

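Step three: determine the minimum rotated bounding rectangle of each contour.

Discard rectangles whose width or height is less than 20 pixels, and also those whose width-to-height ratio is less than 2. The ratio has to take the rotation of the bounding box into account, because width and height swap roles when the box is reported as near-vertical.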
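The filtering rule can be sketched as a small predicate. RotatedBox and looksLikeTextBlock are hypothetical names, and the sketch assumes the angle convention the text relies on (angles between -90 and 0 degrees, with values below -45 meaning width and height are swapped); other OpenCV releases may report a different range.

```cpp
// Minimal stand-in for the size and angle of a rotated rectangle.
struct RotatedBox {
    double width;
    double height;
    double angleDeg;  // assumed to lie in [-90, 0)
};

// Keeps only boxes that are large enough and clearly wider than tall.
bool looksLikeTextBlock(const RotatedBox& box)
{
    // Discard very small boxes.
    if (box.width < 20.0 || box.height < 20.0) return false;

    // For a near-vertical box the reported width and height are swapped,
    // so the proportion is computed the other way round.
    const double proportion = box.angleDeg < -45.0
        ? box.height / box.width
        : box.width / box.height;

    // Discard square-ish boxes and boxes taller than they are wide.
    return proportion >= 2.0;
}
```

A 100x30 box at -10 degrees passes, while a 40x40 box or a 10x100 sliver does not.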

vector<RotatedRect> findTextAreas(Mat input)
{
    // ...
    return areas;
}

III. Text extraction and skew adjustment

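Mat deskewAndCrop(Mat input, const RotatedRect& box)
{
    double angle = box.angle;
    auto size = box.size;

    // Near-vertical box: add 90 degrees and swap width and height
    if (angle < -45.0)
    {
        angle += 90.0;
        std::swap(size.width, size.height);
    }

    // Rotate the text according to the angle
    auto transform = getRotationMatrix2D(box.center, angle, 1.0);
    Mat rotated;
    warpAffine(input, rotated, transform, input.size(), INTER_CUBIC);

    // Crop the result and add a 10-pixel black border on every side
    Mat cropped;
    getRectSubPix(rotated, size, box.center, cropped);
    copyMakeBorder(cropped, cropped, 10, 10, 10, 10, BORDER_CONSTANT, Scalar(0));
    return cropped;
}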

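When the angle is less than -45 degrees the text is vertical, so the rotation angle is increased by 90 degrees and the width and height are swapped.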
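In isolation, the adjustment looks like this. Deskew and normalizeBox are hypothetical names used only for this sketch, which works on plain numbers instead of a RotatedRect.

```cpp
#include <utility>

// Angle and size to use for the deskewing rotation and the crop.
struct Deskew {
    double angleDeg;
    double width;
    double height;
};

// If the box is reported as near-vertical (angle below -45 degrees), add
// 90 degrees and swap width and height so the text ends up horizontal.
Deskew normalizeBox(double angleDeg, double width, double height)
{
    if (angleDeg < -45.0) {
        angleDeg += 90.0;
        std::swap(width, height);
    }
    return Deskew{angleDeg, width, height};
}
```

For example, a 30x100 box at -80 degrees becomes a 100x30 box at 10 degrees, while a 100x30 box at -10 degrees is left unchanged.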

// Computes the 2D affine transformation matrix that describes the rotation
Mat getRotationMatrix2D(Point2f center, double angle, double scale);
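For reference, the 2x3 matrix such a call produces can be written out by hand. The sketch below follows the formula in the OpenCV documentation (alpha = scale * cos(angle), beta = scale * sin(angle), angle in degrees); the names Affine, rotationMatrix2D and applyAffine are assumptions made for this illustration.

```cpp
#include <array>
#include <cmath>
#include <utility>

using Affine = std::array<std::array<double, 3>, 2>;  // 2x3 affine matrix

// Builds [ alpha  beta  (1 - alpha) * cx - beta * cy ]
//        [ -beta  alpha beta * cx + (1 - alpha) * cy ]
Affine rotationMatrix2D(double cx, double cy, double angleDeg, double scale)
{
    const double pi = std::acos(-1.0);
    const double rad = angleDeg * pi / 180.0;
    const double alpha = scale * std::cos(rad);
    const double beta = scale * std::sin(rad);

    Affine m{};
    m[0][0] = alpha;  m[0][1] = beta;   m[0][2] = (1.0 - alpha) * cx - beta * cy;
    m[1][0] = -beta;  m[1][1] = alpha;  m[1][2] = beta * cx + (1.0 - alpha) * cy;
    return m;
}

// Maps the point (x, y) through the affine matrix.
std::pair<double, double> applyAffine(const Affine& m, double x, double y)
{
    return std::make_pair(m[0][0] * x + m[0][1] * y + m[0][2],
                          m[1][0] * x + m[1][1] * y + m[1][2]);
}
```

The translation column is chosen so that the rotation center maps to itself, which is why the crop can later be taken around the same box.center.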

// Applies the rotation to the image itself
void warpAffine(InputArray src, OutputArray dst,
                InputArray M, Size dsize,
                int flags = INTER_LINEAR,
                int borderMode = BORDER_CONSTANT,
                const Scalar& borderValue = Scalar());

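flags specifies how the image is interpolated: INTER_CUBIC (bicubic interpolation) gives better quality, while the default is INTER_LINEAR (bilinear interpolation).

borderMode is the border mode, that is, how pixels that fall outside the source image are filled.

borderValue is the border color used with BORDER_CONSTANT; the default Scalar() is black.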

// Crops the rectangular region of the bounding box
void getRectSubPix(InputArray image, Size patchSize,
                   Point2f center, OutputArray patch,
                   int patchType = -1);

// Adds a border around the image
void copyMakeBorder(InputArray src, OutputArray dst,
                    int top, int bottom, int left, int right,
                    int borderType,
                    const Scalar& value = Scalar());

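The classification stage needs a margin around the text, which is why the crop is padded by 10 pixels on every side.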
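What a constant border does can be shown on a plain 2D array. This is a minimal sketch with hypothetical names (Image, padConstant), not the OpenCV implementation.

```cpp
#include <cstddef>
#include <vector>

using Image = std::vector<std::vector<int>>;  // one intensity per pixel

// Returns a copy of src surrounded by a border filled with `value`.
// The border widths are assumed to be non-negative.
Image padConstant(const Image& src, int top, int bottom, int left, int right,
                  int value)
{
    const std::size_t rows = src.size();
    const std::size_t cols = rows ? src[0].size() : 0;
    const std::size_t t = static_cast<std::size_t>(top);
    const std::size_t l = static_cast<std::size_t>(left);

    Image dst(rows + t + static_cast<std::size_t>(bottom),
              std::vector<int>(cols + l + static_cast<std::size_t>(right), value));

    // Copy the original pixels into the interior of the padded image.
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            dst[r + t][c + l] = src[r][c];
    return dst;
}
```

Padding a cropped text block with 10 pixels of value 0 on each side gives a black margin on every edge, matching the arguments 10, 10, 10, 10 and Scalar(0) used in the crop step.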

Putting it all together

int main(int argc, char* argv[])
{
    // ...
    waitKey(0);
    return 0;
}
