Street View Character Encoding Recognition CV Team Learning, Check-in 2

I. Data Reading and Data Augmentation

1. Image reading

The two common options are Pillow (PIL) and OpenCV.

import cv2
img = cv2.imread('cat.jpg')

2. Data augmentation

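Data augmentation is key in this competition: with only simple augmentation, training overfits very easily. Try adding more augmentation methods that suit the data.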

3. Reading data with PyTorch

Dataset wraps a data set and provides index-based access to its samples.

import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

class SVHNDataset(Dataset):
    def __init__(self, img_path, img_label, transform=None):
        self.img_path = img_path
        self.img_label = img_label
        self.transform = transform  # None means no preprocessing

    def __getitem__(self, index):
        img = Image.open(self.img_path[index]).convert('RGB')
        if self.transform is not None:
            img = self.transform(img)
        # np.int was removed from recent NumPy releases; use np.int64 instead
        lbl = np.array(self.img_label[index], dtype=np.int64)
        # Pad with 10 (a value outside the digit range 0-9) up to 5 entries
        lbl = list(lbl) + (5 - len(lbl)) * [10]
        return img, torch.from_numpy(np.array(lbl[:5]))

    def __len__(self):
        return len(self.img_path)
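The label handling in `__getitem__` is easier to see in isolation. The following sketch pulls the same pad-then-truncate logic into a standalone helper; `pad_label` and its parameter names are hypothetical and used only for illustration.

```python
def pad_label(digits, max_len=5, pad_value=10):
    """Pad a list of digits with pad_value, then cut it to max_len entries."""
    # A negative repeat count yields an empty list, so long labels are only truncated.
    padded = list(digits) + (max_len - len(digits)) * [pad_value]
    return padded[:max_len]

print(pad_label([1, 9]))              # [1, 9, 10, 10, 10]
print(pad_label([2, 0, 4, 8, 1, 7]))  # [2, 0, 4, 8, 1]
```

Every sample therefore produces a label of the same length, which is what lets the labels be stacked into one tensor per batch.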

DataLoader wraps a Dataset and provides a way to iterate over it.

import torchvision.transforms as transforms

# train_path and train_label are prepared beforehand
train_loader = torch.utils.data.DataLoader(
    SVHNDataset(train_path, train_label,
                transforms.Compose([
                    transforms.Resize((64, 128)),
                    transforms.RandomCrop((60, 120)),
                    transforms.ColorJitter(0.3, 0.3, 0.2),
                    transforms.RandomRotation(10),
                    transforms.ToTensor(),
                    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
                ])),
    batch_size=40,
    shuffle=True,
    num_workers=1,
)
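To see what iterating a DataLoader yields without needing the competition files, the sketch below swaps in random tensors through `TensorDataset`. The data are made up, but the shapes mirror the pipeline in these notes: 3x60x120 images and 5 label slots with values 0 to 10. It is only meant to show the batch shapes, not the real data.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Random stand-in data: 100 "images" and 100 fixed-length labels.
images = torch.randn(100, 3, 60, 120)
labels = torch.randint(0, 11, (100, 5))

loader = DataLoader(
    TensorDataset(images, labels),
    batch_size=40,
    shuffle=True,
    num_workers=0,  # load in the main process to keep the sketch portable
)

# Each iteration returns one batch of images and the matching labels.
batch_shapes = [(tuple(x.shape), tuple(y.shape)) for x, y in loader]
for x_shape, y_shape in batch_shapes:
    print(x_shape, y_shape)
```

With 100 samples and a batch size of 40, the last batch holds the remaining 20 samples unless `drop_last=True` is passed.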

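II. Summary

PyTorch's Dataset and DataLoader classes, both located in torch.utils.data, make it convenient to work with data sets.

On Windows, num_workers may need to be changed to 0. Worker processes are spawned there, so multi-process loading otherwise requires the script's entry point to be guarded by if __name__ == '__main__':.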
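If the same script has to run on several operating systems, the worker count can be picked at run time instead of being edited by hand. This is only a sketch of one possible approach; `pick_num_workers` is a hypothetical helper, not part of PyTorch.

```python
import platform

def pick_num_workers(requested=1):
    """Return 0 on Windows, otherwise the requested worker count."""
    return 0 if platform.system() == "Windows" else requested

num_workers = pick_num_workers(1)
print(num_workers)
```

The resulting value can then be passed as the `num_workers` argument when the DataLoader is built.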
