(Deep Learning) GPU Slower Than CPU? Read This!

GPUs excel at matrix operations, so they are widely used in deep learning, especially in computer vision.

A few days ago, after a lot of effort, I finally got the GPU build of TensorFlow 2.0 installed on my computer, and I couldn't wait to see how fast the GPU was.

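I copied a piece of code straight from the TensorFlow website: the simplest neural network for recognizing handwritten digits from the MNIST dataset. I ran it once on the CPU and once on the GPU, and the result surprised me. I had heard that a GPU is usually several times faster than a CPU, but in my test the GPU was actually much slower!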

After asking around and searching online, I reached this conclusion: the model is too small to show the GPU's advantage.

First, here are the CPU and GPU in my computer:

Hardware    Model
CPU         6th-generation Intel Core i5-6200U processor
GPU         NVIDIA GeForce 940M

Here is the code. Try running it yourself (results may differ on different hardware).

# TensorFlow and tf.keras
import tensorflow as tf
# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
from time import time

# Load MNIST and scale pixel values to [0, 1]
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Run on the CPU
starttime1 = time()
with tf.device('/cpu:0'):
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    # Keep fit/evaluate inside the device scope so they run on this device
    model.fit(x_train, y_train, epochs=10)
    model.evaluate(x_test, y_test)
t1 = time() - starttime1

# Run on the GPU
starttime2 = time()
with tf.device('/gpu:0'):
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=10)
    model.evaluate(x_test, y_test)
t2 = time() - starttime2

# Print the run times in seconds
print('cpu:', t1)
print('gpu:', t2)

Result: the GPU run was much slower than the CPU run. [timing output not preserved]
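
One caveat when reading numbers like these: the script starts the clock before the model is built, so any one-time setup (for example, the first use of the GPU in a process) is counted along with the training itself. Below is a minimal sketch of a timer that runs a warm-up call first and only measures the calls after it. It uses only the standard library; `benchmark` and `work` are hypothetical names that are not part of the original script, and `work` is a stand-in workload rather than a real training run.

```python
from time import perf_counter

def benchmark(work, repeats=3):
    """Call work() once as a warm-up, then time `repeats` further calls."""
    work()  # warm-up: one-time setup cost lands here, not in the results
    times = []
    for _ in range(repeats):
        start = perf_counter()  # monotonic clock intended for interval timing
        work()
        times.append(perf_counter() - start)
    return times

def work():
    """Stand-in workload (hypothetical); replace with the code to measure."""
    return sum(i * i for i in range(100_000))

times = benchmark(work)
print(f"best of {len(times)} runs: {min(times):.4f} s")
```

For a ten-epoch training run, repeating the whole thing may be impractical; timing a single epoch after a short warm-up is one possible compromise.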

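The GPU is slower than the CPU here for roughly these reasons:

Moving data between host memory and the GPU carries a large overhead that the CPU run does not pay, and the GPU's strength, matrix computation, barely shows in a small neural network.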
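
The trade-off can be sketched with a toy cost model: each training step costs a fixed overhead plus the compute work divided by the device's throughput. Every number below is made up for illustration and is not a measurement of the i5-6200U or the 940M; `step_time` and `faster_device` are hypothetical helpers.

```python
# Toy cost model: step time = fixed overhead + work / throughput.
# All figures are illustrative assumptions, not measurements.

def step_time(flops, overhead_s, flops_per_s):
    """Estimated time for one training step under the toy model."""
    return overhead_s + flops / flops_per_s

# Hypothetical devices: the "GPU" has 10x the throughput but pays a
# fixed per-step cost for kernel launches and host-to-device copies.
CPU = {"overhead_s": 0.0, "flops_per_s": 2e10}
GPU = {"overhead_s": 1e-3, "flops_per_s": 2e11}

def faster_device(flops):
    """Return which hypothetical device finishes the step first."""
    cpu = step_time(flops, **CPU)
    gpu = step_time(flops, **GPU)
    return "cpu" if cpu < gpu else "gpu"

# Break-even work: the point where the GPU's overhead equals its compute saving.
break_even = GPU["overhead_s"] / (1 / CPU["flops_per_s"] - 1 / GPU["flops_per_s"])

for flops in (1e6, 1e7, 1e8, 1e9):
    print(f"{flops:.0e} FLOPs per step -> {faster_device(flops)} is faster")
print(f"break-even near {break_even:.2e} FLOPs per step")
```

Under these assumed numbers the CPU wins for small steps and the GPU wins once the work per step passes the break-even point, which is the pattern described above.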

To test this, here is the same script with a larger model: two hidden layers of 1000 units each instead of one layer of 128.

# TensorFlow and tf.keras
import tensorflow as tf
# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
from time import time

# Load MNIST and scale pixel values to [0, 1]
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Run on the CPU
starttime1 = time()
with tf.device('/cpu:0'):
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(1000, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(1000, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    # Keep fit/evaluate inside the device scope so they run on this device
    model.fit(x_train, y_train, epochs=10)
    model.evaluate(x_test, y_test)
t1 = time() - starttime1

# Run on the GPU
starttime2 = time()
with tf.device('/gpu:0'):
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(1000, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(1000, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=10)
    model.evaluate(x_test, y_test)
t2 = time() - starttime2

# Print the run times in seconds
print('cpu:', t1)
print('gpu:', t2)

Result: [timing output not preserved]
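
For a sense of how much larger the second network is, the parameter counts of both models can be worked out by hand: a dense layer with n_in inputs and n_out outputs has n_in * n_out weights plus n_out biases, while the Flatten and Dropout layers have no parameters. A short sketch of that arithmetic follows; `dense_params` and `mlp_params` are hypothetical helper names, not Keras functions.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

def mlp_params(layer_sizes):
    """Total parameters of a stack of dense layers, given the layer widths."""
    return sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))

# 28 x 28 input images flatten to 784 values
small = mlp_params([784, 128, 10])         # first script
large = mlp_params([784, 1000, 1000, 10])  # second script

print(small)                     # 101770
print(large)                     # 1796010
print(round(large / small, 1))   # 17.6
```

The second model has roughly 17.6 times as many parameters, so each training step involves far larger matrix multiplications.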

That's all. I hope this helps!
