A Thorough Look at NumPy Data Types

In NumPy, many array-creation functions use the float64 data type by default:

>>> a = np.ones((3, 3))
>>> a.dtype
dtype('float64')

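However, when an array is built from a list, the dtype is inferred from the list's contents (the default integer type is platform-dependent, either int32 or int64):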

>>> a = np.array([1, 2, 3])
>>> a.dtype
dtype('int32')
>>> a = np.array([1., 2., 3.])
>>> a.dtype
dtype('float64')
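
As a rough sketch of the inference rules above (the variable names are only illustrative), the snippet below compares the inferred dtype against NumPy's default integer type instead of hard-coding int32 or int64, since that default depends on the platform and NumPy version:

```python
import numpy as np

# The inferred integer dtype is whatever NumPy uses for a Python int here.
ints = np.array([1, 2, 3])
default_int = np.dtype(int)
print(ints.dtype == default_int)

# A single float in the list is enough to make the whole array float64.
mixed = np.array([1, 2.0, 3])
print(mixed.dtype)

# Passing dtype explicitly overrides the inference.
small = np.array([1, 2, 3], dtype=np.int16)
print(small.dtype, small.itemsize)
```

Passing dtype explicitly is usually the safer choice when the element size matters, for example when writing binary files.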

An existing array is also upcast automatically when an operation requires it:

>>> a = np.array([1, 2, 3])
>>> a = a + 1.5
>>> a.dtype
dtype('float64')
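
To check in advance which dtype an operation would produce, np.result_type can be used. The following is a minimal sketch with made-up operands, not the only way to inspect promotion:

```python
import numpy as np

a = np.array([1, 2, 3], dtype=np.int32)
b = np.array([0.5, 0.5, 0.5], dtype=np.float32)

# np.result_type reports the dtype that combining the operands produces.
# float32 cannot represent every int32 exactly, so the result is float64.
promoted = np.result_type(a, b)
total = a + b

print(promoted)      # float64
print(total.dtype)   # float64
```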

Assignment, however, never changes an array's dtype:

>>> a = np.array([1, 2, 3])
>>> a.dtype
dtype('int64')
>>> a[0] = 1.9     # <-- the float 1.9 is truncated to fit the int64 dtype
>>> a
array([1, 2, 3])
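
If the fractional part has to be kept, one option is to convert the array to a floating-point dtype before assigning. A small sketch of both behaviours (the names a and b are just for illustration):

```python
import numpy as np

a = np.array([1, 2, 3])
a[0] = 1.9            # stored as 1; the array stays an integer array

# astype returns a new float array that can hold 1.9.
b = a.astype(float)
b[0] = 1.9

print(a, a.dtype.kind)   # kind 'i' means signed integer
print(b, b.dtype)
```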

Explicit type casting:

>>> a = np.array([1.7, 1.2, 1.6])
>>> b = a.astype(int)   # <-- explicit cast; truncates toward zero
>>> b.dtype
dtype('int32')

Rounding to the nearest integer (np.around rounds exact halves to the nearest even value):

>>> a = np.array([1.2, 1.5, 1.6, 2.5, 3.5, 4.5])
>>> b = np.around(a)
>>> b.dtype     # the rounded array is still floating-point
dtype('float64')
>>> c = np.around(a).astype(int)
>>> c.dtype
dtype('int32')
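
The difference between rounding and a plain cast is easy to overlook, so here is a short comparison on the same input. The floor(a + 0.5) line is only one common way to get "round half up" for non-negative values and is an addition for illustration:

```python
import numpy as np

a = np.array([1.2, 1.5, 1.6, 2.5, 3.5, 4.5])

rounded = np.around(a)          # halves go to the nearest even value
truncated = a.astype(int)       # the fractional part is simply dropped
half_up = np.floor(a + 0.5)     # "round half up" for non-negative input

print(rounded)     # [1. 2. 2. 2. 4. 4.]
print(truncated)   # [1 1 1 2 3 4]
print(half_up)     # [1. 2. 2. 3. 4. 5.]
```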

Signed integer types:

Data type    Size
int8         8 bits
int16        16 bits
int32        32 bits
int64        64 bits

>>> np.array([1], dtype=int).dtype
dtype('int64')
>>> np.iinfo(np.int32).max, 2**31 - 1
(2147483647, 2147483647)
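
Because each integer type has a fixed width, array arithmetic that exceeds the range wraps around silently. The sketch below uses int8 so the effect is easy to see; widening the dtype first is one possible way to avoid it:

```python
import numpy as np

info = np.iinfo(np.int8)
print(info.min, info.max)                 # -128 127

x = np.array([info.max], dtype=np.int8)
one = np.array([1], dtype=np.int8)

wrapped = x + one                         # overflows and wraps to -128
widened = x.astype(np.int16) + 1          # enough room for 128

print(wrapped, wrapped.dtype)
print(widened, widened.dtype)
```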

Unsigned integer types:

Data type    Size
uint8        8 bits
uint16       16 bits
uint32       32 bits
uint64       64 bits

>>> np.iinfo(np.uint32).max, 2**32 - 1
(4294967295, 4294967295)

Floating-point types:

Data type    Size
float16      16 bits
float32      32 bits
float64      64 bits
float96      96 bits (platform-dependent)
float128     128 bits (platform-dependent)

>>> np.finfo(np.float32).eps
1.1920929e-07
>>> np.finfo(np.float64).eps
2.2204460492503131e-16
>>> np.float32(1e-8) + np.float32(1) == 1
True
>>> np.float64(1e-8) + np.float64(1) == 1
False
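
The comparison above follows directly from the machine epsilon of each type: an increment well below eps disappears when added to 1. A minimal sketch of that reasoning, with variable names chosen only for this example:

```python
import numpy as np

eps32 = np.finfo(np.float32).eps
eps64 = np.finfo(np.float64).eps

# 1e-8 is far below float32's eps (about 1.19e-07), so it is lost.
lost_in_32 = bool(np.float32(1) + np.float32(1e-8) == np.float32(1))
# A full eps step is the smallest increment that still changes 1.0.
kept_in_32 = bool(np.float32(1) + eps32 != np.float32(1))
# float64's eps is about 2.2e-16, so 1e-8 is easily representable.
kept_in_64 = bool(np.float64(1) + np.float64(1e-8) != np.float64(1))

print(lost_in_32, kept_in_32, kept_in_64)   # True True True
```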

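Complex floating-point types:

Data type     Size
complex64     two 32-bit floats
complex128    two 64-bit floats
complex192    two 96-bit floats (platform-dependent)
complex256    two 128-bit floats (platform-dependent)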
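
The "two floats" layout can be checked from the array itself. This is a small illustrative sketch using only complex64 and complex128, which are available on every platform:

```python
import numpy as np

z = np.array([1 + 2j, 3 - 1j], dtype=np.complex64)

print(z.itemsize)        # 8 bytes: two 32-bit floats per element
print(z.real.dtype)      # float32
print(z.imag.dtype)      # float32

z128 = z.astype(np.complex128)
print(z128.itemsize)     # 16 bytes: two 64-bit floats per element
print(z128.real.dtype)   # float64
```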

A structured data type is somewhat like a struct in C/C++. The example below defines one with three fields:

Field name     Data type
sensor_code    4-character string
position       float
value          float

>>> samples = np.zeros((6,), dtype=[('sensor_code', 'S4'),
...                                 ('position', float), ('value', float)])
>>> samples.ndim
1
>>> samples.shape
(6,)
>>> samples.dtype.names
('sensor_code', 'position', 'value')
>>> samples[:] = [('alfa',   1, 0.37), ('beta', 1, 0.11), ('tau', 1,   0.13),
...               ('alfa', 1.5, 0.37), ('alfa', 3, 0.11), ('tau', 1.2, 0.13)]
>>> samples
array([('alfa', 1.0, 0.37), ('beta', 1.0, 0.11), ('tau', 1.0, 0.13),
       ('alfa', 1.5, 0.37), ('alfa', 3.0, 0.11), ('tau', 1.2, 0.13)],
      dtype=[('sensor_code', 'S4'), ('position', '<f8'), ('value', '<f8')])

Fields can be accessed by indexing with their names:

>>> samples['sensor_code']    
array(['alfa', 'beta', 'tau', 'alfa', 'alfa', 'tau'],
      dtype='|S4')
>>> samples['value']
array([ 0.37,  0.11,  0.13,  0.37,  0.11,  0.13])
>>> samples[0]    
('alfa', 1.0, 0.37)
>>> samples[0]['sensor_code'] = 'tau'
>>> samples[0]    
('tau', 1.0, 0.37)

Selecting two fields at once:

>>> samples[['position', 'value']]
array([(1.0, 0.37), (1.0, 0.11), (1.0, 0.13), (1.5, 0.37), (3.0, 0.11),
       (1.2, 0.13)],
      dtype=[('position', '<f8'), ('value', '<f8')])

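Boolean conditions can also be used to filter records (on Python 3 a byte-string field holds bytes, so the comparison value should be b'alfa'):

>>> samples[samples['sensor_code'] == 'alfa']
array([('alfa', 1.5, 0.37), ('alfa', 3.0, 0.11)],
      dtype=[('sensor_code', 'S4'), ('position', '<f8'), ('value', '<f8')])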
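
The following is a self-contained sketch of the same idea written for Python 3, where the byte-string field is filled and compared using bytes literals. The dtype name sample_dtype and the shortened data set are assumptions made for this example, not part of the session above:

```python
import numpy as np

# Same three fields as above: a 4-byte string and two floats.
sample_dtype = np.dtype([('sensor_code', 'S4'),
                         ('position', float),
                         ('value', float)])

samples = np.array([(b'alfa', 1.0, 0.37), (b'beta', 1.0, 0.11),
                    (b'tau', 1.0, 0.13), (b'alfa', 1.5, 0.37)],
                   dtype=sample_dtype)

# Build a boolean mask from one field, then use it to select whole records.
is_alfa = samples['sensor_code'] == b'alfa'
alfa_rows = samples[is_alfa]

print(len(alfa_rows))                # 2
print(alfa_rows['position'])         # positions of the matching records
print(alfa_rows['value'].mean())     # average value over those records
print(samples.dtype.itemsize)        # 20 bytes per record: 4 + 8 + 8
```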

Passing a mask argument when constructing an array marks part of the data as missing or invalid:

>>> x = np.ma.array([1, 2, 3, 4], mask=[0, 1, 0, 1])
>>> x
masked_array(data = [1 -- 3 --],
             mask = [False  True False  True],
       fill_value = 999999)
>>> y = np.ma.array([1, 2, 3, 4], mask=[0, 1, 1, 1])
>>> x + y
masked_array(data = [2 -- -- --],
             mask = [False  True  True  True],
       fill_value = 999999)

Masked arrays also commonly arise as the result of mathematical operations:

>>> np.ma.sqrt([1, -1, 2, -2]) 
masked_array(data = [1.0 -- 1.41421356237... --],
             mask = [False  True False  True],
       fill_value = 1e+20)
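
Once values are masked, reductions skip them. A brief sketch of a few commonly used masked-array methods, reusing the x defined earlier:

```python
import numpy as np

x = np.ma.array([1, 2, 3, 4], mask=[0, 1, 0, 1])

print(x.count())        # 2 unmasked entries
print(x.mean())         # 2.0, the average of 1 and 3 only
print(x.filled(0))      # [1 0 3 0], masked slots replaced by 0
print(x.compressed())   # [1 3], a plain array of the unmasked values
```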
