Python Data Normalization

Definition: rescale the data by a fixed ratio so that it falls within a specific interval.
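As a minimal sketch of this definition, the snippet below linearly maps a list onto a chosen interval. The function name `rescale` and its `low`/`high` parameters are hypothetical and are not part of the original post. Mapping constant input to `low` is just one possible convention.

```python
def rescale(values, low=0.0, high=1.0):
    """Linearly map values onto the interval [low, high]."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        # All values are identical, so there is no spread to scale (assumed convention: return low)
        return [low for _ in values]
    return [low + (v - v_min) * (high - low) / span for v in values]


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(rescale(data))             # scaled into [0, 1]
print(rescale(data, -1.0, 1.0))  # scaled into [-1, 1]
```

With the default arguments this reduces to ordinary min-max scaling.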

Benefits: faster model convergence and higher model accuracy.

Six common normalization methods:
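For reference, the transformations can be written as formulas. This summary is read off from the code in this post rather than taken from a separate source, and the notation is an assumption: \bar{x} is the mean, \sigma is the population standard deviation, and j is the number of digits in the integer part of \max|x|, with a minimum of 1. Note that the "vector" method here divides by the sum of the values, not by a vector norm.

```latex
\begin{aligned}
\text{min-max:} \quad & x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \\
\text{z-score:} \quad & x' = \frac{x - \bar{x}}{\sigma} \\
\text{decimal scaling:} \quad & x' = \frac{x}{10^{j}} \\
\text{mean normalization:} \quad & x' = \frac{x - \bar{x}}{x_{\max} - x_{\min}} \\
\text{vector normalization:} \quad & x'_i = \frac{x_i}{\sum_{k} x_k} \\
\text{exponential (log10):} \quad & x' = \frac{\log_{10} x}{\log_{10} x_{\max}} \\
\text{exponential (softmax):} \quad & x'_i = \frac{e^{x_i}}{\sum_{k} e^{x_k}} \\
\text{exponential (sigmoid):} \quad & x' = \frac{1}{1 + e^{-x}}
\end{aligned}
```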

import math

import numpy as np


class datanorm:
    def __init__(self):
        self.arr = [1, 2, 3, 4, 5, 6, 7, 8, 9]
        self.x_max = max(self.arr)
        self.x_min = min(self.arr)
        self.x_mean = sum(self.arr) / len(self.arr)
        # Population standard deviation; float() keeps the printed values as plain Python floats
        self.x_std = float(np.std(self.arr))

    # Min-max normalization
    def min_max(self):
        arr_ = list()
        for x in self.arr:
            # round(x, 4) keeps 4 decimal places
            arr_.append(round((x - self.x_min) / (self.x_max - self.x_min), 4))
        print("Data after min_max normalization:\n{}".format(arr_))

    # Z-score normalization
    def z_score(self):
        arr_ = list()
        for x in self.arr:
            arr_.append(round((x - self.x_mean) / self.x_std, 4))
        print("Data after z_score normalization:\n{}".format(arr_))

    # Decimal scaling normalization
    def decimascaling(self):
        arr_ = list()
        j = 1
        x_max = max([abs(i) for i in self.arr])  # largest absolute value
        # Count the digits in the integer part of that value
        while x_max / 10 >= 1.0:
            j += 1
            x_max /= 10
        for x in self.arr:
            arr_.append(round(x / math.pow(10, j), 4))
        print("Data after decimascaling normalization:\n{}".format(arr_))

    # Mean normalization
    def mean(self):
        arr_ = list()
        for x in self.arr:
            arr_.append(round((x - self.x_mean) / (self.x_max - self.x_min), 4))
        print("Data after mean normalization:\n{}".format(arr_))

    # Vector normalization (each value divided by the sum of all values)
    def vector(self):
        arr_ = list()
        for x in self.arr:
            arr_.append(round(x / sum(self.arr), 4))
        print("Data after vector normalization:\n{}".format(arr_))

    # Exponential transformation: log10, softmax, and sigmoid variants
    def exponential(self):
        # log10 variant; requires every value to be positive
        arr_1 = list()
        for x in self.arr:
            arr_1.append(round(math.log10(x) / math.log10(self.x_max), 4))
        print("Data after exponential transformation (log10):\n{}".format(arr_1))

        # softmax variant
        arr_2 = list()
        sum_e = sum([math.exp(one) for one in self.arr])
        for x in self.arr:
            arr_2.append(round(math.exp(x) / sum_e, 4))
        print("Data after exponential transformation (softmax):\n{}".format(arr_2))

        # sigmoid variant
        arr_3 = list()
        for x in self.arr:
            arr_3.append(round(1 / (1 + math.exp(-x)), 4))
        print("Data after exponential transformation (sigmoid):\n{}".format(arr_3))


ob = datanorm()
ob.min_max()
ob.z_score()
ob.decimascaling()
ob.mean()
ob.vector()
ob.exponential()

Data after min_max normalization:
[0.0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1.0]
Data after z_score normalization:
[-1.5492, -1.1619, -0.7746, -0.3873, 0.0, 0.3873, 0.7746, 1.1619, 1.5492]
Data after decimascaling normalization:
[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
Data after mean normalization:
[-0.5, -0.375, -0.25, -0.125, 0.0, 0.125, 0.25, 0.375, 0.5]
Data after vector normalization:
[0.0222, 0.0444, 0.0667, 0.0889, 0.1111, 0.1333, 0.1556, 0.1778, 0.2]
Data after exponential transformation (log10):
[0.0, 0.3155, 0.5, 0.6309, 0.7325, 0.8155, 0.8856, 0.9464, 1.0]
Data after exponential transformation (softmax):
[0.0002, 0.0006, 0.0016, 0.0043, 0.0116, 0.0315, 0.0856, 0.2326, 0.6322]
Data after exponential transformation (sigmoid):
[0.7311, 0.8808, 0.9526, 0.982, 0.9933, 0.9975, 0.9991, 0.9997, 0.9999]
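One caveat about the softmax step: calling `math.exp(x)` on raw values raises `OverflowError` once `x` exceeds roughly 709. A common workaround is to subtract the maximum before exponentiating. The sketch below illustrates this under that assumption. The name `stable_softmax` is hypothetical and does not appear in the original class.

```python
import math


def stable_softmax(values):
    """Softmax computed on values shifted by their maximum.

    Subtracting the maximum does not change the result, because the common
    factor exp(-max) cancels between numerator and denominator.
    """
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print([round(p, 4) for p in stable_softmax(data)])

# Large inputs that would overflow a direct math.exp(x) call
print(stable_softmax([1000.0, 1001.0]))
```

For the sample data this gives the same rounded values as the softmax output above, since the shift only changes intermediate magnitudes.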
