Python data standardization

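import numpy as np
from sklearn import preprocessing


def datastandard():
    x = np.array([
        [1., -1., 2.],
        [2., 0., 0.],
        [0., 1., -1.]])
    print('Original data:\n', x)

    # Method 1: scale() standardizes to zero mean and unit variance by default.
    print('method1: standardize to a given mean and variance (default: mean 0, variance 1):')
    print('Using scale(), column-wise standardization')
    x_scaled = preprocessing.scale(x)  # axis=0: each column (feature) is scaled separately
    print('Standardized matrix:\n', x_scaled, end='\n\n')
    print('cur mean:', x_scaled.mean(axis=0), 'cur std:', x_scaled.std(axis=0))
    print('Using scale(), row-wise standardization')
    x_scaled = preprocessing.scale(x, axis=1)  # axis=1: each row (sample) is scaled separately
    print('Standardized matrix:\n', x_scaled, end='\n')
    print('cur mean:', x_scaled.mean(axis=1), 'cur std:', x_scaled.std(axis=1))

    # Method 2: StandardScaler keeps the fitted mean_ and scale_ for later reuse.
    print('\nmethod2: the StandardScaler class, which can store the parameters learned from the training set')
    scaler = preprocessing.StandardScaler().fit(x)
    # scale_ holds the per-column standard deviation, not the variance.
    print('Mean and standard deviation before standardization:', scaler.mean_, scaler.scale_)
    print('Standardized matrix:\n', scaler.transform(x), end='\n\n')

    print('***2. Data normalization, mapped to the interval [min, max]:')
    min_max_scaler = preprocessing.MinMaxScaler(feature_range=(0, 10))
    print(min_max_scaler.fit_transform(x))


if __name__ == '__main__':
    datastandard()

The output is as follows:

Original data:
 [[ 1. -1.  2.]
 [ 2.  0.  0.]
 [ 0.  1. -1.]]
method1: standardize to a given mean and variance (default: mean 0, variance 1):
Using scale(), column-wise standardization
Standardized matrix:
 [[ 0.         -1.22474487  1.33630621]
 [ 1.22474487  0.         -0.26726124]
 [-1.22474487  1.22474487 -1.06904497]]
cur mean: [ 0.  0.  0.] cur std: [ 1.  1.  1.]
Using scale(), row-wise standardization
Standardized matrix:
 [[ 0.26726124 -1.33630621  1.06904497]
 [ 1.41421356 -0.70710678 -0.70710678]
 [ 0.          1.22474487 -1.22474487]]
cur mean: [  1.48029737e-16   7.40148683e-17   0.00000000e+00] cur std: [ 1.  1.  1.]
method2: the StandardScaler class, which can store the parameters learned from the training set
Mean and standard deviation before standardization: [ 1.          0.          0.33333333] [ 0.81649658  0.81649658  1.24721913]
Standardized matrix:
 [[ 0.         -1.22474487  1.33630621]
 [ 1.22474487  0.         -0.26726124]
 [-1.22474487  1.22474487 -1.06904497]]
***2. Data normalization, mapped to the interval [min, max]:
[[  5.           0.          10.        ]
 [ 10.           5.           3.33333333]
 [  0.          10.           0.        ]]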
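
The second method stores the parameters learned from the training set, which matters when the same scaling has to be applied to data seen later. The following is a minimal sketch of that idea, not part of the original post: it fits the scalers on the same matrix, applies them to a new sample, and checks the results against the z-score formula (x - mean) / std and the min-max formula computed by hand with NumPy. The names `x_train` and `x_new`, and the values in `x_new`, are made up for illustration.

```python
import numpy as np
from sklearn import preprocessing

x_train = np.array([
    [1., -1., 2.],
    [2., 0., 0.],
    [0., 1., -1.]])
x_new = np.array([[-1., 1., 0.]])  # hypothetical unseen sample (assumption)

# Fit once on the training data; mean_ and scale_ are stored on the object.
scaler = preprocessing.StandardScaler().fit(x_train)

# Manual z-score per column, using the population standard deviation (ddof=0).
manual_train = (x_train - x_train.mean(axis=0)) / x_train.std(axis=0)
# The new sample is scaled with the TRAINING mean and std, not its own.
manual_new = (x_new - scaler.mean_) / scaler.scale_

scaled_train = scaler.transform(x_train)
scaled_new = scaler.transform(x_new)
print(np.allclose(scaled_train, manual_train))
print(np.allclose(scaled_new, manual_new))

# Manual min-max mapping of each column onto the interval [lo, hi].
lo, hi = 0, 10
col_min = x_train.min(axis=0)
col_max = x_train.max(axis=0)
manual_minmax = (x_train - col_min) / (col_max - col_min) * (hi - lo) + lo

minmax = preprocessing.MinMaxScaler(feature_range=(lo, hi)).fit_transform(x_train)
print(np.allclose(minmax, manual_minmax))
```

All three checks should print True. A new sample can land outside the range seen during fitting, so its scaled values are not guaranteed to have zero mean or to stay inside [min, max].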

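# coding:utf8
'''Extract every line in a file that contains a given string and print it.
'''
file_path = 'e:/gengyanpeng/keyun-bi.sql'
fix_str = 'from'


def print_line(txt, fix_str):
    # Split the text into lines and print the ones that contain fix_str.
    lines = txt.split('\n')
    for line in lines:
        if fix_str in line:
            print(line.strip())


# The file is only read, so plain 'r' mode is enough.
with open(file_path, 'r', encoding='utf8') as f:
    text = f.read()
    print_line(text, fix_str)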
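
The script above reads the whole file into memory before splitting it into lines. For large files, one alternative is to iterate over the file object line by line. The sketch below is only an illustration under that assumption: the function name `iter_matching_lines` and the temporary demo file are hypothetical and do not come from the original post.

```python
import os
import tempfile


def iter_matching_lines(path, needle, encoding='utf8'):
    """Yield each stripped line of the file at `path` that contains `needle`."""
    with open(path, 'r', encoding=encoding) as f:
        for line in f:  # a file object yields one line at a time
            if needle in line:
                yield line.strip()


# Demo on a throwaway file so the sketch runs without the original .sql file.
with tempfile.NamedTemporaryFile('w', suffix='.sql', delete=False,
                                 encoding='utf8') as tmp:
    tmp.write('select a\n  from t1\nwhere a > 0\nselect b from t2\n')
    demo_path = tmp.name

matches = list(iter_matching_lines(demo_path, 'from'))
os.remove(demo_path)  # clean up the temporary file
print(matches)
```

Returning the matches from a generator instead of printing them inside the function also makes the result easier to reuse or test.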

