Data Preprocessing: Normalization
阿新 • Published: 2019-02-15
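Concept:
Normalization uses each feature's minimum and maximum to rescale its values into the interval [new_min, new_max]. Each feature column is scaled with the min-max function: x' = (x - min) / (max - min) * (new_max - new_min) + new_min, where min and max are the smallest and largest values in that column. With new_min = 0 and new_max = 1 this reduces to x' = (x - min) / (max - min).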
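As a minimal sketch of the min-max function for an arbitrary target interval, the snippet below scales a single feature column in plain Python. The function name `min_max_scale`, the sample values, and the handling of a constant column are assumptions made for illustration; they do not come from the original post.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Scale one feature column (a list of numbers) into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: the formula would divide by zero.
        # Mapping everything to new_min is an arbitrary choice for this sketch.
        return [new_min for _ in values]
    span = new_max - new_min
    return [(v - lo) / (hi - lo) * span + new_min for v in values]


column = [10.0, 20.0, 40.0]  # made-up values
print(min_max_scale(column))              # [0.0, 0.3333333333333333, 1.0]
print(min_max_scale(column, -1.0, 1.0))   # same column mapped into [-1, 1]
```

The smallest value always lands on new_min and the largest on new_max; everything else keeps its relative position in between.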
Code example:
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Rewrite of Machine Learning in Action, Chapter 2
def file2matrix(filename):
    # Tab-separated file: the first three columns are features, the fourth is the class label
    data = np.genfromtxt(filename, delimiter="\t")
    returnMat = data[:, 0:3]
    classLabelVector = data[:, 3:4]
    return returnMat, classLabelVector

def autoNorm(dataset):
    # Take the first feature column; the 0:1 slice keeps it 2-D, as the scaler expects
    x = dataset[:, 0:1]
    # Method 1: use scikit-learn's MinMaxScaler
    minMax = MinMaxScaler()
    x_std = minMax.fit_transform(x)
    print(x.min())
    print(x.max())
    print(x[2])
    # Hand check of (x[2] - min) / (max - min) with values hard-coded for this dataset
    print((26052 - 0) / 91273)
    print(x_std[2])
    # Method 2: use a lambda
    a = lambda x: (x - x.min()) / (x.max() - x.min())
    print(a(x)[2])

if __name__ == '__main__':
    returnMat, classLabelVector = file2matrix('F:\\datingTestSet2.txt')
    autoNorm(returnMat)
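One point worth noting: the lambda in method 2 takes the minimum and maximum over the whole array, so it agrees with MinMaxScaler (which scales each feature column separately) only because x is a single column. A rough sketch of the difference on a matrix with several columns, using made-up data and NumPy's axis=0 reductions:

```python
import numpy as np

# Made-up 3x2 feature matrix; the two columns have very different ranges.
features = np.array([[0.0, 1.0],
                     [50.0, 3.0],
                     [100.0, 5.0]])

# Global min/max, as in the lambda: every column shares one range.
global_scaled = (features - features.min()) / (features.max() - features.min())

# Per-column min/max (axis=0): each feature is scaled independently.
col_min = features.min(axis=0)
col_max = features.max(axis=0)
column_scaled = (features - col_min) / (col_max - col_min)

print(global_scaled[:, 1])   # second column stays squeezed near the bottom of [0, 1]
print(column_scaled[:, 1])   # second column now spans the full [0, 1] range
```

When normalizing all three feature columns at once, the per-column form is the one that matches the definition given in the concept section.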
Output:
Dataset sample: