Principal Component Analysis (PCA) in Python
阿新 • Published: 2018-12-10
Import the required tools:
from sklearn.decomposition import PCA
from sklearn.preprocessing import normalize
L2 normalization (optional):
def l2_norm(data):
    # Scale each row (one sample) to unit L2 norm; data is modified in place
    for i in range(data.shape[0]):
        data[i, :] = normalize(data[i, :].reshape((1, -1)), norm='l2')  # L2 normalization
    return data
train_annotated_feature = l2_norm(train_annotated_feature)
train_candidates_feature = l2_norm(train_candidates_feature)
test_annotated_feature = l2_norm(test_annotated_feature)
test_candidates_feature = l2_norm(test_candidates_feature)
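As a quick sanity check on the helper above, here is a minimal sketch on a small made-up array (`toy` and `l2_norm_rows` are illustrative names, not the feature matrices or code from this post). It suggests that every row ends up with unit L2 norm, and that calling `normalize` once on the whole matrix should give the same result as the row-by-row loop, since `normalize` works row-wise by default.

```python
import numpy as np
from sklearn.preprocessing import normalize

# Hypothetical toy data standing in for the real feature matrices
toy = np.array([[3.0, 4.0], [1.0, 0.0], [2.0, 2.0]])

def l2_norm_rows(data):
    """Row-by-row L2 normalization, mirroring the loop above but on a copy."""
    out = data.astype(float).copy()
    for i in range(out.shape[0]):
        out[i, :] = normalize(out[i, :].reshape((1, -1)), norm='l2')
    return out

looped = l2_norm_rows(toy)
vectorized = normalize(toy, norm='l2')  # axis=1 by default: each row is scaled

row_norms = np.linalg.norm(looped, axis=1)
print(row_norms)                        # every entry should be (close to) 1
print(np.allclose(looped, vectorized))  # loop and single call agree
```

If the in-place behavior of the loop is not needed, the single `normalize(data, norm='l2')` call is probably the simpler choice for large matrices.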
Fit the PCA model on train_annotated_feature, then use the fitted model to project the other datasets onto the same principal components:
pca = PCA(n_components=300, copy=True, whiten=False)  # reduce to 300 dimensions
pca.fit(train_annotated_feature)
pca_train_annotated_feature = pca.transform(train_annotated_feature)
pca_train_candidates_feature = pca.transform(train_candidates_feature)
pca_test_annotated_feature = pca.transform(test_annotated_feature)
pca_test_candidates_feature = pca.transform(test_candidates_feature)
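The fit-once, transform-many pattern above can be tried end to end with the following sketch. The random matrices `train` and `test` are assumptions standing in for the real features, and their sizes are arbitrary. Note that `n_components` cannot exceed min(n_samples, n_features) of the data passed to `fit`, so reducing to 300 dimensions needs at least 300 training samples and 300 input features.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Hypothetical stand-ins for the feature matrices: 400 train and 50 test samples, 512 dims
train = rng.normal(size=(400, 512))
test = rng.normal(size=(50, 512))

pca = PCA(n_components=300, copy=True, whiten=False)
pca.fit(train)                    # learn the mean and components from training data only
pca_train = pca.transform(train)
pca_test = pca.transform(test)    # same projection applied to unseen data

print(pca_train.shape, pca_test.shape)  # (400, 300) (50, 300)

# Fraction of the training variance kept by the 300 components
retained = pca.explained_variance_ratio_.sum()
print(f"variance retained: {retained:.3f}")
```

Checking `explained_variance_ratio_` is one way to judge whether 300 components is a reasonable choice for a given dataset. The value depends entirely on the data, so the random matrices here say nothing about the real features.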