Kaggle House Price Prediction: Summary of My First Practice Run (First Model)
阿新 • Published: 2019-01-23
This Kaggle house price prediction exercise follows DanB's tutorial.
Link: https://www.kaggle.com/learn/machine-learning
These are the inspection and output statements I used along the way:
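#print(original_data.isnull().sum())  # count the NA values in each column
#print(original_data.describe())  # show summary statistics
#print(original_data.columns)  # show the column names
#print(original_data.isnull())  # boolean mask showing whether each value is NA
#data_without_missing_values = original_data.dropna(axis=1)  # drop the columns that contain NA
#print(melbourne_price_data.head())  # shows the first five rows by default
#print(mean_absolute_error(y, predicted_house_prices))  # mean absolute error between actual and predicted prices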
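To see what these calls return, here is a minimal sketch on a tiny made-up table. The name toy_data and its values are assumptions for illustration only; they stand in for original_data and are not taken from the Kaggle files.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for original_data (not the Kaggle file).
toy_data = pd.DataFrame({
    "LotArea": [8450, 9600, np.nan, 9550],
    "YearBuilt": [2003, 1976, 2001, 1915],
    "SalePrice": [208500, 181500, 223500, 140000],
})

# Count the missing values in each column.
na_counts = toy_data.isnull().sum()
print(na_counts)

# Summary statistics, column names, and the first rows (head() defaults to five).
print(toy_data.describe())
print(toy_data.columns)
print(toy_data.head())

# dropna(axis=1) drops every column that contains at least one NA,
# so LotArea is removed here while the rows stay untouched.
data_without_missing_values = toy_data.dropna(axis=1)
print(data_without_missing_values.columns.tolist())
```

Note that dropna(axis=1) removes whole columns, not rows, so it can throw away a useful feature because of a single missing value.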
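Below I predict with a random forest model, using five arbitrarily chosen features. A decision tree is used in exactly the same way as the random forest.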
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

train = pd.read_csv('D:/NOTEBOOK/train.csv')  # read the training data
train_y = train.SalePrice  # prediction target
predictor_x = ['LotArea', 'YearBuilt', 'OverallQual', '1stFlrSF', 'FullBath']  # features
train_x = train[predictor_x]

my_model = RandomForestRegressor()  # random forest model
my_model.fit(train_x, train_y)  # fit on the training data

test = pd.read_csv('D:/NOTEBOOK/test.csv')  # read the test data
test_x = test[predictor_x]
pre_test_y = my_model.predict(test_x)  # predicted sale prices
print(pre_test_y)

# build the submission csv
my_submission = pd.DataFrame({'Id': test.Id, 'SalePrice': pre_test_y})
my_submission.to_csv('submission2.csv', index=False)
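Since the decision tree shares the same fit/predict interface, the two models can be swapped in a loop and compared with mean_absolute_error on a held-out split. The sketch below is only an illustration: the demo table is synthetic (random values, a made-up price formula), and the split, n_estimators and random_state settings are my own assumptions, not part of the original run.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for train.csv (hypothetical values, not Kaggle data).
rng = np.random.default_rng(0)
n_rows = 200
demo = pd.DataFrame({
    "LotArea": rng.integers(2000, 20000, n_rows),
    "YearBuilt": rng.integers(1900, 2010, n_rows),
    "OverallQual": rng.integers(1, 11, n_rows),
})
# Made-up price formula plus noise, just so the models have something to learn.
demo["SalePrice"] = (
    20000 * demo["OverallQual"] + 5 * demo["LotArea"] + rng.normal(0, 10000, n_rows)
)

features = ["LotArea", "YearBuilt", "OverallQual"]
# Hold out part of the data (25% by default) to measure error on unseen rows.
train_x, val_x, train_y, val_y = train_test_split(
    demo[features], demo["SalePrice"], random_state=0
)

models = [
    ("decision tree", DecisionTreeRegressor(random_state=0)),
    ("random forest", RandomForestRegressor(n_estimators=50, random_state=0)),
]

mae_by_model = {}
for name, model in models:
    model.fit(train_x, train_y)          # same call for both models
    predictions = model.predict(val_x)   # same call for both models
    mae_by_model[name] = mean_absolute_error(val_y, predictions)
    print(name, mae_by_model[name])
```

Measuring the error on held-out rows rather than on the training rows gives a fairer picture of how a model might do on test.csv.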