Saving and Loading Preprocessed Data
阿新 • Published: 2018-11-20
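In machine learning, data usually has to be preprocessed before it can be used. Models typically need repeated hyperparameter tuning, so the preprocessed data may be needed many times. Re-running the preprocessing each time is redundant; instead, we can save the result to disk once and reload it whenever it is needed.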
# Use the pickle module to save the preprocessed data in pickle format
# so it can be reloaded later, i.e. create a checkpoint.
# X_train, X_test, y_train and y_test are the outputs of the preprocessing step.
import os
import pickle

pickle_file = 'notMNIST.pickle'
if not os.path.isfile(pickle_file):  # save only if the file does not exist yet
    print('Saving data to pickle file...')
    try:
        # Write to the same path that was checked above
        with open(pickle_file, 'wb') as pfile:
            pickle.dump(
                {
                    'X_train': X_train,
                    'X_test': X_test,
                    'y_train': y_train,
                    'y_test': y_test,
                },
                pfile, pickle.HIGHEST_PROTOCOL)
    except Exception as e:
        print('Unable to save data to', pickle_file, ':', e)
        raise
print('Data cached in pickle file.')
# Read the data back from the pickle file
import pickle

pickle_file = 'notMNIST.pickle'  # must be the same path the data was saved to
with open(pickle_file, 'rb') as f:
    pickle_data = pickle.load(f)  # deserialize; the inverse of pickle.dump
    X_train = pickle_data['X_train']
    X_test = pickle_data['X_test']
    y_train = pickle_data['y_train']
    y_test = pickle_data['y_test']
    del pickle_data  # drop the extra reference to the loaded dict
print('Data and modules loaded.')
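The save and load steps can also be folded into a single helper, so that calling code does not need to care whether the cache already exists. The following is only a minimal sketch under some assumptions: `load_or_build`, `fake_preprocess`, and the `demo.pickle` file name are hypothetical names introduced for illustration, and the tiny lists stand in for real preprocessed arrays.

```python
import os
import pickle
import tempfile

calls = []  # records how many times the (hypothetical) preprocessing ran


def load_or_build(cache_path, build_fn):
    """Return the data cached at cache_path, building and caching it first if needed."""
    if os.path.isfile(cache_path):
        with open(cache_path, 'rb') as f:
            return pickle.load(f)
    data = build_fn()
    with open(cache_path, 'wb') as f:
        pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)
    return data


def fake_preprocess():
    # Hypothetical stand-in for a real, slow preprocessing step.
    calls.append(1)
    return {
        'X_train': [[0.0, 1.0], [1.0, 0.0]],
        'X_test': [[0.5, 0.5]],
        'y_train': [0, 1],
        'y_test': [1],
    }


# Use a temporary directory so the demo does not touch existing files.
cache_path = os.path.join(tempfile.mkdtemp(), 'demo.pickle')
first = load_or_build(cache_path, fake_preprocess)   # runs preprocessing, writes the file
second = load_or_build(cache_path, fake_preprocess)  # reads the file, skips preprocessing
```

Note that `pickle.load` can execute arbitrary code while deserializing, so a cache file like this should only be loaded if it comes from a trusted source. If the preprocessing logic changes, the cached file has to be deleted by hand, since this sketch only checks whether the file exists.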