Comparing the speed of reading files with pandas and with open
阿新 • Published: 2021-01-23
Tags: python
Reading the file with pandas

import os
import time
import pandas as pd

# train_file_x (directory of CSV files) and data (column label) are assumed to be defined earlier
for root, dirs, files in os.walk(train_file_x):
    starts = time.time()
    # Use the first file as an example
    for file in files[:1]:
        print(file)
        file_paths = os.path.join(root, file)
        print(file_paths)
        # Read the whole CSV, then keep only the requested column
        df_y = pd.read_csv(file_paths, engine='python')[data]
        print(df_y)
    end = time.time()
    print(end - starts)  # elapsed seconds

Reading the file with Python's open

line = int(data)  # index of the column to extract
# There are many x files
for root, dirs, files in os.walk(train_file_x):
    # Use the first file as an example
    app_data = pd.DataFrame()
    starts = time.time()
    for file in files[:1]:
        print(file)
        file_paths = os.path.join(root, file)
        print(file_paths)
        with open(file_paths, 'r') as f:
            next(f, None)  # skip the header row, which read_csv treats as column names
            # Take field number `line` from every remaining row
            list2 = [row.rstrip('\n').split(',')[line] for row in f]
        app_data[file] = list2
        print(app_data)
    end = time.time()
    print(end - starts)  # elapsed seconds

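A note on how the manual read behaves: a file object is a single-pass iterator, so any row consumed by `next()` or by an outer `for` loop is not seen again by a later loop or comprehension over the same object. The sketch below illustrates this with `io.StringIO` standing in for an open file; the sample contents and the names `sample`, `header`, and `values` are made up for illustration.

```python
import io

# Hypothetical CSV contents standing in for one of the real files.
sample = io.StringIO("a,b,c\n1,2,3\n4,5,6\n")

header = next(sample)  # consumes the first row only
# The comprehension continues from where the iterator stopped,
# so it only sees the rows after the header.
values = [row.rstrip("\n").split(",")[1] for row in sample]

print(header.strip())  # a,b,c
print(values)          # ['2', '5']
```
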
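Comparison result:
Reading the same file, Python's open was about 5 times faster than pandas.
So one way to improve efficiency is to read the CSV file with Python's open and then convert the result to a DataFrame, if a DataFrame is needed afterwards.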
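
The suggestion above might be sketched as follows. This is a minimal illustration, not the code behind the measurement: `read_column`, `column_to_frame`, and `timed` are hypothetical helpers, and the sketch assumes a simple CSV with a header row and no quoted commas (a plain `split(',')` would break on those).

```python
import time


def read_column(path, index, skip_header=True):
    """Return one column of a simple CSV file as a list of strings."""
    with open(path, "r") as f:
        if skip_header:
            next(f, None)  # drop the header row, if any
        return [row.rstrip("\n").split(",")[index] for row in f]


def column_to_frame(name, values):
    """Wrap the values in a one-column DataFrame (requires pandas)."""
    import pandas as pd  # imported here so read_column works without pandas

    return pd.DataFrame({name: values})


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start
```

For example, `values, seconds = timed(read_column, file_paths, line)` followed by `app_data = column_to_frame(file, values)` mirrors the loop body shown earlier. The values stay as strings, whereas `read_csv` infers dtypes, so a conversion such as `pd.to_numeric` may be needed afterwards. Also note that the pandas call in this post passes `engine='python'`; the pandas documentation describes the default C engine as faster, so the gap may differ with the default engine.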