pandas 21: Reading CSV files with read_csv (12. Iteration and chunking) (detailed, tcy)
阿新 • Published: 2018-12-29
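Examples: iteration (2018/12/26)
# To walk through a large file without loading it all into memory, pass chunksize and read the text file chunk by chunk.
# With chunksize set, read_csv or read_table returns an iterable TextFileReader object instead of a DataFrame.
# Passing iterator=True also returns a TextFileReader object.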
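The return types described above can be checked directly. A minimal sketch, assuming a tiny made-up CSV (csv_text is not from the original post) and that TextFileReader is importable from pandas.io.parsers, as it is in the pandas versions I am aware of:

```python
from io import StringIO

import pandas as pd
from pandas.io.parsers import TextFileReader

# Tiny made-up CSV, used only for this illustration.
csv_text = "a,b\n1,2\n3,4\n5,6\n"

# Without chunksize/iterator, read_csv parses everything into one DataFrame.
whole = pd.read_csv(StringIO(csv_text))

# With chunksize or iterator=True, it returns a lazy TextFileReader instead.
by_chunk = pd.read_csv(StringIO(csv_text), chunksize=2)
by_iter = pd.read_csv(StringIO(csv_text), iterator=True)

print(type(whole).__name__)
print(isinstance(by_chunk, TextFileReader), isinstance(by_iter, TextFileReader))

# Each item produced by the reader is an ordinary DataFrame;
# three data rows with chunksize=2 give chunks of 2 rows and 1 row.
chunk_lengths = [len(chunk) for chunk in by_chunk]
print(chunk_lengths)
```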
Contents:
Part 1: Reading and writing CSV text files
pandas read_csv (1. Text read/write overview) https://mp.csdn.net/postedit/85289371
pandas read_csv (2. read_csv parameters) https://mp.csdn.net/postedit/85289928
pandas read_csv (3. Specifying column data types with dtypes) https://mp.csdn.net/postedit/85290575
pandas read_csv (4. Writing text data with to_csv) https://mp.csdn.net/postedit/85290962
pandas read_csv (5. Text read/write examples) https://mp.csdn.net/postedit/85291123
pandas read_csv (6. Naming and using columns) https://mp.csdn.net/postedit/85291430
pandas read_csv (7. Indexes) https://mp.csdn.net/postedit/85291658
pandas read_csv (8. Dialects and delimiters) https://mp.csdn.net/postedit/85291994
pandas read_csv (9. Float conversion and NA values) https://mp.csdn.net/postedit/85292391
pandas read_csv (10. Comments and blank lines) https://mp.csdn.net/postedit/85292609
pandas read_csv (11. Date and time handling) https://mp.csdn.net/postedit/85292925
pandas read_csv (12. Iteration and chunking) https://mp.csdn.net/postedit/85293639
pandas read_csv (13. Reading fixed-width data with read_fwf) https://mp.csdn.net/postedit/85294010
Part 2:
pandas HDF file read/write in brief https://mp.csdn.net/postedit/85294299
pandas Excel read/write in brief https://mp.csdn.net/postedit/85294545
Part 3:
Using the csv module in Python (tcy) https://mp.csdn.net/postedit/85228189
7 fixes for read_csv errors when reading CSV files with pandas https://mp.csdn.net/postedit/85228808
pandas to_string usage https://mp.csdn.net/postedit/85294935
Example 1: Read a fixed number of rows with nrows
from io import StringIO
import pandas as pd
data=' a b c key\n' \
'0 0 1 2 k1\n' \
'1 3 4 5 k1\n' \
'2 6 7 8 k2\n' \
'3 9 10 11 k3\n' \
'4 12 13 14 k3\n' \
'5 15 16 17 k3'
pd.read_csv(StringIO(data), sep=r'\s+', nrows=2, engine='python')  # read only the first 2 data rows
   a  b  c key
0  0  1  2  k1
1  3  4  5  k1
Example 2: Read the file chunk by chunk with chunksize (number of rows per chunk)
chunker = pd.read_csv(StringIO(data), sep=r'\s+', engine='python', chunksize=2)
for i in chunker:
    print(i)
   a  b  c key
0  0  1  2  k1
1  3  4  5  k1
   a   b   c key
2  6   7   8  k2
3  9  10  11  k3
    a   b   c key
4  12  13  14  k3
5  15  16  17  k3
# Example 2.2: get_chunk(n) returns the next n rows, regardless of chunksize
chunker = pd.read_csv(StringIO(data), sep=r'\s+', engine='python', chunksize=2)
chunker.get_chunk(3)
   a  b  c key
0  0  1  2  k1
1  3  4  5  k1
2  6  7  8  k2
chunker.get_chunk(3)
    a   b   c key
3   9  10  11  k3
4  12  13  14  k3
5  15  16  17  k3
chunker.get_chunk(3)  # no rows left: raises StopIteration
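A common reason to read in chunks is to build up a result without ever holding the whole file in memory. A minimal sketch, assuming the same sample data as in the examples; the names key_counts and c_total are made up for this illustration:

```python
from collections import Counter
from io import StringIO

import pandas as pd

# Same sample data as in the examples (whitespace-separated, first column is the index).
data = (
    " a b c key\n"
    "0 0 1 2 k1\n"
    "1 3 4 5 k1\n"
    "2 6 7 8 k2\n"
    "3 9 10 11 k3\n"
    "4 12 13 14 k3\n"
    "5 15 16 17 k3"
)

key_counts = Counter()  # running count of each value in the "key" column
c_total = 0             # running sum of column "c"

chunker = pd.read_csv(StringIO(data), sep=r"\s+", engine="python", chunksize=2)
for chunk in chunker:
    # Only the current 2-row DataFrame is in memory at this point.
    key_counts.update(chunk["key"].tolist())
    c_total += int(chunk["c"].sum())

print(dict(key_counts))
print(c_total)
```

Only the running totals survive between iterations, so memory use depends on chunksize rather than on the size of the file.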
# Example 3: Iterate over the file with iterator=True
reader = pd.read_table(StringIO(data), sep=r'\s+', engine='python', iterator=True)
reader.get_chunk(2)  # fetch the next 2 rows
   a  b  c key
0  0  1  2  k1
1  3  4  5  k1
for i in reader:  # no chunksize was given, so the remaining rows arrive as one chunk
    print(i)
    a   b   c key
2   6   7   8  k2
3   9  10  11  k3
4  12  13  14  k3
5  15  16  17  k3
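Calling get_chunk after the data runs out raises StopIteration, so a loop that pulls fixed-size pieces has to catch it. A sketch under that assumption: drain is a hypothetical helper, not part of pandas, and using the reader as a context manager assumes pandas 1.2 or later.

```python
from io import StringIO

import pandas as pd

# Same sample data as in the examples.
data = (
    " a b c key\n"
    "0 0 1 2 k1\n"
    "1 3 4 5 k1\n"
    "2 6 7 8 k2\n"
    "3 9 10 11 k3\n"
    "4 12 13 14 k3\n"
    "5 15 16 17 k3"
)


def drain(reader, size):
    """Hypothetical helper: yield DataFrames of up to `size` rows until the reader is exhausted."""
    while True:
        try:
            yield reader.get_chunk(size)
        except StopIteration:
            return  # nothing left to read


# Assumption: pandas >= 1.2, where TextFileReader supports the `with` statement.
with pd.read_csv(StringIO(data), sep=r"\s+", engine="python", iterator=True) as reader:
    pieces = list(drain(reader, 4))

sizes = [len(piece) for piece in pieces]
first_index = pieces[0].index.tolist()
print(sizes)
print(first_index)
```

With six data rows and a size of 4, the last piece is simply shorter than the others.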