Processing UK Rainfall CSV Data with Pandas
阿新 • Published: 2019-01-13
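Importing the data
UK rainfall data: http://data.defra.gov.uk/statistics_2015/env/water/uk_rain_2014.csv
import pandas as pd
# Import the data
# The first line of uk_rain_2014.csv holds the labels, so it can serve as the column index
df = pd.read_csv('C:\\Users\\cc\\Desktop\\Pandas\\uk_rain_2014.csv', header=0)  # header=0 uses the first line as the column labels; this is also the effective default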
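pandas.read_csv() reads the data in a CSV file into a DataFrame.
Note: fields in a CSV file are separated by commas by default.
Parameters:
filepath_or_buffer: a string path, or any object with a read() method (such as an open file). The string can also be a URL; valid URL schemes include http, ftp, s3, and file. A bare "filename.csv" is resolved relative to the current working directory.
sep: the field delimiter of the CSV file; the default is ','.
header: the row number to use for the column names (column index). The default is 'infer', which behaves like header=0 (the first line holds the column names) when names is not given.
header=None means the first line is not a header row. The columns are then labelled 0, 1, 2, 3, 4, …
New column names can be supplied through the names parameter.
names: the list of column names to use when the CSV file has no header row.
index_col: the column(s) to use as row labels (index): an int, a sequence, or False. The default is None, which adds an integer index starting at 0.
To use the first column as the row index, write index_col=0.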
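The sep and index_col parameters described above can be tried without the file on disk. The following is a minimal sketch that inlines the first three rows of the rainfall data as a string (the variable names are illustrative and do not appear in the original walkthrough):

```python
import io
import pandas as pd

# First three rows of the rainfall file, inlined so the sketch runs without the CSV.
csv_text = """Water Year,Rain (mm) Oct-Sep,Outflow (m3/s) Oct-Sep
1980/81,1182,5408
1981/82,1098,5112
1982/83,1156,5701
"""

# Default behaviour: the first line becomes the column labels, rows get a 0-based index.
default_df = pd.read_csv(io.StringIO(csv_text))

# index_col=0: use the first column ("Water Year") as the row index instead.
indexed_df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# sep: the same data written with semicolons needs an explicit delimiter.
semicolon_df = pd.read_csv(io.StringIO(csv_text.replace(",", ";")), sep=";")

print(default_df.index.tolist())
print(indexed_df.index.tolist())
print(semicolon_df.equals(default_df))
```

io.StringIO works here because read_csv accepts any object with a read() method, as noted for filepath_or_buffer.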
print(df)
    Water Year  Rain (mm) Oct-Sep  Outflow (m3/s) Oct-Sep  Rain (mm) Dec-Feb  \
0      1980/81               1182                    5408                292
1      1981/82               1098                    5112                257
2      1982/83               1156                    5701                330
3      1983/84                993                    4265                391
4      1984/85               1182                    5364                217
5      1985/86               1027                    4991                304
6      1986/87               1151                    5196                295
7      1987/88               1210                    5572                343
8      1988/89                976                    4330                309
9      1989/90               1130                    4973                470
10     1990/91               1022                    4418                305
11     1991/92               1151                    4506                246
12     1992/93               1130                    5246                308
13     1993/94               1162                    5583                422
14     1994/95               1110                    5370                484
15     1995/96                856                    3479                245
16     1996/97               1047                    4019                258
17     1997/98               1169                    4953                341
18     1998/99               1268                    5824                360
19     1999/00               1204                    5665                417
20     2000/01               1239                    6092                328
21     2001/02               1185                    5402                380
22     2002/03               1021                    4366                272
23     2003/04               1165                    4275                348
24     2004/05               1095                    4547                309
25     2005/06               1046                    4059                206
26     2006/07               1387                    6391                437
27     2007/08               1225                    5497                386
28     2008/09               1139                    4941                268
29     2009/10               1103                    4738                255
30     2010/11               1053                    4521                265
31     2011/12               1285                    5500                339
32     2012/13               1090                    5329                350

    Outflow (m3/s) Dec-Feb  Rain (mm) Jun-Aug  Outflow (m3/s) Jun-Aug
0                     7248                174                    2212
1                     7316                242                    1936
2                     8567                124                    1802
3                     8905                141                    1078
4                     5813                343                    4313
5                     7951                229                    2595
6                     7593                267                    2826
7                     8456                294                    3154
8                     6465                200                    1440
9                    10520                209                    1740
10                    7120                216                    1923
11                    5493                280                    2118
12                    8751                219                    2551
13                   10109                193                    1638
14                   11486                103                    1231
15                    5515                172                    1439
16                    5770                256                    2102
17                    7747                285                    3206
18                    8771                225                    2240
19                   10021                197                    2166
20                    9347                236                    2142
21                    8891                259                    3187
22                    7093                176                    1478
23                    7493                315                    2959
24                    7183                217                    1799
25                    4578                188                    1474
26                   10926                357                    5168
27                    9485                320                    3505
28                    6690                323                    3189
29                    6435                244                    1958
30                    6593                267                    2885
31                    7630                379                    5261
32                    9615                187                    1797
df.index
RangeIndex(start=0, stop=33, step=1)
df.columns
Index([u'Water Year', u'Rain (mm) Oct-Sep', u'Outflow (m3/s) Oct-Sep',
u'Rain (mm) Dec-Feb', u'Outflow (m3/s) Dec-Feb', u'Rain (mm) Jun-Aug',
u'Outflow (m3/s) Jun-Aug'],
dtype='object')
df.values
array([['1980/81', 1182L, 5408L, 292L, 7248L, 174L, 2212L],
['1981/82', 1098L, 5112L, 257L, 7316L, 242L, 1936L],
['1982/83', 1156L, 5701L, 330L, 8567L, 124L, 1802L],
['1983/84', 993L, 4265L, 391L, 8905L, 141L, 1078L],
['1984/85', 1182L, 5364L, 217L, 5813L, 343L, 4313L],
['1985/86', 1027L, 4991L, 304L, 7951L, 229L, 2595L],
['1986/87', 1151L, 5196L, 295L, 7593L, 267L, 2826L],
['1987/88', 1210L, 5572L, 343L, 8456L, 294L, 3154L],
['1988/89', 976L, 4330L, 309L, 6465L, 200L, 1440L],
['1989/90', 1130L, 4973L, 470L, 10520L, 209L, 1740L],
['1990/91', 1022L, 4418L, 305L, 7120L, 216L, 1923L],
['1991/92', 1151L, 4506L, 246L, 5493L, 280L, 2118L],
['1992/93', 1130L, 5246L, 308L, 8751L, 219L, 2551L],
['1993/94', 1162L, 5583L, 422L, 10109L, 193L, 1638L],
['1994/95', 1110L, 5370L, 484L, 11486L, 103L, 1231L],
['1995/96', 856L, 3479L, 245L, 5515L, 172L, 1439L],
['1996/97', 1047L, 4019L, 258L, 5770L, 256L, 2102L],
['1997/98', 1169L, 4953L, 341L, 7747L, 285L, 3206L],
['1998/99', 1268L, 5824L, 360L, 8771L, 225L, 2240L],
['1999/00', 1204L, 5665L, 417L, 10021L, 197L, 2166L],
['2000/01', 1239L, 6092L, 328L, 9347L, 236L, 2142L],
['2001/02', 1185L, 5402L, 380L, 8891L, 259L, 3187L],
['2002/03', 1021L, 4366L, 272L, 7093L, 176L, 1478L],
['2003/04', 1165L, 4275L, 348L, 7493L, 315L, 2959L],
['2004/05', 1095L, 4547L, 309L, 7183L, 217L, 1799L],
['2005/06', 1046L, 4059L, 206L, 4578L, 188L, 1474L],
['2006/07', 1387L, 6391L, 437L, 10926L, 357L, 5168L],
['2007/08', 1225L, 5497L, 386L, 9485L, 320L, 3505L],
['2008/09', 1139L, 4941L, 268L, 6690L, 323L, 3189L],
['2009/10', 1103L, 4738L, 255L, 6435L, 244L, 1958L],
['2010/11', 1053L, 4521L, 265L, 6593L, 267L, 2885L],
['2011/12', 1285L, 5500L, 339L, 7630L, 379L, 5261L],
['2012/13', 1090L, 5329L, 350L, 9615L, 187L, 1797L]], dtype=object)
# Basic summary statistics
df.describe()
| | Rain (mm) Oct-Sep | Outflow (m3/s) Oct-Sep | Rain (mm) Dec-Feb | Outflow (m3/s) Dec-Feb | Rain (mm) Jun-Aug | Outflow (m3/s) Jun-Aug |
---|---|---|---|---|---|---|
count | 33.000000 | 33.000000 | 33.000000 | 33.000000 | 33.000000 | 33.000000 |
mean | 1129.000000 | 5019.181818 | 325.363636 | 7926.545455 | 237.484848 | 2439.757576 |
std | 101.900074 | 658.587762 | 69.995008 | 1692.800049 | 66.167931 | 1025.914106 |
min | 856.000000 | 3479.000000 | 206.000000 | 4578.000000 | 103.000000 | 1078.000000 |
25% | 1053.000000 | 4506.000000 | 268.000000 | 6690.000000 | 193.000000 | 1797.000000 |
50% | 1139.000000 | 5112.000000 | 309.000000 | 7630.000000 | 229.000000 | 2142.000000 |
75% | 1182.000000 | 5497.000000 | 360.000000 | 8905.000000 | 280.000000 | 2959.000000 |
max | 1387.000000 | 6391.000000 | 484.000000 | 11486.000000 | 379.000000 | 5261.000000 |
df.count()  # number of non-null values in each column
df['Outflow (m3/s) Oct-Sep'].mean()  # mean of a single column
5019.181818181818
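df.std(numeric_only=True)  # standard deviation of each numeric column (numeric_only skips the text column 'Water Year')
df.median(numeric_only=True)  # median of each numeric column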
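Because the table mixes a text column (Water Year) with numeric ones, column-wise statistics can fail on recent pandas versions unless the text column is excluded. The sketch below, built on the first five rows of the data with illustrative variable names, selects the numeric columns first and then computes several statistics in one call with agg:

```python
import pandas as pd

# First five rows of the rainfall table, inlined for the sketch.
sample = pd.DataFrame({
    "Water Year": ["1980/81", "1981/82", "1982/83", "1983/84", "1984/85"],
    "Rain (mm) Oct-Sep": [1182, 1098, 1156, 993, 1182],
    "Outflow (m3/s) Oct-Sep": [5408, 5112, 5701, 4265, 5364],
})

# Keep only the numeric columns so the text column cannot interfere.
numeric = sample.select_dtypes(include="number")

rain_mean = numeric["Rain (mm) Oct-Sep"].mean()
medians = numeric.median()

# agg computes several statistics at once: one row per statistic, one column per input column.
summary = numeric.agg(["min", "max", "mean"])
print(summary)
```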
# Sort by index (row labels or column labels); ascending by default
df.sort_index(axis=0, ascending=False)  # axis=0: sort by row labels; ascending=False: descending order
| | Water Year | Rain (mm) Oct-Sep | Outflow (m3/s) Oct-Sep | Rain (mm) Dec-Feb | Outflow (m3/s) Dec-Feb | Rain (mm) Jun-Aug | Outflow (m3/s) Jun-Aug |
---|---|---|---|---|---|---|---|
0 | 1980/81 | 1182 | 5408 | 292 | 7248 | 174 | 2212 |
1 | 1981/82 | 1098 | 5112 | 257 | 7316 | 242 | 1936 |
2 | 1982/83 | 1156 | 5701 | 330 | 8567 | 124 | 1802 |
3 | 1983/84 | 993 | 4265 | 391 | 8905 | 141 | 1078 |
4 | 1984/85 | 1182 | 5364 | 217 | 5813 | 343 | 4313 |
5 | 1985/86 | 1027 | 4991 | 304 | 7951 | 229 | 2595 |
6 | 1986/87 | 1151 | 5196 | 295 | 7593 | 267 | 2826 |
7 | 1987/88 | 1210 | 5572 | 343 | 8456 | 294 | 3154 |
8 | 1988/89 | 976 | 4330 | 309 | 6465 | 200 | 1440 |
9 | 1989/90 | 1130 | 4973 | 470 | 10520 | 209 | 1740 |
10 | 1990/91 | 1022 | 4418 | 305 | 7120 | 216 | 1923 |
11 | 1991/92 | 1151 | 4506 | 246 | 5493 | 280 | 2118 |
12 | 1992/93 | 1130 | 5246 | 308 | 8751 | 219 | 2551 |
13 | 1993/94 | 1162 | 5583 | 422 | 10109 | 193 | 1638 |
14 | 1994/95 | 1110 | 5370 | 484 | 11486 | 103 | 1231 |
15 | 1995/96 | 856 | 3479 | 245 | 5515 | 172 | 1439 |
16 | 1996/97 | 1047 | 4019 | 258 | 5770 | 256 | 2102 |
17 | 1997/98 | 1169 | 4953 | 341 | 7747 | 285 | 3206 |
18 | 1998/99 | 1268 | 5824 | 360 | 8771 | 225 | 2240 |
19 | 1999/00 | 1204 | 5665 | 417 | 10021 | 197 | 2166 |
20 | 2000/01 | 1239 | 6092 | 328 | 9347 | 236 | 2142 |
21 | 2001/02 | 1185 | 5402 | 380 | 8891 | 259 | 3187 |
22 | 2002/03 | 1021 | 4366 | 272 | 7093 | 176 | 1478 |
23 | 2003/04 | 1165 | 4275 | 348 | 7493 | 315 | 2959 |
24 | 2004/05 | 1095 | 4547 | 309 | 7183 | 217 | 1799 |
25 | 2005/06 | 1046 | 4059 | 206 | 4578 | 188 | 1474 |
26 | 2006/07 | 1387 | 6391 | 437 | 10926 | 357 | 5168 |
27 | 2007/08 | 1225 | 5497 | 386 | 9485 | 320 | 3505 |
28 | 2008/09 | 1139 | 4941 | 268 | 6690 | 323 | 3189 |
29 | 2009/10 | 1103 | 4738 | 255 | 6435 | 244 | 1958 |
30 | 2010/11 | 1053 | 4521 | 265 | 6593 | 267 | 2885 |
31 | 2011/12 | 1285 | 5500 | 339 | 7630 | 379 | 5261 |
32 | 2012/13 | 1090 | 5329 | 350 | 9615 | 187 | 1797 |
df.sort_index(axis=1, ascending=False)  # axis=1: sort by column labels, descending
| | Water Year | Rain (mm) Oct-Sep | Rain (mm) Jun-Aug | Rain (mm) Dec-Feb | Outflow (m3/s) Oct-Sep | Outflow (m3/s) Jun-Aug | Outflow (m3/s) Dec-Feb |
---|---|---|---|---|---|---|---|
0 | 1980/81 | 1182 | 174 | 292 | 5408 | 2212 | 7248 |
1 | 1981/82 | 1098 | 242 | 257 | 5112 | 1936 | 7316 |
2 | 1982/83 | 1156 | 124 | 330 | 5701 | 1802 | 8567 |
3 | 1983/84 | 993 | 141 | 391 | 4265 | 1078 | 8905 |
4 | 1984/85 | 1182 | 343 | 217 | 5364 | 4313 | 5813 |
5 | 1985/86 | 1027 | 229 | 304 | 4991 | 2595 | 7951 |
6 | 1986/87 | 1151 | 267 | 295 | 5196 | 2826 | 7593 |
7 | 1987/88 | 1210 | 294 | 343 | 5572 | 3154 | 8456 |
8 | 1988/89 | 976 | 200 | 309 | 4330 | 1440 | 6465 |
9 | 1989/90 | 1130 | 209 | 470 | 4973 | 1740 | 10520 |
10 | 1990/91 | 1022 | 216 | 305 | 4418 | 1923 | 7120 |
11 | 1991/92 | 1151 | 280 | 246 | 4506 | 2118 | 5493 |
12 | 1992/93 | 1130 | 219 | 308 | 5246 | 2551 | 8751 |
13 | 1993/94 | 1162 | 193 | 422 | 5583 | 1638 | 10109 |
14 | 1994/95 | 1110 | 103 | 484 | 5370 | 1231 | 11486 |
15 | 1995/96 | 856 | 172 | 245 | 3479 | 1439 | 5515 |
16 | 1996/97 | 1047 | 256 | 258 | 4019 | 2102 | 5770 |
17 | 1997/98 | 1169 | 285 | 341 | 4953 | 3206 | 7747 |
18 | 1998/99 | 1268 | 225 | 360 | 5824 | 2240 | 8771 |
19 | 1999/00 | 1204 | 197 | 417 | 5665 | 2166 | 10021 |
20 | 2000/01 | 1239 | 236 | 328 | 6092 | 2142 | 9347 |
21 | 2001/02 | 1185 | 259 | 380 | 5402 | 3187 | 8891 |
22 | 2002/03 | 1021 | 176 | 272 | 4366 | 1478 | 7093 |
23 | 2003/04 | 1165 | 315 | 348 | 4275 | 2959 | 7493 |
24 | 2004/05 | 1095 | 217 | 309 | 4547 | 1799 | 7183 |
25 | 2005/06 | 1046 | 188 | 206 | 4059 | 1474 | 4578 |
26 | 2006/07 | 1387 | 357 | 437 | 6391 | 5168 | 10926 |
27 | 2007/08 | 1225 | 320 | 386 | 5497 | 3505 | 9485 |
28 | 2008/09 | 1139 | 323 | 268 | 4941 | 3189 | 6690 |
29 | 2009/10 | 1103 | 244 | 255 | 4738 | 1958 | 6435 |
30 | 2010/11 | 1053 | 267 | 265 | 4521 | 2885 | 6593 |
31 | 2011/12 | 1285 | 379 | 339 | 5500 | 5261 | 7630 |
32 | 2012/13 | 1090 | 187 | 350 | 5329 | 1797 | 9615 |
# Sort by values
df.sort_values(by='Outflow (m3/s) Jun-Aug')  # sort the rows by the values in one column
| | Water Year | Rain (mm) Oct-Sep | Outflow (m3/s) Oct-Sep | Rain (mm) Dec-Feb | Outflow (m3/s) Dec-Feb | Rain (mm) Jun-Aug | Outflow (m3/s) Jun-Aug |
---|---|---|---|---|---|---|---|
3 | 1983/84 | 993 | 4265 | 391 | 8905 | 141 | 1078 |
14 | 1994/95 | 1110 | 5370 | 484 | 11486 | 103 | 1231 |
15 | 1995/96 | 856 | 3479 | 245 | 5515 | 172 | 1439 |
8 | 1988/89 | 976 | 4330 | 309 | 6465 | 200 | 1440 |
25 | 2005/06 | 1046 | 4059 | 206 | 4578 | 188 | 1474 |
22 | 2002/03 | 1021 | 4366 | 272 | 7093 | 176 | 1478 |
13 | 1993/94 | 1162 | 5583 | 422 | 10109 | 193 | 1638 |
9 | 1989/90 | 1130 | 4973 | 470 | 10520 | 209 | 1740 |
32 | 2012/13 | 1090 | 5329 | 350 | 9615 | 187 | 1797 |
24 | 2004/05 | 1095 | 4547 | 309 | 7183 | 217 | 1799 |
2 | 1982/83 | 1156 | 5701 | 330 | 8567 | 124 | 1802 |
10 | 1990/91 | 1022 | 4418 | 305 | 7120 | 216 | 1923 |
1 | 1981/82 | 1098 | 5112 | 257 | 7316 | 242 | 1936 |
29 | 2009/10 | 1103 | 4738 | 255 | 6435 | 244 | 1958 |
16 | 1996/97 | 1047 | 4019 | 258 | 5770 | 256 | 2102 |
11 | 1991/92 | 1151 | 4506 | 246 | 5493 | 280 | 2118 |
20 | 2000/01 | 1239 | 6092 | 328 | 9347 | 236 | 2142 |
19 | 1999/00 | 1204 | 5665 | 417 | 10021 | 197 | 2166 |
0 | 1980/81 | 1182 | 5408 | 292 | 7248 | 174 | 2212 |
18 | 1998/99 | 1268 | 5824 | 360 | 8771 | 225 | 2240 |
12 | 1992/93 | 1130 | 5246 | 308 | 8751 | 219 | 2551 |
5 | 1985/86 | 1027 | 4991 | 304 | 7951 | 229 | 2595 |
6 | 1986/87 | 1151 | 5196 | 295 | 7593 | 267 | 2826 |
30 | 2010/11 | 1053 | 4521 | 265 | 6593 | 267 | 2885 |
23 | 2003/04 | 1165 | 4275 | 348 | 7493 | 315 | 2959 |
7 | 1987/88 | 1210 | 5572 | 343 | 8456 | 294 | 3154 |
21 | 2001/02 | 1185 | 5402 | 380 | 8891 | 259 | 3187 |
28 | 2008/09 | 1139 | 4941 | 268 | 6690 | 323 | 3189 |
17 | 1997/98 | 1169 | 4953 | 341 | 7747 | 285 | 3206 |
27 | 2007/08 | 1225 | 5497 | 386 | 9485 | 320 | 3505 |
4 | 1984/85 | 1182 | 5364 | 217 | 5813 | 343 | 4313 |
26 | 2006/07 | 1387 | 6391 | 437 | 10926 | 357 | 5168 |
31 | 2011/12 | 1285 | 5500 | 339 | 7630 | 379 | 5261 |
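sort_values also accepts a descending flag and a list of sort keys. The following sketch uses the first five rows of the data (variable names are illustrative) to show a descending sort and a two-key sort in which ties on rainfall are broken by summer outflow:

```python
import pandas as pd

# First five rows of the rainfall table, inlined for the sketch.
sample = pd.DataFrame({
    "Water Year": ["1980/81", "1981/82", "1982/83", "1983/84", "1984/85"],
    "Rain (mm) Oct-Sep": [1182, 1098, 1156, 993, 1182],
    "Outflow (m3/s) Jun-Aug": [2212, 1936, 1802, 1078, 4313],
})

# Largest summer outflow first.
highest_outflow = sample.sort_values(by="Outflow (m3/s) Jun-Aug", ascending=False)

# Two keys: rainfall descending, then summer outflow ascending to break the 1182 tie.
by_rain_then_outflow = sample.sort_values(
    by=["Rain (mm) Oct-Sep", "Outflow (m3/s) Jun-Aug"],
    ascending=[False, True],
)

print(highest_outflow.index.tolist())
print(by_rain_then_outflow.index.tolist())
```

Both calls return a new DataFrame; sample itself keeps its original row order.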
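Testing the header parameter
# Import the data
# In uk_rain_2014NoColumns.csv the first line is not a row of column labels
df1=pd.read_csv('C:\\Users\\cc\\Desktop\\Pandas\\uk_rain_2014NoColumns.csv')  # default header handling (same as header=0); see what happens
df1  # Result: the first data row is treated as the column labels
| | 1980/81 | 1182 | 5408 | 292 | 7248 | 174 | 2212 |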
---|---|---|---|---|---|---|---|
0 | 1981/82 | 1098 | 5112 | 257 | 7316 | 242 | 1936 |
1 | 1982/83 | 1156 | 5701 | 330 | 8567 | 124 | 1802 |
2 | 1983/84 | 993 | 4265 | 391 | 8905 | 141 | 1078 |
3 | 1984/85 | 1182 | 5364 | 217 | 5813 | 343 | 4313 |
4 | 1985/86 | 1027 | 4991 | 304 | 7951 | 229 | 2595 |
5 | 1986/87 | 1151 | 5196 | 295 | 7593 | 267 | 2826 |
6 | 1987/88 | 1210 | 5572 | 343 | 8456 | 294 | 3154 |
df2=pd.read_csv('C:\\Users\\cc\\Desktop\\Pandas\\uk_rain_2014NoColumns.csv', header=None)  # see what happens
print(df2)  # Result: the columns are automatically labelled 0 to 6, like range(7)
print(df2.columns)
0 1 2 3 4 5 6
0 1980/81 1182 5408 292 7248 174 2212
1 1981/82 1098 5112 257 7316 242 1936
2 1982/83 1156 5701 330 8567 124 1802
3 1983/84 993 4265 391 8905 141 1078
4 1984/85 1182 5364 217 5813 343 4313
5 1985/86 1027 4991 304 7951 229 2595
6 1986/87 1151 5196 295 7593 267 2826
7 1987/88 1210 5572 343 8456 294 3154
Int64Index([0, 1, 2, 3, 4, 5, 6], dtype='int64')
# Name every column by hand with the names parameter (it plays the role of columns)
df3=pd.read_csv('C:\\Users\\cc\\Desktop\\Pandas\\uk_rain_2014NoColumns.csv', names=[u'Water Year', u'Rain (mm) Oct-Sep', u'Outflow (m3/s) Oct-Sep',
u'Rain (mm) Dec-Feb', u'Outflow (m3/s) Dec-Feb', u'Rain (mm) Jun-Aug',
u'Outflow (m3/s) Jun-Aug'])
df3
| | Water Year | Rain (mm) Oct-Sep | Outflow (m3/s) Oct-Sep | Rain (mm) Dec-Feb | Outflow (m3/s) Dec-Feb | Rain (mm) Jun-Aug | Outflow (m3/s) Jun-Aug |
---|---|---|---|---|---|---|---|
0 | 1980/81 | 1182 | 5408 | 292 | 7248 | 174 | 2212 |
1 | 1981/82 | 1098 | 5112 | 257 | 7316 | 242 | 1936 |
2 | 1982/83 | 1156 | 5701 | 330 | 8567 | 124 | 1802 |
3 | 1983/84 | 993 | 4265 | 391 | 8905 | 141 | 1078 |
4 | 1984/85 | 1182 | 5364 | 217 | 5813 | 343 | 4313 |
5 | 1985/86 | 1027 | 4991 | 304 | 7951 | 229 | 2595 |
6 | 1986/87 | 1151 | 5196 | 295 | 7593 | 267 | 2826 |
7 | 1987/88 | 1210 | 5572 | 343 | 8456 | 294 | 3154 |
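The three header behaviours tested above can be reproduced side by side on an inline string. This is a minimal sketch with illustrative variable names and short column names; it also shows that header=0 combined with names replaces a header row that does exist:

```python
import io
import pandas as pd

# Three data rows with no header line, inlined so the sketch runs without the CSV.
no_header = "1980/81,1182,5408\n1981/82,1098,5112\n1982/83,1156,5701\n"
names = ["water_year", "rain_octsep", "outflow_octsep"]

# Default: the first data row is consumed as column labels, so one record is lost.
lost_row = pd.read_csv(io.StringIO(no_header))

# header=None: every line is data and the columns are labelled 0, 1, 2.
numbered = pd.read_csv(io.StringIO(no_header), header=None)

# names=...: every line is data and the columns get the given labels.
named = pd.read_csv(io.StringIO(no_header), names=names)

# When the file does have a header line, header=0 plus names discards it and uses names instead.
with_header = "Water Year,Rain (mm) Oct-Sep,Outflow (m3/s) Oct-Sep\n" + no_header
renamed = pd.read_csv(io.StringIO(with_header), header=0, names=names)

print(len(lost_row), len(numbered), len(named), len(renamed))
```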
# View the first x rows
df.head(5)
| | Water Year | Rain (mm) Oct-Sep | Outflow (m3/s) Oct-Sep | Rain (mm) Dec-Feb | Outflow (m3/s) Dec-Feb | Rain (mm) Jun-Aug | Outflow (m3/s) Jun-Aug |
---|---|---|---|---|---|---|---|
0 | 1980/81 | 1182 | 5408 | 292 | 7248 | 174 | 2212 |
1 | 1981/82 | 1098 | 5112 | 257 | 7316 | 242 | 1936 |
2 | 1982/83 | 1156 | 5701 | 330 | 8567 | 124 | 1802 |
3 | 1983/84 | 993 | 4265 | 391 | 8905 | 141 | 1078 |
4 | 1984/85 | 1182 | 5364 | 217 | 5813 | 343 | 4313 |
# View the last 5 rows
df.tail(5)
| | Water Year | Rain (mm) Oct-Sep | Outflow (m3/s) Oct-Sep | Rain (mm) Dec-Feb | Outflow (m3/s) Dec-Feb | Rain (mm) Jun-Aug | Outflow (m3/s) Jun-Aug |
---|---|---|---|---|---|---|---|
28 | 2008/09 | 1139 | 4941 | 268 | 6690 | 323 | 3189 |
29 | 2009/10 | 1103 | 4738 | 255 | 6435 | 244 | 1958 |
30 | 2010/11 | 1053 | 4521 | 265 | 6593 | 267 | 2885 |
31 | 2011/12 | 1285 | 5500 | 339 | 7630 | 379 | 5261 |
32 | 2012/13 | 1090 | 5329 | 350 | 9615 | 187 | 1797 |
# Rename the columns
# The original column names are long and awkward to type
df.columns=['water_year','rain_octsep', 'outflow_octsep',
'rain_decfeb', 'outflow_decfeb', 'rain_junaug', 'outflow_junaug']
df.head(5)
| | water_year | rain_octsep | outflow_octsep | rain_decfeb | outflow_decfeb | rain_junaug | outflow_junaug |
---|---|---|---|---|---|---|---|
0 | 1980/81 | 1182 | 5408 | 292 | 7248 | 174 | 2212 |
1 | 1981/82 | 1098 | 5112 | 257 | 7316 | 242 | 1936 |
2 | 1982/83 | 1156 | 5701 | 330 | 8567 | 124 | 1802 |
3 | 1983/84 | 993 | 4265 | 391 | 8905 | 141 | 1078 |
4 | 1984/85 | 1182 | 5364 | 217 | 5813 | 343 | 4313 |
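Assigning to df.columns replaces every label at once and depends on the order of the list. As an alternative sketch (not the approach used in this walkthrough), DataFrame.rename takes an old-to-new mapping, so only the listed columns change and the order does not matter:

```python
import pandas as pd

# Two rows and three columns of the rainfall table, inlined for the sketch.
df = pd.DataFrame({
    "Water Year": ["1980/81", "1981/82"],
    "Rain (mm) Oct-Sep": [1182, 1098],
    "Outflow (m3/s) Oct-Sep": [5408, 5112],
})

# Only the columns named in the mapping are renamed; the third keeps its label.
renamed = df.rename(columns={
    "Water Year": "water_year",
    "Rain (mm) Oct-Sep": "rain_octsep",
})

print(list(renamed.columns))
```

rename returns a new DataFrame and leaves df unchanged, so the result has to be assigned to a variable.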
# Count the records
len(df)
33
Filtering
df['rain_octsep']
df['rain_octsep'].head(8)
0 1182
1 1098
2 1156
3 993
4 1182
5 1027
6 1151
7 1210
Name: rain_octsep, dtype: int64
df.rain_octsep<1000
0 False
1 False
2 False
3 True
4 False
5 False
6 False
7 False
8 True
9 False
10 False
11 False
12 False
13 False
14 False
15 True
16 False
17 False
18 False
19 False
20 False
21 False
22 False
23 False
24 False
25 False
26 False
27 False
28 False
29 False
30 False
31 False
32 False
Name: rain_octsep, dtype: bool
# Records where the October-to-September (water year) rainfall is below 1000 mm
df[df.rain_octsep<1000]
| | water_year | rain_octsep | outflow_octsep | rain_decfeb | outflow_decfeb | rain_junaug | outflow_junaug |
---|---|---|---|---|---|---|---|
3 | 1983/84 | 993 | 4265 | 391 | 8905 | 141 | 1078 |
8 | 1988/89 | 976 | 4330 | 309 | 6465 | 200 | 1440 |
15 | 1995/96 | 856 | 3479 | 245 | 5515 | 172 | 1439 |
# Records where rain_octsep < 1000 and outflow_octsep < 4000
df[(df.rain_octsep<1000)&(df.outflow_octsep<4000)]
| | water_year | rain_octsep | outflow_octsep | rain_decfeb | outflow_decfeb | rain_junaug | outflow_junaug |
---|---|---|---|---|---|---|---|
15 | 1995/96 | 856 | 3479 | 245 | 5515 | 172 | 1439 |
# Records whose water year starts with 199
df[df.water_year.str.startswith('199')]
| | water_year | rain_octsep | outflow_octsep | rain_decfeb | outflow_decfeb | rain_junaug | outflow_junaug |
---|---|---|---|---|---|---|---|
10 | 1990/91 | 1022 | 4418 | 305 | 7120 | 216 | 1923 |
11 | 1991/92 | 1151 | 4506 | 246 | 5493 | 280 | 2118 |
12 | 1992/93 | 1130 | 5246 | 308 | 8751 | 219 | 2551 |
13 | 1993/94 | 1162 | 5583 | 422 | 10109 | 193 | 1638 |
14 | 1994/95 | 1110 | 5370 | 484 | 11486 | 103 | 1231 |
15 | 1995/96 | 856 | 3479 | 245 | 5515 | 172 | 1439 |
16 | 1996/97 | 1047 | 4019 | 258 | 5770 | 256 | 2102 |
17 | 1997/98 | 1169 | 4953 | 341 | 7747 | 285 | 3206 |
18 | 1998/99 | 1268 | 5824 | 360 | 8771 | 225 | 2240 |
19 | 1999/00 | 1204 | 5665 | 417 | 10021 | 197 | 2166 |
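The filters above all build a boolean Series and pass it to df[...]. The sketch below, on five rows taken from the data (variable names are illustrative), adds the other two element-wise operators, | for "or" and ~ for "not", plus DataFrame.query as an alternative spelling. Each comparison needs its own parentheses because & and | bind more tightly than < and > in Python.

```python
import pandas as pd

# Five rows of the renamed rainfall table, inlined for the sketch.
df = pd.DataFrame({
    "water_year": ["1980/81", "1983/84", "1988/89", "1995/96", "1999/00"],
    "rain_octsep": [1182, 993, 976, 856, 1204],
    "outflow_octsep": [5408, 4265, 4330, 3479, 5665],
})

# AND: both conditions must hold.
both = df[(df.rain_octsep < 1000) & (df.outflow_octsep < 4000)]

# OR: unusually dry or unusually wet years.
either = df[(df.rain_octsep < 900) | (df.rain_octsep > 1200)]

# NOT: everything outside the 1990s.
not_nineties = df[~df.water_year.str.startswith("199")]

# query expresses the AND filter as a string.
same_as_both = df.query("rain_octsep < 1000 and outflow_octsep < 4000")

print(both.water_year.tolist())
print(either.water_year.tolist())
print(not_nineties.water_year.tolist())
```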
Indexing
# iloc selects by integer position: iloc[row] or iloc[row, column] returns a region of the data
df.iloc[19]
water_year 1999/00
rain_octsep 1204
outflow_octsep 5665
rain_decfeb 417
outflow_decfeb 10021
rain_junaug 197
outflow_junaug 2166
Name: 19, dtype: object
df.iloc[0:5]  # equivalent to df.head(5)
| | water_year | rain_octsep | outflow_octsep | rain_decfeb | outflow_decfeb | rain_junaug | outflow_junaug |
---|---|---|---|---|---|---|---|
0 | 1980/81 | 1182 | 5408 | 292 | 7248 | 174 | 2212 |
1 | 1981/82 | 1098 | 5112 | 257 | 7316 | 242 | 1936 |
2 | 1982/83 | 1156 | 5701 | 330 | 8567 | 124 | 1802 |
3 | 1983/84 | 993 | 4265 | 391 | 8905 | 141 | 1078 |
4 | 1984/85 | 1182 | 5364 | 217 | 5813 | 343 | 4313 |
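The examples above use only the row form of iloc. The iloc[row, column] form mentioned in the comment can be sketched as follows, on three rows and four columns of the renamed table (variable names are illustrative):

```python
import pandas as pd

# Three rows, four columns of the renamed rainfall table, inlined for the sketch.
df = pd.DataFrame({
    "water_year": ["1980/81", "1981/82", "1982/83"],
    "rain_octsep": [1182, 1098, 1156],
    "outflow_octsep": [5408, 5112, 5701],
    "rain_decfeb": [292, 257, 330],
})

# A single cell: second row, third column (positions start at 0).
cell = df.iloc[1, 2]

# A block: first two rows, first two columns (the end position is excluded).
block = df.iloc[0:2, 0:2]

# Arbitrary rows and columns picked by lists of positions.
picked = df.iloc[[0, 2], [0, 3]]

# Negative positions count from the end, as with Python lists.
last_row = df.iloc[-1]

print(cell)
print(block.shape)
```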
Changing the row index
df= pd.read_csv('C:\\Users\\cc\\Desktop\\Pandas\\uk_rain_2014.csv',header=0)
df.columns=['water_year','rain_octsep', 'outflow_octsep',
'rain_decfeb', 'outflow_decfeb', 'rain_junaug', 'outflow_junaug']
df.head(5)
| | water_year | rain_octsep | outflow_octsep | rain_decfeb | outflow_decfeb | rain_junaug | outflow_junaug |
---|---|---|---|---|---|---|---|
0 | 1980/81 | 1182 | 5408 | 292 | 7248 | 174 | 2212 |
1 | 1981/82 | 1098 | 5112 | 257 | 7316 | 242 | 1936 |
2 | 1982/83 | 1156 | 5701 | 330 | 8567 | 124 | 1802 |
3 | 1983/84 | 993 | 4265 | 391 | 8905 | 141 | 1078 |
4 | 1984/85 | 1182 | 5364 | 217 | 5813 | 343 | 4313 |
# Use one of the columns as the row index
df= df.set_index("water_year")
df.head(5)
# To undo: df.reset_index("water_year")
water_year | rain_octsep | outflow_octsep | rain_decfeb | outflow_decfeb | rain_junaug | outflow_junaug |
---|---|---|---|---|---|---|
1980/81 | 1182 | 5408 | 292 | 7248 | 174 | 2212 |
1981/82 | 1098 | 5112 | 257 | 7316 | 242 | 1936 |
1982/83 | 1156 | 5701 | 330 | 8567 | 124 | 1802 |
1983/84 | 993 | 4265 | 391 | 8905 | 141 | 1078 |
1984/85 | 1182 | 5364 | 217 | 5813 | 343 | 4313 |
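set_index and reset_index are inverse operations, and both return a new DataFrame instead of modifying the one they are called on. A small round-trip sketch on three rows of the data (variable names are illustrative):

```python
import pandas as pd

# Three rows of the renamed rainfall table, inlined for the sketch.
df = pd.DataFrame({
    "water_year": ["1980/81", "1981/82", "1982/83"],
    "rain_octsep": [1182, 1098, 1156],
})

# Move the water_year column into the row index.
indexed = df.set_index("water_year")

# Move it back to an ordinary column; a fresh 0-based index is created.
restored = indexed.reset_index()

print(indexed.index.name)
print(list(restored.columns))
```

This is why the walkthrough writes df = df.set_index("water_year"): without the assignment, df would keep its integer index.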
df.iloc[2]
rain_octsep 1156
outflow_octsep 5701
rain_decfeb 330
outflow_decfeb 8567
rain_junaug 124
outflow_junaug 1802
Name: 1982/83, dtype: int64
df.loc['1982/83']
rain_octsep 1156
outflow_octsep 5701
rain_decfeb 330
outflow_decfeb 8567
rain_junaug 124
outflow_junaug 1802
Name: 1982/83, dtype: int64
df.loc['1982/83','rain_octsep']
1156
df.loc['1982/83':'1984/85','rain_octsep':'outflow_decfeb']
water_year | rain_octsep | outflow_octsep | rain_decfeb | outflow_decfeb |
---|---|---|---|---|
1982/83 | 1156 | 5701 | 330 | 8567 |
1983/84 | 993 | 4265 | 391 | 8905 |
1984/85 | 1182 | 5364 | 217 | 5813 |
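Note that the loc slice above includes its end labels ('1984/85' and 'outflow_decfeb' both appear in the result), whereas iloc slices exclude the end position. The difference can be sketched on five rows of the data (variable names are illustrative):

```python
import pandas as pd

# Five rows of the rainfall table indexed by water year, inlined for the sketch.
years = pd.Index(
    ["1980/81", "1981/82", "1982/83", "1983/84", "1984/85"], name="water_year"
)
df = pd.DataFrame(
    {
        "rain_octsep": [1182, 1098, 1156, 993, 1182],
        "outflow_octsep": [5408, 5112, 5701, 4265, 5364],
    },
    index=years,
)

# Label slice: both endpoints are included, so three rows come back.
by_label = df.loc["1982/83":"1984/85", "rain_octsep"]

# Position slice: the end position 4 is excluded, so two rows come back.
by_position = df.iloc[2:4, 0]

# A list of labels selects exactly those rows.
chosen = df.loc[["1980/81", "1984/85"], "outflow_octsep"]

print(by_label.tolist())
print(by_position.tolist())
```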
# Sort by index in descending order
df.sort_index(axis=0, ascending=False)
water_year | rain_octsep | outflow_octsep | rain_decfeb | outflow_decfeb | rain_junaug | outflow_junaug |
---|---|---|---|---|---|---|
2012/13 | 1090 | 5329 | 350 | 9615 | 187 | 1797 |
2011/12 | 1285 | 5500 | 339 | 7630 | 379 | 5261 |
2010/11 | 1053 | 4521 | 265 | 6593 | 267 | 2885 |
2009/10 | 1103 | 4738 | 255 | 6435 | 244 | 1958 |
2008/09 | 1139 | 4941 | 268 | 6690 | 323 | 3189 |
2007/08 | 1225 | 5497 | 386 | 9485 | 320 | 3505 |
2006/07 | 1387 | 6391 | 437 | 10926 | 357 | 5168 |
2005/06 | 1046 | 4059 | 206 | 4578 | 188 | 1474 |
2004/05 | 1095 | 4547 | 309 | 7183 | 217 | 1799 |
2003/04 | 1165 | 4275 | 348 | 7493 | 315 | 2959 |
2002/03 | 1021 | 4366 | 272 | 7093 | 176 | 1478 |
2001/02 | 1185 | 5402 | 380 | 8891 | 259 | 3187 |
2000/01 | 1239 | 6092 | 328 | 9347 | 236 | 2142 |
1999/00 | 1204 | 5665 | 417 | 10021 | 197 | 2166 |
1998/99 | 1268 | 5824 | 360 | 8771 | 225 | 2240 |
1997/98 | 1169 | 4953 | 341 | 7747 | 285 | 3206 |
1996/97 | 1047 | 4019 | 258 | 5770 | 256 | 2102 |
1995/96 | 856 | 3479 | 245 | 5515 | 172 | 1439 |
1994/95 | 1110 | 5370 | 484 | 11486 | 103 | 1231 |
1993/94 | 1162 | 5583 | 422 | 10109 | 193 | 1638 |
1992/93 | 1130 | 5246 | 308 | 8751 | 219 | 2551 |
1991/92 | 1151 | 4506 | 246 | 5493 | 280 | 2118 |
1990/91 | 1022 | 4418 | 305 | 7120 | 216 | 1923 |
1989/90 | 1130 | 4973 | 470 | 10520 | 209 | 1740 |
1988/89 | 976 | 4330 | 309 | 6465 | 200 | 1440 |
1987/88 | 1210 | 5572 | 343 | 8456 | 294 | 3154 |
1986/87 | 1151 | 5196 | 295 | 7593 | 267 | 2826 |
1985/86 | 1027 | 4991 | 304 | 7951 | 229 | 2595 |
1984/85 | 1182 | 5364 | 217 | 5813 |