3.5.2 Indexing
阿新 • Published: 2020-09-20
import numpy as np
import pandas as pd
df = pd.read_csv('table.csv', index_col='ID')  # index_col sets which column becomes the row index

df.head(2)

ID | School | Class | Gender | Address | Height | Weight | Math | Physics |
---|---|---|---|---|---|---|---|---|
1101 | S_1 | C_1 | M | street_1 | 173 | 63 | 34.0 | A+ |
1102 | S_1 | C_1 | F | street_2 | 192 | 73 | 32.5 | B+ |
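Since `table.csv` itself is not reproduced here, the effect of `index_col` can be checked on a tiny in-memory CSV. This is only a sketch: `csv_text` and `df_demo` are made-up names, and the two rows are a cut-down copy of the output above.

```python
import io

import pandas as pd

# Hypothetical stand-in for table.csv: two rows and two data columns only.
csv_text = """ID,School,Height
1101,S_1,173
1102,S_1,192
"""

# index_col="ID" moves the ID column out of the data and into the row index.
df_demo = pd.read_csv(io.StringIO(csv_text), index_col="ID")

print(df_demo.index.name)           # ID
print(df_demo.columns.tolist())     # ['School', 'Height']
print(df_demo.loc[1101, "Height"])  # 173
```

Because ID is now the index rather than a column, it can be used directly as a label in the `loc` examples that follow.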
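2. Indexing
1) loc: label-based indexing; slices include both endpoints (closed on the left and on the right)
a) Single-row indexing
df.loc[1103]

School          S_1
Class           C_1
Gender            M
Address    street_2
Height          186
Weight           82
Math           87.2
Physics          B+
Name: 1103, dtype: object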
b) Multi-row indexing
df.loc[[1101,1105,1204,1301]]

ID | School | Class | Gender | Address | Height | Weight | Math | Physics |
---|---|---|---|---|---|---|---|---|
1101 | S_1 | C_1 | M | street_1 | 173 | 63 | 34.0 | A+ |
1105 | S_1 | C_1 | F | street_4 | 159 | 64 | 84.8 | B+ |
1204 | S_1 | C_2 | F | street_5 | 162 | 63 | 33.8 | B |
1301 | S_1 | C_3 | M | street_4 | 161 | 68 | 31.5 | B+ |
df.loc[1103:1203]

ID | School | Class | Gender | Address | Height | Weight | Math | Physics |
---|---|---|---|---|---|---|---|---|
1103 | S_1 | C_1 | M | street_2 | 186 | 82 | 87.2 | B+ |
1104 | S_1 | C_1 | F | street_2 | 167 | 81 | 80.4 | B- |
1105 | S_1 | C_1 | F | street_4 | 159 | 64 | 84.8 | B+ |
1201 | S_1 | C_2 | M | street_5 | 188 | 68 | 97.0 | A- |
1202 | S_1 | C_2 | F | street_4 | 176 | 94 | 63.5 | B- |
1203 | S_1 | C_2 | M | street_6 | 160 | 53 | 58.8 | A+ |
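The closed-on-both-ends behaviour of a `loc` slice is easy to confirm on a small frame. The sketch below rebuilds six rows by hand from the output above (three columns only); `df_demo` is an illustrative name, not part of the original notebook.

```python
import pandas as pd

# Hand-built stand-in for table.csv, using values copied from the output above.
df_demo = pd.DataFrame(
    {
        "Address": ["street_1", "street_2", "street_2", "street_2", "street_4", "street_5"],
        "Height": [173, 192, 186, 167, 159, 188],
        "Math": [34.0, 32.5, 87.2, 80.4, 84.8, 97.0],
    },
    index=pd.Index([1101, 1102, 1103, 1104, 1105, 1201], name="ID"),
)

# Label slice: the start label 1103 and the stop label 1201 are both included.
closed = df_demo.loc[1103:1201]
print(closed.index.tolist())  # [1103, 1104, 1105, 1201]

# With a sorted index, a slice bound that is not an existing label still works:
# every label between the two bounds is returned.
partial = df_demo.loc[1103:1150]
print(partial.index.tolist())  # [1103, 1104, 1105]
```

The second slice relies on the index being sorted; on an unsorted index a missing bound would raise a `KeyError`.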
c) Single-column indexing
df.loc[:,'Weight'].head(3)

ID
1101    63
1102    73
1103    82
Name: Weight, dtype: int64

d) Multi-column indexing
df.loc[:,['Address','Height','Math']].head()

ID | Address | Height | Math |
---|---|---|---|
1101 | street_1 | 173 | 34.0 |
1102 | street_2 | 192 | 32.5 |
1103 | street_2 | 186 | 87.2 |
1104 | street_2 | 167 | 80.4 |
1105 | street_4 | 159 | 84.8 |
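One detail behind the two outputs above: a single column label gives back a Series, while a list of labels gives back a DataFrame, even when the list has only one entry. A minimal sketch on a hand-built three-row frame (`df_demo` is an illustrative name):

```python
import pandas as pd

df_demo = pd.DataFrame(
    {"Height": [173, 192, 186], "Weight": [63, 73, 82]},
    index=pd.Index([1101, 1102, 1103], name="ID"),
)

single = df_demo.loc[:, "Weight"]      # one label -> Series
as_frame = df_demo.loc[:, ["Weight"]]  # list of labels -> DataFrame

print(type(single).__name__)    # Series
print(type(as_frame).__name__)  # DataFrame
print(as_frame.shape)           # (3, 1)
```

This explains why the single-column output is printed with a `Name: Weight, dtype: int64` footer while the multi-column output is rendered as a table.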
e) Combined indexing (rows and columns together)
df.loc[1102:2301,['Address','Height','Math']].head()

ID | Address | Height | Math |
---|---|---|---|
1102 | street_2 | 192 | 32.5 |
1103 | street_2 | 186 | 87.2 |
1104 | street_2 | 167 | 80.4 |
1105 | street_4 | 159 | 84.8 |
1201 | street_5 | 188 | 97.0 |
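2) iloc: position-based indexing; slices include the start position and exclude the stop position (closed on the left, open on the right)
a) Single-row indexing
df.head(9)

ID | School | Class | Gender | Address | Height | Weight | Math | Physics |
---|---|---|---|---|---|---|---|---|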
1101 | S_1 | C_1 | M | street_1 | 173 | 63 | 34.0 | A+ |
1102 | S_1 | C_1 | F | street_2 | 192 | 73 | 32.5 | B+ |
1103 | S_1 | C_1 | M | street_2 | 186 | 82 | 87.2 | B+ |
1104 | S_1 | C_1 | F | street_2 | 167 | 81 | 80.4 | B- |
1105 | S_1 | C_1 | F | street_4 | 159 | 64 | 84.8 | B+ |
1201 | S_1 | C_2 | M | street_5 | 188 | 68 | 97.0 | A- |
1202 | S_1 | C_2 | F | street_4 | 176 | 94 | 63.5 | B- |
1203 | S_1 | C_2 | M | street_6 | 160 | 53 | 58.8 | A+ |
1204 | S_1 | C_2 | F | street_5 | 162 | 63 | 33.8 | B |
df.iloc[2]

School          S_1
Class           C_1
Gender            M
Address    street_2
Height          186
Weight           82
Math           87.2
Physics          B+
Name: 1103, dtype: object

b) Multi-row indexing
df.iloc[2:6]

ID | School | Class | Gender | Address | Height | Weight | Math | Physics |
---|---|---|---|---|---|---|---|---|
1103 | S_1 | C_1 | M | street_2 | 186 | 82 | 87.2 | B+ |
1104 | S_1 | C_1 | F | street_2 | 167 | 81 | 80.4 | B- |
1105 | S_1 | C_1 | F | street_4 | 159 | 64 | 84.8 | B+ |
1201 | S_1 | C_2 | M | street_5 | 188 | 68 | 97.0 | A- |
c) Single-column indexing
df.iloc[:,4].head(3)

ID
1101    173
1102    192
1103    186
Name: Height, dtype: int64

d) Multi-column indexing
df.iloc[:,7::-2].head(3)

ID | Physics | Weight | Address | Class |
---|---|---|---|---|
1101 | A+ | 63 | street_1 | C_1 |
1102 | B+ | 73 | street_2 | C_1 |
1103 | B+ | 82 | street_2 | C_1 |
e) Combined indexing (rows and columns together)
df.iloc[2:6,7::-2].head(3)

ID | Physics | Weight | Address | Class |
---|---|---|---|---|
1103 | B+ | 82 | street_2 | C_1 |
1104 | B- | 81 | street_2 | C_1 |
1105 | B+ | 64 | street_4 | C_1 |
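The two `iloc` rules used above (the stop position is excluded, and `7::-2` walks backwards) can be checked on a few hand-copied rows. This is a sketch only; `df_demo`, `columns` and `rows` are illustrative names, and the values are taken from the first four rows shown earlier.

```python
import pandas as pd

columns = ["School", "Class", "Gender", "Address", "Height", "Weight", "Math", "Physics"]
rows = [
    ["S_1", "C_1", "M", "street_1", 173, 63, 34.0, "A+"],
    ["S_1", "C_1", "F", "street_2", 192, 73, 32.5, "B+"],
    ["S_1", "C_1", "M", "street_2", 186, 82, 87.2, "B+"],
    ["S_1", "C_1", "F", "street_2", 167, 81, 80.4, "B-"],
]
df_demo = pd.DataFrame(rows, columns=columns, index=pd.Index([1101, 1102, 1103, 1104], name="ID"))

# Positions 1 and 2 only: the stop position 3 is excluded.
print(df_demo.iloc[1:3].index.tolist())  # [1102, 1103]

# 7::-2 starts at position 7 and steps backwards two at a time: 7, 5, 3, 1.
picked = df_demo.iloc[:, 7::-2].columns.tolist()
print(picked)  # ['Physics', 'Weight', 'Address', 'Class']
```

Note that `iloc` ignores the ID labels entirely: position 1 is simply the second row, whatever its label happens to be.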
3. Common indexing functions
a) The where function: keeps cells where the condition is True and fills cells where it is False (with NaN by default)
df.head()

ID | School | Class | Gender | Address | Height | Weight | Math | Physics |
---|---|---|---|---|---|---|---|---|
1101 | S_1 | C_1 | M | street_1 | 173 | 63 | 34.0 | A+ |
1102 | S_1 | C_1 | F | street_2 | 192 | 73 | 32.5 | B+ |
1103 | S_1 | C_1 | M | street_2 | 186 | 82 | 87.2 | B+ |
1104 | S_1 | C_1 | F | street_2 | 167 | 81 | 80.4 | B- |
1105 | S_1 | C_1 | F | street_4 | 159 | 64 | 84.8 | B+ |
df['Gender'].unique()
# output: array(['M', 'F'], dtype=object)
df.where(df['Gender']=='M').head()

ID | School | Class | Gender | Address | Height | Weight | Math | Physics |
---|---|---|---|---|---|---|---|---|
1101 | S_1 | C_1 | M | street_1 | 173.0 | 63.0 | 34.0 | A+ |
1102 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1103 | S_1 | C_1 | M | street_2 | 186.0 | 82.0 | 87.2 | B+ |
1104 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1105 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
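In the output above, Height and Weight switch from integers (173) to floats (173.0). That is a side effect of the NaN fill: an int64 column cannot hold NaN, so pandas upcasts it to float64. The sketch below shows this on a hand-built three-row frame, and also the optional second argument of `where`, which supplies a replacement value instead of NaN. `df_demo`, `is_male` and the fill value -1 are illustrative choices, not part of the original notebook.

```python
import pandas as pd

df_demo = pd.DataFrame(
    {"Gender": ["M", "F", "M"], "Height": [173, 192, 186]},
    index=pd.Index([1101, 1102, 1103], name="ID"),
)
is_male = df_demo["Gender"] == "M"

# Rows where the condition is False are filled with NaN in every column.
masked = df_demo.where(is_male)
print(masked["Height"].isna().tolist())  # [False, True, False]
print(masked["Height"].dtype)            # float64

# The second argument replaces NaN as the fill value (here an arbitrary -1).
filled = df_demo["Height"].where(is_male, -1)
print(filled.tolist())
```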
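aa = df.where(df['Gender']=='M').dropna().head()
# That is: after the where step above, dropna removes the rows that did not satisfy the condition (now all NaN), leaving the filtered rows as a new DataFrame
# mask does the opposite of where: it fills the cells where the condition is True
aa

ID | School | Class | Gender | Address | Height | Weight | Math | Physics |
---|---|---|---|---|---|---|---|---|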
1101 | S_1 | C_1 | M | street_1 | 173.0 | 63.0 | 34.0 | A+ |
1103 | S_1 | C_1 | M | street_2 | 186.0 | 82.0 | 87.2 | B+ |
1201 | S_1 | C_2 | M | street_5 | 188.0 | 68.0 | 97.0 | A- |
1203 | S_1 | C_2 | M | street_6 | 160.0 | 53.0 | 58.8 | A+ |
1301 | S_1 | C_3 | M | street_4 | 161.0 | 68.0 | 31.5 | B+ |
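For comparison, the same filter can be written with plain boolean indexing, which is not used in the text above but avoids two side effects of `where(...).dropna()`: numeric columns stay integers, and rows that already contained NaN in the original data are not dropped by accident. The sketch also shows `mask` with the inverted condition. All names (`df_demo`, `is_male`, `via_where`, `via_bool`, `via_mask`) are illustrative.

```python
import pandas as pd

df_demo = pd.DataFrame(
    {"Gender": ["M", "F", "M"], "Height": [173, 192, 186]},
    index=pd.Index([1101, 1102, 1103], name="ID"),
)
is_male = df_demo["Gender"] == "M"

via_where = df_demo.where(is_male).dropna()  # the approach used above
via_bool = df_demo[is_male]                  # boolean indexing
via_mask = df_demo.mask(~is_male).dropna()   # mask fills where the condition is True

print(via_where.index.tolist())  # [1101, 1103]
print(via_bool.index.tolist())   # [1101, 1103]
print(via_mask.index.tolist())   # [1101, 1103]

# where + dropna leaves Height as float64; boolean indexing keeps int64.
print(via_where["Height"].dtype)  # float64
print(via_bool["Height"].dtype)   # int64
```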