A Detailed Guide to pandas.DataFrame.loc in Python

Official documentation

DataFrame.loc
Access a group of rows and columns by label(s) or a boolean array.
.loc[] is primarily label based, but may also be used with a boolean array.
# Selection normally uses label values, but a boolean array also works.

Allowed inputs are: # a single label, a list of labels, a slice of labels, or a boolean array

  • A single label, e.g. 5 or 'a' (note that 5 is interpreted as a label of the index, and never as an integer position along the index). # Here 5 is a label value, not a numeric position.
  • A list or array of labels, e.g. ['a', 'b', 'c'].
  • A slice object with labels, e.g. 'a':'f'.
    Warning: Note that contrary to usual Python slices, both the start and the stop are included. # A label slice includes both of its endpoints.
  • A boolean array of the same length as the axis being sliced, e.g. [True, False, True].
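The rules above can be sketched on a small frame. The frame `demo` and its out-of-order integer index are assumptions made only for this illustration, chosen so that a label and a position never coincide.

```python
import pandas as pd

# Hypothetical frame: the integer labels deliberately differ from positions.
demo = pd.DataFrame({"value": [10, 20, 30]}, index=[5, 3, 1])

# 5 is looked up as a label, so this reads the first row, not a sixth one.
by_label = demo.loc[5, "value"]

# A list of labels returns the rows in the order requested.
by_list = demo.loc[[1, 5], "value"].tolist()

# A boolean list as long as the row axis keeps the rows marked True.
by_mask = demo.loc[[True, False, True], "value"].tolist()

print(by_label, by_list, by_mask)
```

Because the index is [5, 3, 1], reading label 5 returns the value stored in the first row.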

Worked examples

I. Selecting values

1. Create the DataFrame

import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

df
Out[15]: 
      max_speed shield
cobra        1    2
viper        4    5
sidewinder     7    8

2. Single label. A single row label returns a Series.

df.loc['viper']
Out[17]: 
max_speed  4
shield    5
Name: viper,dtype: int64

2. List of labels. A list of row labels returns a DataFrame.

df.loc[['cobra', 'viper']]
Out[20]: 
    max_speed shield
cobra     1    2
viper     4    5
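The difference between the two return types can be checked directly. This is a minimal sketch that rebuilds the frame from step 1 so it runs on its own; the names `as_series` and `as_frame` are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

# A bare label drops the row dimension and yields a Series.
as_series = df.loc['viper']

# A one-element list keeps the row dimension and yields a DataFrame.
as_frame = df.loc[['viper']]

print(type(as_series).__name__, as_series.shape)
print(type(as_frame).__name__, as_frame.shape)
```

Wrapping a single label in a list is therefore a simple way to keep a DataFrame when later code expects one.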

3. Single label for row and column. Selects one row and one column at the same time and returns a scalar.

df.loc['cobra', 'shield']
Out[24]: 2

4. Slice with labels for row and single label for column. As mentioned above, note that both the start and stop of the slice are included. This selects several rows and one column; when rows are selected with a label slice, the first and last labels are both part of the result.

df.loc['cobra':'viper', 'max_speed']
Out[25]: 
cobra  1
viper  4
Name: max_speed,dtype: int64
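To make the inclusive stop concrete, the sketch below contrasts a label slice with a positional slice taken through `.iloc`. It rebuilds the frame from step 1 so it runs on its own; `.iloc` appears only for comparison and is not otherwise covered in this article.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

# Label slice: both 'cobra' and 'viper' are included.
label_rows = df.loc['cobra':'viper'].index.tolist()

# Position slice: the stop position 1 is excluded, as in ordinary Python.
position_rows = df.iloc[0:1].index.tolist()

print(label_rows)
print(position_rows)
```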

5. Boolean list with the same length as the row axis. Selects rows with a list of booleans.
Each position in the list holds True or False; a row is selected when the value at its position is True.

df
Out[30]: 
      max_speed shield
cobra        1    2
viper        4    5
sidewinder     7    8

The list needs one value per row. Recent pandas versions reject a shorter list such as [True] or [True, False] with an IndexError.

df.loc[[True, False, False]]
Out[31]: 
       max_speed  shield
cobra          1       2

df.loc[[True, False, True]]
Out[32]: 
            max_speed  shield
cobra               1       2
sidewinder          7       8
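A boolean list is usually computed rather than typed by hand. The sketch below builds one from the index and also shows what happens when the length does not match the row axis. It assumes a recent pandas release, where a mismatched length is expected to raise IndexError; the names `mask`, `selected` and `length_error` are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

# One boolean per row: keep every row except 'viper'.
mask = [name != 'viper' for name in df.index]
selected = df.loc[mask].index.tolist()

# A list shorter than the row axis is rejected (assumption: recent pandas).
try:
    df.loc[[True, False]]
    length_error = None
except IndexError as exc:
    length_error = str(exc)

print(selected)
print(length_error)
```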

6. Conditional that returns a boolean Series. Selects rows with a condition.

df.loc[df['shield'] > 6]
Out[34]: 
      max_speed shield
sidewinder     7    8

7. Conditional that returns a boolean Series with column labels specified. Selects rows with a condition and returns only the named columns.

df.loc[df['shield'] > 6, ['max_speed']]
Out[35]: 
      max_speed
sidewinder     7

8. Callable that returns a boolean Series. Selects rows with a function that produces the boolean result.

df
Out[37]: 
      max_speed shield
cobra        1    2
viper        4    5
sidewinder     7    8

df.loc[lambda df: df['shield'] == 8]
Out[38]: 
      max_speed shield
sidewinder     7    8
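Conditions and callables combine naturally. The following sketch rebuilds the frame from step 1 and shows two common patterns: joining two conditions with `&`, and using a callable inside a method chain where the intermediate frame has no name. The `total` column and the variable names are assumptions introduced for the example.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

# Join conditions with & (and) or | (or); parenthesize each comparison.
fast_and_light = df.loc[(df['max_speed'] > 1) & (df['shield'] < 8)]

# The callable receives the frame being indexed, here the result of assign().
chained = (
    df.assign(total=df['max_speed'] + df['shield'])
      .loc[lambda d: d['total'] > 5, ['total']]
)

print(fast_and_light)
print(chained)
```

The callable form matters mostly in chains like the second one, because the frame that carries the `total` column is never bound to a variable.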

II. Assigning values

1. Set value for all items matching the list of labels. Assigns to the rows chosen by a list of labels and to the given column.

df.loc[['viper', 'sidewinder'], ['shield']] = 50

df
Out[43]: 
      max_speed shield
cobra        1    2
viper        4   50
sidewinder     7   50

2. Set value for an entire row. Assigns one value to every column of a row.

df.loc['cobra'] = 10

df
Out[48]: 
      max_speed shield
cobra       10   10
viper        4   50
sidewinder     7   50

3. Set value for an entire column. Assigns one value to every row of a column.

df.loc[:, 'max_speed'] = 30

df
Out[50]: 
      max_speed shield
cobra       30   10
viper       30   50
sidewinder     30   50

4. Set value for rows matching a condition. Assigns to every column of the rows selected by a boolean condition.

df.loc[df['shield'] > 35] = 0

df
Out[52]: 
      max_speed shield
cobra       30   10
viper        0    0
sidewinder     0    0
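The assignment in step 4 overwrites every column of the matching rows. When only one column should change, a column label can be added after the condition. This is a minimal sketch; the frame `demo` is a hypothetical copy of the state just before step 4, so the session above is left untouched.

```python
import pandas as pd

# Hypothetical copy of the frame as it stood before step 4.
demo = pd.DataFrame([[30, 10], [30, 50], [30, 50]],
                    index=['cobra', 'viper', 'sidewinder'],
                    columns=['max_speed', 'shield'])

# Row condition plus a column label: only 'shield' is overwritten.
demo.loc[demo['shield'] > 35, 'shield'] = 0

print(demo)
```

Here `max_speed` keeps its values for all three rows, while `shield` becomes 0 for viper and sidewinder.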

III. Integer row labels

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=[7, 8, 9], columns=['max_speed', 'shield'])

df
Out[54]: 
  max_speed shield
7     1    2
8     4    5
9     7    8

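Select several rows with a slice. The integers 7 and 9 are treated as labels, not positions, so both endpoints are included: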

df.loc[7:9]
Out[55]: 
  max_speed shield
7     1    2
8     4    5
9     7    8
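With integer labels it is easy to confuse a label with a position. The sketch below rebuilds the frame above and checks that `.loc` only ever consults the labels; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=[7, 8, 9], columns=['max_speed', 'shield'])

# 8 is a label here: this is the middle row.
row_eight = df.loc[8].tolist()

# The label slice 8:9 includes both endpoints.
tail_labels = df.loc[8:9].index.tolist()

# 0 would be a valid position but is not a label, so the lookup fails.
try:
    df.loc[0]
    zero_found = True
except KeyError:
    zero_found = False

print(row_eight, tail_labels, zero_found)
```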

IV. MultiIndex

1. Create a MultiIndex

tuples = [
    ('cobra', 'mark i'), ('cobra', 'mark ii'),
    ('sidewinder', 'mark i'), ('sidewinder', 'mark ii'),
    ('viper', 'mark ii'), ('viper', 'mark iii'),
]
index = pd.MultiIndex.from_tuples(tuples)
values = [[12, 2], [0, 4], [10, 20],
          [1, 4], [7, 1], [16, 36]]
df = pd.DataFrame(values, columns=['max_speed', 'shield'], index=index)


df
Out[57]: 
           max_speed shield
cobra   mark i      12    2
      mark ii      0    4
sidewinder mark i      10   20
      mark ii      1    4
viper   mark ii      7    1
      mark iii     16   36

2. Single label. The label refers to the outermost row level, and a DataFrame is returned.

df.loc['cobra']
Out[58]: 
     max_speed shield
mark i     12    2
mark ii     0    4

3. Single index tuple. Passing a full index tuple returns a Series.

df.loc[('cobra', 'mark ii')]
Out[59]: 
max_speed  0
shield    4
Name: (cobra,mark ii),dtype: int64

4. Single label for row and column. With a MultiIndex, passing two labels this way behaves like passing a tuple and returns a Series.

df.loc['cobra', 'mark i']
Out[60]: 
max_speed  12
shield    2
Name: (cobra,mark i),dtype: int64
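The similarity to the tuple form follows from Python syntax: `obj[a, b]` and `obj[(a, b)]` hand the same tuple to the indexer. The sketch below checks this on a hypothetical two-row subset named `demo`, and adds a trailing `:` as one way to state explicitly that the tuple addresses rows only.

```python
import pandas as pd

# Hypothetical subset of the MultiIndex frame above (cobra rows only).
index = pd.MultiIndex.from_tuples([('cobra', 'mark i'), ('cobra', 'mark ii')])
demo = pd.DataFrame([[12, 2], [0, 4]],
                    columns=['max_speed', 'shield'], index=index)

# Both spellings pass the tuple ('cobra', 'mark i') to .loc.
implicit = demo.loc['cobra', 'mark i']
explicit = demo.loc[('cobra', 'mark i')]

# Row tuple plus ':' for the columns makes the row/column split explicit.
with_columns = demo.loc[('cobra', 'mark i'), :]

print(implicit.equals(explicit))
print(with_columns)
```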

5. Single tuple. Note using [[ ]] returns a DataFrame. Passing a list that contains one tuple returns a DataFrame.

df.loc[[('cobra', 'mark ii')]]
Out[61]: 
        max_speed shield
cobra mark ii     0    4

6. Single tuple for the index with a single label for the column. To read one cell, pass the full index tuple first and then the column label.

df.loc[('cobra', 'mark i'), 'shield']
Out[62]: 2

7. Slice from an index tuple to a single label, or between two index tuples:

df.loc[('cobra', 'mark i'):'viper']
Out[63]: 
           max_speed shield
cobra   mark i      12    2
      mark ii      0    4
sidewinder mark i      10   20
      mark ii      1    4
viper   mark ii      7    1
      mark iii     16   36

df.loc[('cobra','mark i'):'sidewinder']
Out[64]: 
          max_speed shield
cobra   mark i     12    2
      mark ii     0    4
sidewinder mark i     10   20
      mark ii     1    4

df.loc[('cobra','mark i'):('sidewinder','mark i')]
Out[65]: 
          max_speed shield
cobra   mark i     12    2
      mark ii     0    4
sidewinder mark i     10   20
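The slices above all run along the index in order, starting from the outer level. To select by the inner level alone, one option is to pass `slice(None)` for the outer level, which stands for "every label at this level". The sketch rebuilds the MultiIndex frame so it runs on its own; the names `mark_i` and `mark_i_speed` are illustrative.

```python
import pandas as pd

tuples = [
    ('cobra', 'mark i'), ('cobra', 'mark ii'),
    ('sidewinder', 'mark i'), ('sidewinder', 'mark ii'),
    ('viper', 'mark ii'), ('viper', 'mark iii'),
]
index = pd.MultiIndex.from_tuples(tuples)
values = [[12, 2], [0, 4], [10, 20],
          [1, 4], [7, 1], [16, 36]]
df = pd.DataFrame(values, columns=['max_speed', 'shield'], index=index)

# Every outer label, inner label 'mark i', all columns.
mark_i = df.loc[(slice(None), 'mark i'), :]

# Same rows, a single column: the result is a Series.
mark_i_speed = df.loc[(slice(None), 'mark i'), 'max_speed']

print(mark_i)
print(mark_i_speed.tolist())
```

The trailing column indexer (`:` or a label) is written out on purpose, so that the tuple is read as a row key spanning both index levels.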
