[Beginner] Understanding NumPy axis

Numpy axis

  In NumPy, dimensions are called axes. The shape of an array is a tuple of integers giving the size of the array along each dimension.
  By default, a reduction such as max or sum treats the array as though it were a flat list of numbers, regardless of its shape. By specifying the axis parameter, you can instead apply the operation along one chosen axis of the array.
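
  As a minimal sketch of that difference, the snippet below takes max of a small 2-D array with and without axis. The array m and the variable names are made up for illustration and are not part of the original example.

```python
import numpy as np

# Hypothetical 2x3 array, used only for illustration.
m = np.array([[1, 5, 2],
              [7, 3, 4]])

total_max = m.max()        # no axis: all six numbers are treated as one flat list
col_max = m.max(axis=0)    # collapse axis 0 (size 2): one value per column
row_max = m.max(axis=1)    # collapse axis 1 (size 3): one value per row

print(total_max)  # 7
print(col_max)    # [7 5 4]
print(row_max)    # [5 7]
```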

  What "along the specified axis" actually means is often a source of confusion for beginners.

Example:

>>> import numpy as np
>>> a = np.arange(24).reshape(2,3,4)
>>> a
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

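The 3-D array a can be pictured as two 3-row, 4-column arrays placed one behind the other. Following the usual mathematical picture, take the "rows" as the x axis, the "columns" as the y axis, and the "two arrays" as the z axis.
Take the maximum along the z axis (axis 0):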

>>> a.max(axis=0)
array([[12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])

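Comparing front against back, the result is the array at the "back", which is what we expect.
Now take the maximum along the x axis (axis 1):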

>>> a.max(axis=1)
array([[ 8,  9, 10, 11],
       [20, 21, 22, 23]])

Why is this the result? Shouldn't taking the maximum of each row of a 3-row, 4-column array give this instead:

array([[ 3,  7, 11],
       [15, 19, 23]])

Why did we get the column-wise maximum? The results look swapped, because the row-wise output only appears with axis 2:

>>> a.max(axis=2)
array([[ 3,  7, 11],
       [15, 19, 23]])

Is it really reversed? Does NumPy have a bug like this? Here is the correct explanation from the experts:
  By definition, the axis number of the dimension is the index of that dimension within the array’s shape. It is also the position used to access that dimension during indexing.
  For example, if a 2D array a has shape (5,6), then you can access a[0,0] up to a[4,5]. Axis 0 is thus the first dimension (the “rows”), and axis 1 is the second dimension (the “columns”). In higher dimensions, where “row” and “column” stop really making sense, try to think of the axes in terms of the shapes and indices involved.
  If you do .max(axis=n), for example, then dimension n is collapsed and deleted, with all values in the new array equal to the max of the corresponding collapsed values. For example, if b has shape (5,6,7,8), and you do c = b.max(axis=2), then axis 2 (dimension with size 7) is collapsed, and the result has shape (5,6,8). Furthermore, c[x,y,z] is equal to the max of all elements b[x,y,:,z].
  In other words, if b is a NumPy array with shape (5, 6, 7, 8),
  and c = b.max(axis=2),
  then the shape of c is (5, 6, 8), because the "7" is axis 2 and has been removed.
  Moreover, c[x, y, z] = max(b[x, y, :, z]).
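
  The shape rule is easy to check directly. The sketch below fills b with random integers (the seed and value range are arbitrary choices, not from the original text) and compares every element of c against the maximum of the slice it was collapsed from.

```python
import numpy as np

# Arbitrary test data; only the shape (5, 6, 7, 8) comes from the text.
rng = np.random.default_rng(0)
b = rng.integers(0, 100, size=(5, 6, 7, 8))

c = b.max(axis=2)   # collapse the axis of size 7
print(c.shape)      # (5, 6, 8)

# Check c[x, y, z] == max(b[x, y, :, z]) for every index triple.
matches = all(
    c[x, y, z] == b[x, y, :, z].max()
    for x in range(5)
    for y in range(6)
    for z in range(8)
)
print(matches)      # True
```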
  
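  If that explanation still has not made things clear, consider this one, originally written in Chinese:
  NumPy operates in a different direction depending on axis: if axis is not set, the operation covers all elements; with axis=0 it runs along the vertical axis; with axis=1 it runs along the horizontal axis. That only covers simple two-dimensional arrays, though. What about higher dimensions? It can be summed up in one sentence: with axis=i, NumPy operates along the direction in which the i-th index varies. For example, the two-dimensional case just described can be written as data = [[a00, a01], [a10, a11]]. With axis=0 the operation runs in the direction in which index 0 changes, that is a00->a10 and a01->a11, which is the vertical direction. axis=1 works the same way.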
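
  To make "the direction in which the i-th index varies" concrete, here is a rough sketch that reproduces a.max(axis=1) for the 3-D array a with explicit loops. The helper max_along_axis1 is a hypothetical name written for this illustration. It is not a NumPy function, and it is far slower than the built-in.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

def max_along_axis1(arr):
    """Illustrative helper: max over the middle index j of a 3-D array."""
    n0, n1, n2 = arr.shape
    out = np.empty((n0, n2), dtype=arr.dtype)
    for i in range(n0):          # index 0 is kept
        for k in range(n2):      # index 2 is kept
            # Only index 1 (j) varies here, so axis 1 is the one collapsed.
            out[i, k] = max(arr[i, j, k] for j in range(n1))
    return out

manual = max_along_axis1(a)
print(manual)
print(np.array_equal(manual, a.max(axis=1)))  # True
```

  The two outer loops run over the indices that survive, and the inner max runs over the index that disappears.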
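  Going back to the basic mathematical idea: what does "along the X axis" physically mean? It means that X, the independent variable, keeps increasing. Now look again at the results of a.max(axis=1) and a.max(axis=0):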

>>> a
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
>>> a.shape
(2, 3, 4)
>>> a.max(axis=1)
array([[ 8,  9, 10, 11],
       [20, 21, 22, 23]])
>>> a.max(axis=0)
array([[12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])