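A Summary of Commonly Used NumPy Functions
I have been learning NumPy recently. Any new library comes with a large API that is scattered and easy to forget, so I am recording the functions I use most often here for quick reference.
1. np.sqrt() computes the square root of every element of an array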
import numpy as np
import numpy.random as np_random
arr = np.arange(10)
print arr
#out:[0 1 2 3 4 5 6 7 8 9]
print np.sqrt(arr)
#out:[ 0. 1. 1.41421356 1.73205081 2. 2.23606798 2.44948974 2.64575131 2.82842712 3. ]
2. np.maximum(x, y) compares two arrays element by element and returns an array holding the larger value at each position
import numpy as np
import numpy.random as np_random
x = np_random.randn(8)
y = np_random.randn(8)
print x,y,np.maximum(x,y)
#x = [-0.4628023 -0.60578152 -0.16527033 0.99371095 -1.68726145 0.28045865 0.77197354 -0.08402748]
# y = [ 0.09538315 0.26688981 -0.10223 -1.40979706 1.94563655 2.24729599 1.15956752 0.83226026]
# np.maximum = [ 0.09538315 0.26688981 -0.10223 0.99371095 1.94563655 2.24729599 1.15956752 0.83226026]
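Because the listing above uses random input, the result is hard to check by eye. The following is a small sketch with fixed values (Python 3 syntax; the variable names are only illustrative). It also contrasts np.maximum, which works pairwise, with np.max, which reduces a single array.

```python
import numpy as np

x = np.array([1.0, -2.0, 3.0])
y = np.array([0.5, 4.0, -1.0])

# Pairwise comparison: position i of the result is max(x[i], y[i]).
pairwise = np.maximum(x, y)
print(pairwise.tolist())   # [1.0, 4.0, 3.0]

# np.max is a reduction over ONE array and returns its largest element.
largest = np.max(x)
print(largest)             # 3.0

# The second argument is broadcast, so a scalar works too
# (here: replace every negative value with 0).
clipped = np.maximum(x, 0)
print(clipped.tolist())    # [1.0, 0.0, 3.0]
```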
3. np.modf(x) returns a tuple of two arrays: the fractional parts first, then the integral parts
import numpy as np
import numpy.random as np_random
arr = np_random.randn(7) * 5
print 'arr = ',arr,'np.modf = ',np.modf(arr)
#arr = [ 4.66513489 1.29033991 1.48894422 -3.21496179 -1.4010007 -2.25843728 -1.97491216]
#np.modf = (array([ 0.66513489, 0.29033991, 0.48894422, -0.21496179, -0.4010007 ,-0.25843728, -0.97491216]), array([ 4., 1., 1., -3., -1., -2., -1.]))
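The order of the two returned arrays is easy to mix up, so here is a minimal sketch with fixed input (Python 3 syntax; the names frac and whole are my own). It unpacks the tuple and checks that the two parts add back up to the original values.

```python
import numpy as np

arr = np.array([4.5, -3.25, 2.0])

# np.modf returns (fractional parts, integral parts), in that order.
frac, whole = np.modf(arr)
print(frac.tolist())    # [0.5, -0.25, 0.0]
print(whole.tolist())   # [4.0, -3.0, 2.0]

# Both parts keep the sign of the input, so they sum to the original array.
print(np.array_equal(frac + whole, arr))   # True
```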
4. Matrix transpose, matrix dot product, and axis permutation for higher-dimensional arrays
#coding:utf-8
#Matrix transpose
import numpy as np
import numpy.random as np_random
arr = np.arange(15).reshape((3,5))
#print arr
[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]]
#print arr.T
[[ 0 5 10]
[ 1 6 11]
[ 2 7 12]
[ 3 8 13]
[ 4 9 14]]
#Matrix dot product
arr = np_random.randn(6,3)
print arr,np.dot(arr.T,arr)
#out:[[ 1.23173257 0.80268393 1.25454263]
[-0.65430351 0.99088605 -1.26507529]
[ 0.14790816 0.10454909 1.25549745]
[-0.25097523 -1.42844608 0.25598424]
[-1.86467754 -0.46932851 2.33760553]
[ 0.58934308 0.14976993 0.41664656]]
[[ 5.85449118 1.67773218 -1.61887592]
[ 1.67773218 3.92024567 -1.51564661]
#Axis permutation for a higher-dimensional array
arr = np.arange(16).reshape((2,2,4))
print arr
[[[ 0 1 2 3]
[ 4 5 6 7]]
[[ 8 9 10 11]
[12 13 14 15]]]
print arr.transpose(1,0,2)
[[[ 0 1 2 3]
[ 8 9 10 11]]
[[ 4 5 6 7]
[12 13 14 15]]]
print arr.swapaxes(1,2)
[[[ 0 4]
[ 1 5]
[ 2 6]
[ 3 7]]
[[ 8 12]
[ 9 13]
[10 14]
[11 15]]]
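Of the three operations above, transposition and the dot product will be familiar to anyone who has taken college-level mathematics. The one that needs explaining is axis permutation. The axes of a multi-dimensional array are numbered from 0, so arr.transpose(1,0,2) above
leaves axis 2 in place and swaps axis 0 with axis 1, as shown below.
In detail:
The contents of arr are
- a[0][0] = [0, 1, 2, 3]
- a[0][1] = [4, 5, 6, 7]
- a[1][0] = [8, 9, 10, 11]
- a[1][1] = [12, 13, 14, 15]
The argument to transpose is an ordering of the axes; the natural order is (0, 1, 2, ... , n - 1).
Passing (1, 0, 2) means a'[x][y][z] = a[y][x][z]: the indices on axis 0 and axis 1 trade places.
- a'[0][0] = a[0][0] = [0, 1, 2, 3]
- a'[0][1] = a[1][0] = [8, 9, 10, 11]
- a'[1][0] = a[0][1] = [4, 5, 6, 7]
- a'[1][1] = a[1][1] = [12, 13, 14, 15]
arr.swapaxes(1,2) works the same way: it exchanges axis 1 and axis 2. Swapping axis 0 and axis 1 instead gives: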
print arr.swapaxes(0,1)
[[[ 0 1 2 3]
[ 8 9 10 11]]
[[ 4 5 6 7]
[12 13 14 15]]]
This is the same result as print arr.transpose(1,0,2) above.
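The index rule a'[x][y][z] = a[y][x][z] can be checked directly. This is a short sketch in Python 3 syntax; the names t and s are illustrative only.

```python
import numpy as np

arr = np.arange(16).reshape((2, 2, 4))

t = arr.transpose(1, 0, 2)   # new axis order: old axis 1, old axis 0, old axis 2
s = arr.swapaxes(0, 1)       # exchange axis 0 and axis 1

# Both calls describe the same permutation, so the results match.
print(np.array_equal(t, s))                    # True

# t[x][y][z] is arr[y][x][z]: compare one slice by hand.
print(t[0, 1].tolist(), arr[1, 0].tolist())    # [8, 9, 10, 11] [8, 9, 10, 11]

# Swapping axes 1 and 2 turns shape (2, 2, 4) into (2, 4, 2).
print(arr.swapaxes(1, 2).shape)                # (2, 4, 2)
```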
5. Element-wise subtraction, multiplication, division and powers on arrays
import numpy as np
arr = np.array([[1,2,3.0],[4.0,5,6]])
print arr - arr
[[ 0. 0. 0.]
[ 0. 0. 0.]]
print arr * arr
[[ 1. 4. 9.]
[ 16. 25. 36.]]
print 1/arr
[[ 1. 0.5 0.33333333]
[ 0.25 0.2 0.16666667]]
print arr ** 0.5
[[ 1. 1.41421356 1.73205081]
[ 2. 2.23606798 2.44948974]]
6. The dtype argument for specifying an array's type, and astype for converting between types
#coding:utf-8
print 'Specify the data type when creating an array'
import numpy as np
arr = np.array([1,2,3],dtype = np.float64)
print arr.dtype
#float64
arr = np.array([1,2,3],dtype = np.int32)
print arr.dtype
#int32
print 'Use astype to copy an array and convert its data type'
int_arr = np.array([1,2,3,4,5])
float_arr = int_arr.astype(np.float64)
print int_arr.dtype
#int64
print float_arr.dtype
#float64
print 'When astype converts float to int, the fractional part is discarded'
float_arr = np.array([2.4,1.4,4.5,1.5])
int_arr = float_arr.astype(dtype = np.int64)
print int_arr
[2 1 4 1]
print 'Use astype to convert strings to numbers; an exception is raised if the conversion fails.'
str_arr = np.array(['1.2','2.4','2.3'],dtype = np.string_)
float_arr = str_arr.astype(dtype = np.float64)
print float_arr
[ 1.2 2.4 2.3]
print 'astype can take the dtype of another array as its argument'
int_arr = np.arange(10)
float_arr = np.array([1.2,2.3,5.6],dtype = np.float64)
print int_arr.astype(float_arr.dtype)
#[ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
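Two details of astype are worth checking by hand: the float-to-int conversion truncates toward zero rather than rounding, and the result is a new array rather than a view. The sketch below (Python 3 syntax; the flag name failed is my own) also shows the ValueError raised for a string that is not a number.

```python
import numpy as np

float_arr = np.array([2.7, -2.7, 1.5])
int_arr = float_arr.astype(np.int64)

# Truncation toward zero: 2.7 -> 2 and -2.7 -> -2 (not 3 and -3).
print(int_arr.tolist())      # [2, -2, 1]

# astype returns a copy, so changing the result leaves the source untouched.
int_arr[0] = 99
print(float_arr[0])          # 2.7

# A string that cannot be parsed as a number raises ValueError.
failed = False
try:
    np.array(['1.2', 'abc']).astype(np.float64)
except ValueError:
    failed = True
print(failed)                # True
```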
7. Some uses of array indexing
#coding:utf-8
import numpy as np
arr = np.empty((8,4))
for i in range(8):
    arr[i] = i
#Print rows 4, 3, 0 and 6
print arr[[4,3,0,6]]
[[ 4. 4. 4. 4.]
[ 3. 3. 3. 3.]
[ 0. 0. 0. 0.]
[ 6. 6. 6. 6.]]
arr = np.arange(32).reshape((8,4))
print arr
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]
[16 17 18 19]
[20 21 22 23]
[24 25 26 27]
[28 29 30 31]]
#Print arr[1][0], arr[5][3], arr[7][1], arr[2][2]
print arr[[1,5,7,2],[0,3,1,2]]
[ 4 23 29 10]
#Print the selected rows, with their columns taken in the given order
print arr[[1,5,7,2]][:,[0,3,1,2]]
[[ 4 7 5 6]
[20 23 21 22]
[28 31 29 30]
[ 8 11 9 10]]
print arr[np.ix_([1,5,7,2],[0,3,1,2])]
[[ 4 7 5 6]
[20 23 21 22]
[28 31 29 30]
[ 8 11 9 10]]
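The difference between passing two index lists directly and wrapping them in np.ix_ is the main thing to remember here. A minimal sketch in Python 3 syntax (the names rows, cols, pairs and block are illustrative):

```python
import numpy as np

arr = np.arange(32).reshape((8, 4))
rows = [1, 5, 7, 2]
cols = [0, 3, 1, 2]

# Two lists are paired up: (1,0), (5,3), (7,1), (2,2) -> one value per pair.
pairs = arr[rows, cols]

# np.ix_ crosses every selected row with every selected column.
block = arr[np.ix_(rows, cols)]
print(pairs.shape, block.shape)                 # (4,) (4, 4)

# The paired result is exactly the diagonal of the crossed result.
print(np.array_equal(np.diag(block), pairs))    # True

# Indexing with lists returns a copy, so the original array is not changed.
block[0, 0] = -1
print(arr[1, 0])                                # 4
```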
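8. np.zeros() creates an array filled with zeros; np.empty() creates an uninitialized array (printing it may show zeros or any other values left over in memory, because the array is allocated without its contents being set)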
import numpy as np
np.zeros(10)
Out[69]: array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
print np.zeros((3,6))
[[ 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0.]]
print np.empty((2,3,2))
[[[ 4. 4.]
[ 4. 4.]
[ 3. 3.]]
[[ 3. 3.]
[ 6. 6.]
[ 6. 6.]]]
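Since the contents of an np.empty() array are arbitrary, only its shape and dtype can be relied on until it has been filled. A small sketch in Python 3 syntax (variable names z and e are my own):

```python
import numpy as np

z = np.zeros((3, 6))
e = np.empty((2, 3, 2))

# Shape and dtype are defined for both; float64 is the default dtype.
print(z.shape, z.dtype)        # (3, 6) float64
print(e.shape, e.dtype)        # (2, 3, 2) float64

# Only np.zeros guarantees the values.
print(bool((z == 0).all()))    # True

# Give an empty array defined contents before reading from it.
e.fill(0)
print(bool((e == 0).all()))    # True
```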
9. Some uses of boolean indexing
import numpy as np
import numpy.random as np_random
name_arr =np.array(['Bob','Joe','Will','Bob','Will','Joe','Joe'])
rand_arr = np_random.randn(7,4)
print rand_arr
[[-1.82397798 -0.02475999 -1.05391711 -0.09195784]
[ 0.58072312 -0.60623037 -0.72832453 -0.3967706 ]
[-0.98425286 0.20487701 -0.50738989 -1.18534881]
[-2.04733816 0.98808353 -0.25746984 -0.08985969]
[-0.11875836 -0.95636464 1.41807321 0.38314495]
[-0.70933744 -0.53211986 -1.58546288 0.3870686 ]
[ 0.28461165 -1.22913798 2.97719825 0.47703007]]
#Print the rows where the mask is True, i.e. rows 0 and 3
print rand_arr[name_arr == 'Bob']
[[-1.82397798 -0.02475999 -1.05391711 -0.09195784]
[-2.04733816 0.98808353 -0.25746984 -0.08985969]]
#Print the matching rows, restricted to the first two columns
print rand_arr[name_arr == 'Bob',:2]
[[-1.82397798 -0.02475999]
[-2.04733816 0.98808353]]
#Print the rows selected by the negated mask
print rand_arr[~(name_arr == 'Bob')]
[[ 0.58072312 -0.60623037 -0.72832453 -0.3967706 ]
[-0.98425286 0.20487701 -0.50738989 -1.18534881]
[-0.11875836 -0.95636464 1.41807321 0.38314495]
[-0.70933744 -0.53211986 -1.58546288 0.3870686 ]
[ 0.28461165 -1.22913798 2.97719825 0.47703007]]
mask_arr = (name_arr == 'Bob')|(name_arr == 'Will')
print rand_arr[mask_arr]
[[-1.82397798 -0.02475999 -1.05391711 -0.09195784]
[-0.98425286 0.20487701 -0.50738989 -1.18534881]
[-2.04733816 0.98808353 -0.25746984 -0.08985969]
[-0.11875836 -0.95636464 1.41807321 0.38314495]]
rand_arr[name_arr != 'Joe'] = 7
print rand_arr
[[ 7. 7. 7. 7. ]
[ 0.58072312 -0.60623037 -0.72832453 -0.3967706 ]
[ 7. 7. 7. 7. ]
[ 7. 7. 7. 7. ]
[ 7. 7. 7. 7. ]
[-0.70933744 -0.53211986 -1.58546288 0.3870686 ]
[ 0.28461165 -1.22913798 2.97719825 0.47703007]]
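The same masks can be tried on fixed data so that the selected rows are easy to verify. This is a sketch in Python 3 syntax with illustrative names. Masks are combined with | and & (and negated with ~); the Python keywords and/or do not work element-wise on arrays.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.arange(28).reshape((7, 4))

# Each comparison yields a boolean array; | combines them element-wise.
mask = (names == 'Bob') | (names == 'Will')
print(mask.tolist())   # [True, False, True, True, True, False, False]

# Reading through a mask returns a copy of the selected rows.
selected = data[mask]
print(selected.shape)                                       # (4, 4)
print(np.array_equal(data[~mask], data[names == 'Joe']))    # True

# Changing the copy leaves the original array untouched...
selected[:] = 0
print(data[0].tolist())    # [0, 1, 2, 3]

# ...whereas assigning through the mask modifies the array in place.
data[mask] = 7
print(data[0].tolist())    # [7, 7, 7, 7]
```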
10. Matrix inversion with inv and QR decomposition with qr, both from numpy.linalg
import numpy as np
import numpy.random as np_random
from numpy.linalg import inv,qr
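x = np.array([[1.0,2,3],[4,5,6]])
y = np.array([[6,23],[-1,7],[8,9]])
#x is 2x3, so x.T.dot(x) would be a 3x3 matrix of rank 2, which is singular; the 2x2 product x.dot(x.T) is invertible
mat = x.dot(x.T)
print inv(mat)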
x = np_random.randn(5,5)
mat = x.T.dot(x)
inv(mat)
Out:
array([[ 11.84687444, -11.56145594, -5.65725983, 10.87964992,
7.08449774],
[-11.56145594, 12.09232574, 5.98095728, -11.39683366,
-6.70063725],
[ -5.65725983, 5.98095728, 3.23348475, -5.8853187 ,
-3.50549578],
[ 10.87964992, -11.39683366, -5.8853187 , 11.21150189,
6.53822392],
[ 7.08449774, -6.70063725, -3.50549578, 6.53822392,
4.66110929]])
mat.dot(inv(mat))
Out:
array([[ 1.00000000e+00, 3.39716498e-15, 1.49782320e-15,
1.83676775e-15, 0.00000000e+00],
[ 1.42818200e-14, 1.00000000e+00, 1.23314155e-15,
3.83692157e-15, 0.00000000e+00],
[ 2.32658788e-15, -5.22866192e-16, 1.00000000e+00,
-5.09344939e-15, 0.00000000e+00],
[ -5.16927092e-15, -1.24169261e-16, 2.70233995e-15,
1.00000000e+00, -2.22044605e-15],
[ 0.00000000e+00, -2.13162821e-14, 3.55271368e-15,
0.00000000e+00, 1.00000000e+00]])
print mat
[[ 7.02225343 6.26630278 -5.71904522 0.17946762 -6.2179201 ]
[ 6.26630278 7.57852343 -5.01182539 2.14901066 -5.41337169]
[ -5.71904522 -5.01182539 11.92261964 3.39074865 5.69807664]
[ 0.17946762 2.14901066 3.39074865 4.12046195 -0.41319453]
[ -6.2179201 -5.41337169 5.69807664 -0.41319453 6.74816678]]
q,r = qr(mat)
print q
[[-0.55519117 0.18834857 -0.45498506 0.39651103 0.54042129]
[-0.49542444 -0.57697237 0.09716331 0.3885269 -0.51113956]
[ 0.45215734 -0.18772943 -0.81625432 0.15011252 -0.26740704]
[-0.01418901 -0.74100777 -0.08208317 -0.44183326 0.49875031]
[ 0.49159923 -0.21746917 0.33247425 0.68853137 0.35555982]]
print r
[[-12.64835211 -12.19141702 13.80210286 0.10724925 12.03373455]
[ 0. -2.66667445 -4.17544298 -4.80609789 -0.27880506]
[ 0. 0. -6.00063474 -3.11616047 -0.07049193]
[ 0. 0. 0. -0.68995137 1.11552585]
[ 0. 0. 0. 0. 0.07628223]]
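The printed matrices above are hard to verify by eye, so the usual approach is to check the defining properties numerically with np.allclose. The sketch below uses Python 3 syntax and a small fixed matrix that I chose only because it is well conditioned; it is not taken from the listing above.

```python
import numpy as np
from numpy.linalg import inv, qr

# A fixed, well-conditioned input chosen for illustration.
x = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
mat = x.T.dot(x)

# mat times its inverse is the identity, up to floating-point rounding.
inv_ok = np.allclose(mat.dot(inv(mat)), np.eye(3))

# QR decomposition: mat = q r, q has orthonormal columns, r is upper triangular.
q, r = qr(mat)
product_ok = np.allclose(q.dot(r), mat)
orthogonal_ok = np.allclose(q.T.dot(q), np.eye(3))
triangular_ok = np.allclose(r, np.triu(r))

print(inv_ok, product_ok, orthogonal_ok, triangular_ok)   # True True True True
```

Comparing with np.allclose instead of == matters because of the tiny off-diagonal values (on the order of 1e-15) that show up in products like mat.dot(inv(mat)).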
11. np.loadtxt() for reading a text file
Create a file named array_ex.txt in the current working directory and fill it with the following data:
1,2,3,4
2,3,4,5
4,5,6,7
1,2,3,4
Read it with np.loadtxt:
arr = np.loadtxt('array_ex.txt',delimiter=',')
arr
array([[ 1., 2., 3., 4.],
[ 2., 3., 4., 5.],
[ 4., 5., 6., 7.],
[ 1., 2., 3., 4.]])
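np.loadtxt also accepts a file-like object, which makes it possible to try the call without creating a file. The sketch below (Python 3 syntax) feeds the same four rows through io.StringIO and shows the dtype and usecols parameters; the variable names are illustrative.

```python
import io
import numpy as np

text = "1,2,3,4\n2,3,4,5\n4,5,6,7\n1,2,3,4\n"

# Default behaviour: every value is parsed as float64.
arr = np.loadtxt(io.StringIO(text), delimiter=',')
print(arr.shape, arr.dtype)          # (4, 4) float64

# dtype selects the element type of the result.
ints = np.loadtxt(io.StringIO(text), delimiter=',', dtype=np.int64)
print(ints[2].tolist())              # [4, 5, 6, 7]

# usecols keeps only the listed columns (here the first and the last).
first_and_last = np.loadtxt(io.StringIO(text), delimiter=',', usecols=(0, 3))
print(first_and_last.shape)          # (4, 2)
```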
12. np.save() for saving and np.load() for loading an array, plus np.savez() for storing several arrays in one archive
#coding:utf-8
import numpy as np
arr = np.arange(10)
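#Write arr to some_array.npy in the current directory (np.save appends the .npy extension); the file is created if it does not exist and overwritten if it does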
np.save('some_array',arr)
print np.load('some_array.npy')
#[0 1 2 3 4 5 6 7 8 9]
#Store several arrays in a single .npz archive (np.savez does not compress; np.savez_compressed does)
np.savez('array_archive.npz',a = arr,b = arr)
arch = np.load('array_archive.npz')
arch['b']
Out[129]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
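A round trip is the simplest way to confirm what these functions write and read back. The sketch below uses Python 3 syntax and a temporary directory (my own choice, to avoid leaving files in the working directory); file and variable names are illustrative.

```python
import os
import tempfile
import numpy as np

arr = np.arange(10)
tmp_dir = tempfile.mkdtemp()

# np.save appends ".npy" when the given name has no such extension.
base = os.path.join(tmp_dir, 'some_array')
np.save(base, arr)
print(os.path.exists(base + '.npy'))     # True
restored = np.load(base + '.npy')

# np.savez stores each keyword argument under its name in one .npz archive.
archive_path = os.path.join(tmp_dir, 'array_archive.npz')
np.savez(archive_path, a=arr, b=arr * 2)

# The loaded archive behaves like a dict; closing it releases the file.
with np.load(archive_path) as arch:
    print(sorted(arch.files))            # ['a', 'b']
    b = arch['b']
print(b.tolist())                        # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```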
13. np.ravel() for flattening a multi-dimensional array
#coding:utf-8
import numpy as np
arr = np.arange(15).reshape((5,3))
#Whatever the original number of dimensions, the result is one-dimensional
print arr.ravel()
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14]
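One point the listing does not show is that ravel() returns a view of the original data whenever it can, while the related flatten() method always copies. A short sketch in Python 3 syntax (names are illustrative):

```python
import numpy as np

arr = np.arange(15).reshape((5, 3))

flat_view = arr.ravel()      # a view here, because arr is contiguous
flat_copy = arr.flatten()    # always an independent copy

# Writing through the view changes the original array.
flat_view[0] = 100
print(arr[0, 0])             # 100

# Writing to the copy does not.
flat_copy[1] = -1
print(arr[0, 1])             # 1

# order='F' reads the elements column by column instead of row by row.
print(arr.ravel(order='F')[:3].tolist())   # [100, 3, 6]
```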
14. np.unique() removes duplicates; np.in1d() tests whether each element of one array is present in another
import numpy as np
names = np.array(['Bob','Joe','Will','Bob','will','Joe','Joe'])
print set(names)
set(['Will', 'will', 'Bob', 'Joe'])
print sorted(set(names))
['Bob', 'Joe', 'Will', 'will']
print np.unique(names)
['Bob' 'Joe' 'Will' 'will']
values = np.array([6,0,0,3,2,5,6])
print np.in1d(values,[2,3,6])
[ True False False True True False True]
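Two small extensions are often useful: np.unique can also count occurrences, and the membership mask can be used directly as a boolean index. The sketch below uses Python 3 syntax and np.isin, which the NumPy documentation recommends over np.in1d for new code; the variable names are my own.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'will', 'Joe', 'Joe'])

# return_counts=True adds how many times each sorted unique value occurs.
uniq, counts = np.unique(names, return_counts=True)
print(uniq.tolist())      # ['Bob', 'Joe', 'Will', 'will']
print(counts.tolist())    # [2, 3, 1, 1]

values = np.array([6, 0, 0, 3, 2, 5, 6])

# One boolean per element of values: is it among 2, 3, 6?
mask = np.isin(values, [2, 3, 6])
print(mask.tolist())            # [True, False, False, True, True, False, True]

# The mask doubles as a boolean index.
print(values[mask].tolist())    # [6, 3, 2, 6]
```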
15. np.repeat() and np.tile()
import numpy as np
arr = np.arange(4)
print arr.repeat(3)
#[0 0 0 1 1 1 2 2 2 3 3 3]
print arr.repeat([2,3,4,5])
#[0 0 1 1 1 2 2 2 2 3 3 3 3 3]
arr = arr.reshape((2,2))
print arr.repeat(2,axis = 0)
[[0 1]
[0 1]
[2 3]
[2 3]]
print arr.repeat(2,axis = 1)
[[0 0 1 1]
[2 2 3 3]]
print np.tile(arr,2)
[[0 1 0 1]
[2 3 2 3]]
print np.tile(arr,(2,3))
[[0 1 0 1 0 1]
[2 3 2 3 2 3]
[0 1 0 1 0 1]
[2 3 2 3 2 3]]
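The two functions are easy to confuse: repeat duplicates each element in place, while tile lays out copies of the whole array. A side-by-side sketch in Python 3 syntax (names are illustrative):

```python
import numpy as np

arr = np.arange(3)

# repeat: every element is duplicated before moving on to the next one.
print(arr.repeat(2).tolist())      # [0, 0, 1, 1, 2, 2]

# tile: the whole array is laid out again and again.
print(np.tile(arr, 2).tolist())    # [0, 1, 2, 0, 1, 2]

grid = np.arange(4).reshape((2, 2))

# Without an axis argument, repeat flattens the input first.
flat_repeat = grid.repeat(2)
print(flat_repeat.tolist())        # [0, 0, 1, 1, 2, 2, 3, 3]

# A tuple for tile gives the number of copies along each axis: 2 down, 1 across.
stacked = np.tile(grid, (2, 1))
print(stacked.tolist())            # [[0, 1], [2, 3], [0, 1], [2, 3]]
```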
16. np.take() and np.put()
import numpy as np
import numpy.random as np_random
arr = np.arange(10) * 100
inds = [7,1,2,6]
print arr[inds]
#[700 100 200 600]
print arr.take(inds)
#[700 100 200 600]
arr.put(inds,50)
print arr
#[ 0 50 50 300 400 500 50 50 800 900]
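Beyond the scalar case shown above, put accepts one value per index, and take accepts an axis argument. Note that put always addresses the flattened array. A sketch in Python 3 syntax (the name grid is my own):

```python
import numpy as np

arr = np.arange(10) * 100
inds = [7, 1, 2, 6]

# One value per index: arr[7]=1, arr[1]=2, arr[2]=3, arr[6]=4.
arr.put(inds, [1, 2, 3, 4])
print(arr.tolist())     # [0, 2, 3, 300, 400, 500, 4, 1, 800, 900]

grid = np.arange(12).reshape((3, 4))

# take with axis=1 picks columns 0 and 2 from every row.
print(grid.take([0, 2], axis=1).tolist())   # [[0, 2], [4, 6], [8, 10]]

# put uses flat indices: position 5 of a 3x4 array is row 1, column 1.
grid.put([5], -1)
print(grid[1, 1])       # -1
```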