100 NumPy Exercises [5]

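NumPy is one of the foundational libraries you need to master for data analysis in Python. These exercises are well suited to readers who have just finished learning the NumPy basics; working through them will help you get a firmer grasp of the library.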

Python version: Python 3.6.2

NumPy version: NumPy 1.13.1

81. Consider an array Z = [1,2,3,4,5,6,7,8,9,10,11,12,13,14]. How to generate an array R = [[1,2,3,4], [2,3,4,5], [3,4,5,6], ..., [11,12,13,14]]? (★★★)

(Hint: stride_tricks.as_strided)

# Author: Stefan van der Walt

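import numpy as np
from numpy.lib import stride_tricks

Z = np.arange(1,15,dtype=np.uint32)
R = stride_tricks.as_strided(Z,(11,4),(4,4)) # strides are in bytes: 4 = item size of uint32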
print(R)
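
The same windows can be built without computing strides by hand. A minimal sketch, assuming NumPy 1.20 or later (newer than the version listed above), where `sliding_window_view` is available in `numpy.lib.stride_tricks`:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumption: NumPy >= 1.20

Z = np.arange(1, 15, dtype=np.uint32)
# Each row is one length-4 window over Z; the result is a read-only view, not a copy.
R = sliding_window_view(Z, 4)
print(R.shape)
print(R[0], R[-1])
```

Unlike a raw `as_strided` call, this helper checks the window size against the array, so a wrong argument raises an error instead of reading out-of-bounds memory.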

82. Compute the rank of a matrix (★★★)

(Hint: np.linalg.svd)

# Author: Stefan van der Walt

Z = np.random.uniform(0,1,(10,10))
U, S, V = np.linalg.svd(Z) # Singular Value Decomposition
rank = np.sum(S > 1e-10)
print(rank)
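
A random matrix is almost always full rank, so the check above is not very revealing. The sketch below makes one row linearly dependent and compares the SVD count with `np.linalg.matrix_rank`; the fixed seed and the choice of which row to overwrite are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)   # fixed seed, only for reproducibility
Z = rng.uniform(0, 1, (10, 10))
Z[5] = Z[0] + Z[1]               # row 5 is now a combination of rows 0 and 1

S = np.linalg.svd(Z, compute_uv=False)      # singular values only
rank_svd = int(np.sum(S > 1e-10))           # count values above the tolerance
rank_builtin = int(np.linalg.matrix_rank(Z))
print(rank_svd, rank_builtin)
```

`matrix_rank` also works from the singular values, but derives its tolerance from the matrix size and dtype instead of using a fixed `1e-10`.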

83. How to find the most frequent value in an array? (★★★)

(Hint: np.bincount, argmax)

Z = np.random.randint(0,10,50)
print(np.bincount(Z).argmax())
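
`np.bincount` only accepts non-negative integers. For other inputs, one alternative (a swapped-in technique, not the one in the hint) is `np.unique` with `return_counts=True`. The sample array below is made up for illustration.

```python
import numpy as np

Z = np.array([-3, 7, 7, -3, 7, 2])   # contains negatives, so np.bincount would raise
values, counts = np.unique(Z, return_counts=True)
most_frequent = values[counts.argmax()]   # ties resolve to the smallest value
print(most_frequent)
```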

84. Extract all the contiguous 3x3 blocks from a 10x10 matrix (★★★)

(Hint: stride_tricks.as_strided)

# Author: Chris Barker

Z = np.random.randint(0,5,(10,10))
n = 3
i = 1 + (Z.shape[0]-n) # number of block positions along axis 0
j = 1 + (Z.shape[1]-n) # number of block positions along axis 1
C = stride_tricks.as_strided(Z, shape=(i, j, n, n), strides=Z.strides + Z.strides)
print(C)

85. Create a 2D array subclass such that Z[i,j] == Z[j,i] (★★★)

(Hint: class method)

# Author: Eric O. Lebigot
# Note: only works for 2d array and value setting using indices

class Symetric(np.ndarray):
    def __setitem__(self, index, value):
        i,j = index
        super(Symetric, self).__setitem__((i,j), value)
        super(Symetric, self).__setitem__((j,i), value)

def symetric(Z):
    return np.asarray(Z + Z.T - np.diag(Z.diagonal())).view(Symetric)

S = symetric(np.random.randint(0,10,(5,5)))
S[2,3] = 42
print(S)

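86. Consider a set of p matrices of shape (n,n) and a set of p vectors of shape (n,1). How to compute the sum of the p matrix products at once (result has shape (n,1))? (★★★)

(Hint: np.tensordot)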

# Author: Stefan van der Walt

p, n = 10, 20
M = np.ones((p,n,n))
V = np.ones((p,n,1))
S = np.tensordot(M, V, axes=[[0, 2], [0, 1]])
print(S)

# It works, because:
# M is (p,n,n)
# V is (p,n,1)
# Thus, summing over the paired axes 0 and 0 (of M and V independently),
# and 2 and 1, to remain with a (n,1) vector.
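
To make the axis pairing concrete, the sketch below compares the `tensordot` call with an explicit loop over the p products and with an `einsum` spelling of the same contraction. Random inputs and a fixed seed replace the all-ones arrays so that a wrong pairing would show up; both are assumptions for illustration.

```python
import numpy as np

p, n = 10, 20
rng = np.random.RandomState(0)   # fixed seed, only for reproducibility
M = rng.rand(p, n, n)
V = rng.rand(p, n, 1)

S_tensordot = np.tensordot(M, V, axes=[[0, 2], [0, 1]])
# Explicit version: one matrix-vector product per k, then add up the p results.
S_loop = sum(M[k] @ V[k] for k in range(p))
# Same contraction with index labels: p and j are summed, i and k remain.
S_einsum = np.einsum('pij,pjk->ik', M, V)

print(S_tensordot.shape)
print(np.allclose(S_tensordot, S_loop), np.allclose(S_tensordot, S_einsum))
```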

87. Consider a 16x16 array. How to get the block sums (block size 4x4)? (★★★)

(Hint: np.add.reduceat)

# Author: Robert Kern

Z = np.ones((16,16))
k = 4
S = np.add.reduceat(np.add.reduceat(Z, np.arange(0, Z.shape[0], k), axis=0), np.arange(0, Z.shape[1], k), axis=1)
print(S)
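
When the block size divides the array shape evenly, as it does here, the same sums can be obtained by reshaping. This is an alternative sketch rather than the method from the hint, and it uses `np.arange` data instead of ones so the two results can be compared meaningfully.

```python
import numpy as np

Z = np.arange(16 * 16).reshape(16, 16)
k = 4

S_reduceat = np.add.reduceat(
    np.add.reduceat(Z, np.arange(0, Z.shape[0], k), axis=0),
    np.arange(0, Z.shape[1], k), axis=1)

# Split each axis into (block index, offset inside block), then sum over the offsets.
S_reshape = Z.reshape(Z.shape[0] // k, k, Z.shape[1] // k, k).sum(axis=(1, 3))

print(np.array_equal(S_reduceat, S_reshape))
```

`reduceat` stays the more general tool, since it also handles block boundaries that are not evenly spaced.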

88. How to implement the Game of Life using NumPy arrays? (★★★)

(Hint: Game of Life; which patterns exist in the Game of Life?)

# Author: Nicolas Rougier

def iterate(Z):
    # Count neighbours
    N = (Z[0:-2,0:-2] + Z[0:-2,1:-1] + Z[0:-2,2:] +
         Z[1:-1,0:-2]                + Z[1:-1,2:] +
         Z[2:  ,0:-2] + Z[2:  ,1:-1] + Z[2:  ,2:])

    # Apply rules
    birth = (N==3) & (Z[1:-1,1:-1]==0)
    survive = ((N==2) | (N==3)) & (Z[1:-1,1:-1]==1)
    Z[...] = 0
    Z[1:-1,1:-1][birth | survive] = 1
    return Z

Z = np.random.randint(0,2,(50,50))
for i in range(100): Z = iterate(Z)
print(Z)

89. How to get the n largest values of an array? (★★★)

(Hint: np.argsort | np.argpartition)

Z = np.arange(10000)
np.random.shuffle(Z)
n = 5

# Slow: full sort, result in ascending order
print(Z[np.argsort(Z)[-n:]])

# Fast: partial sort, same n values but in no particular order
print(Z[np.argpartition(-Z,n)[:n]])
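
The two approaches return the same values but not necessarily in the same order, which is easy to miss when reading the printed output. A short check, with a fixed seed assumed only for reproducibility:

```python
import numpy as np

rng = np.random.RandomState(0)
Z = np.arange(10000)
rng.shuffle(Z)
n = 5

top_sorted = Z[np.argsort(Z)[-n:]]               # ascending order
top_partitioned = Z[np.argpartition(-Z, n)[:n]]  # arbitrary order

# Sorting the small result makes the two directly comparable.
print(np.sort(top_partitioned))
print(np.array_equal(np.sort(top_partitioned), top_sorted))
```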

90. Given an arbitrary number of vectors, build the Cartesian product (every combination of every item) (★★★)

(Hint: np.indices)

# Author: Stefan Van der Walt

def cartesian(arrays):
    arrays = [np.asarray(a) for a in arrays]
    shape = (len(x) for x in arrays)

    ix = np.indices(shape, dtype=int)
    ix = ix.reshape(len(arrays), -1).T

    for n, arr in enumerate(arrays):
        ix[:, n] = arrays[n][ix[:, n]]

    return ix

print(cartesian(([1, 2, 3], [4, 5], [6, 7])))

91. How to create a record array from a regular array? (★★★)

(Hint: np.core.records.fromarrays)

Z = np.array([("Hello", 2.5, 3),
              ("World", 3.6, 2)])
R = np.core.records.fromarrays(Z.T, 
                               names='col1, col2, col3',
                               formats = 'S8, f8, i8')
print(R)

92. Consider a large vector Z. Compute Z to the power of 3 using three different methods (★★★)

(Hint: np.power, *, np.einsum)

# Author: Ryan G.

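x = np.random.rand(int(5e7)) # rand() needs an integer size; 5e7 is a float

# %timeit is an IPython/Jupyter magic, so run these lines in IPython or a notebook
%timeit np.power(x,3)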
%timeit x*x*x
%timeit np.einsum('i,i,i->i',x,x,x)

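93. Consider two arrays A and B of shape (8,3) and (2,2). How to find the rows of A that contain elements of each row of B, regardless of the order of the elements in B? (★★★)

(Hint: np.where)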

# Author: Gabe Schwartz

A = np.random.randint(0,5,(8,3))
B = np.random.randint(0,5,(2,2))

C = (A[..., np.newaxis, np.newaxis] == B)
rows = np.where(C.any((3,1)).all(1))[0]
print(rows)

94. Consider a 10x3 matrix. Extract the rows whose values are not all equal (e.g. [2,2,3]) (★★★)

# Author: Robert Kern

Z = np.random.randint(0,5,(10,3))
print(Z)
# solution for arrays of all dtypes (including string arrays and record arrays)
E = np.all(Z[:,1:] == Z[:,:-1], axis=1)
U = Z[~E]
print(U)
# solution for numerical arrays only, will work for any number of columns in Z
U = Z[Z.max(axis=1) != Z.min(axis=1),:]
print(U)

95. Convert a vector of integers into a binary matrix representation (★★★)

(Hint: np.unpackbits)

# Author: Warren Weckesser

I = np.array([0, 1, 2, 3, 15, 16, 32, 64, 128])
B = ((I.reshape(-1,1) & (2**np.arange(8))) != 0).astype(int)
print(B[:,::-1])

# Author: Daniel T. McDonald

I = np.array([0, 1, 2, 3, 15, 16, 32, 64, 128], dtype=np.uint8)
print(np.unpackbits(I[:, np.newaxis], axis=1))

96. Given a two-dimensional array, how to extract the unique rows? (★★★)

(Hint: np.ascontiguousarray)

# Author: Jaime Fernández del Río

Z = np.random.randint(0,2,(6,3))
T = np.ascontiguousarray(Z).view(np.dtype((np.void, Z.dtype.itemsize * Z.shape[1])))
_, idx = np.unique(T, return_index=True)
uZ = Z[idx]
print(uZ)
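
Since NumPy 1.13, `np.unique` also accepts an `axis` argument, which avoids the void-view trick. A minimal sketch on a small hand-written array (the data is made up for illustration):

```python
import numpy as np

Z = np.array([[0, 1, 1],
              [1, 0, 0],
              [0, 1, 1],
              [1, 1, 1],
              [1, 0, 0]])

# axis=0 treats each row as one item; the unique rows come back sorted.
uZ = np.unique(Z, axis=0)
print(uZ)
```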

97. Consider two vectors A and B. Write the einsum equivalents of the inner, outer, sum, and mul functions (★★★)

(Hint: np.einsum)

# Author: Alex Riley
# Make sure to read: http://ajcr.net/Basic-guide-to-einsum/

A = np.random.uniform(0,1,10)
B = np.random.uniform(0,1,10)

np.einsum('i->', A)       # np.sum(A)
np.einsum('i,i->i', A, B) # A * B
np.einsum('i,i', A, B)    # np.inner(A, B)
np.einsum('i,j->ij', A, B)    # np.outer(A, B)
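
The equivalences claimed in the comments can be checked numerically. A short sketch, with a fixed seed assumed only for reproducibility:

```python
import numpy as np

rng = np.random.RandomState(0)
A = rng.uniform(0, 1, 10)
B = rng.uniform(0, 1, 10)

total = np.einsum('i->', A)          # no output index: sum over i
product = np.einsum('i,i->i', A, B)  # i kept in the output: elementwise product
inner = np.einsum('i,i', A, B)       # repeated i, no output index: inner product
outer = np.einsum('i,j->ij', A, B)   # distinct indices kept: outer product

assert np.allclose(total, np.sum(A))
assert np.allclose(product, A * B)
assert np.allclose(inner, np.inner(A, B))
assert np.allclose(outer, np.outer(A, B))
print(outer.shape)
```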

98. Consider a path described by two vectors (X,Y). How to sample it using equidistant samples? (★★★)

(Hint: np.cumsum, np.interp)

# Author: Bas Swinckels

phi = np.arange(0, 10*np.pi, 0.1)
a = 1
x = a*phi*np.cos(phi)
y = a*phi*np.sin(phi)

dr = (np.diff(x)**2 + np.diff(y)**2)**.5 # segment lengths
r = np.zeros_like(x)
r[1:] = np.cumsum(dr)                # integrate path
r_int = np.linspace(0, r.max(), 200) # regularly spaced positions along the path
x_int = np.interp(r_int, r, x)       # interpolate x at those positions
y_int = np.interp(r_int, r, y)

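99. Given an integer n and a 2D array X, select from X the rows that can be interpreted as draws from a multinomial distribution with n degrees, i.e. the rows that contain only integers and sum to n. (★★★)

(Hint: np.logical_and.reduce, np.mod)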

# Author: Evgeni Burovski

X = np.asarray([[1.0, 0.0, 3.0, 8.0],
                [2.0, 0.0, 1.0, 1.0],
                [1.5, 2.5, 1.0, 0.0]])
n = 4
M = np.logical_and.reduce(np.mod(X, 1) == 0, axis=-1)
M &= (X.sum(axis=-1) == n)
print(X[M])

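100. For a 1D array X, compute the bootstrapped 95% confidence interval of its mean (resample X with replacement N times, take the mean of each resample, then take percentiles of those means). (★★★)

(Hint: np.percentile)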

# Author: Jessica B. Hamrick

X = np.random.randn(100) # random 1D array
N = 1000 # number of bootstrap samples
idx = np.random.randint(0, X.size, (N, X.size))
means = X[idx].mean(axis=1)
confint = np.percentile(means, [2.5, 97.5])
print(confint)