Python NumPy: A Basic Tutorial
阿新 • Published: 2019-01-04
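1. Why learn NumPy?
- NumPy can run complex computations on a whole array at once, without the explicit loops that a list requires
- Its ndarray provides fast, array-based numerical operations
- It is a memory-efficient container that provides fast numerical operations
- It is a prerequisite for learning pandas
Evidence that NumPy outperforms lists: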
import numpy as np
my_arr = np.arange(1000000)
my_list = list(range(1000000))
%time for _ in range(10): my_arr2 = my_arr * 2
# Wall time: 25 ms
%time for _ in range(10): my_list2 = [x * 2 for x in my_list]
# Wall time: 933 ms
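The %time magic only works in IPython or Jupyter. As a rough sketch of the same comparison in a plain Python script, the standard-library timeit module can be used. The array size n below is an arbitrary choice (smaller than the original so it runs quickly), and the measured times will differ from machine to machine.

```python
import timeit

import numpy as np

n = 100_000  # assumption: smaller than the original 1,000,000 to keep the run short
my_arr = np.arange(n)
my_list = list(range(n))

# Time 10 repetitions of each approach.
t_arr = timeit.timeit(lambda: my_arr * 2, number=10)
t_list = timeit.timeit(lambda: [x * 2 for x in my_list], number=10)

print(f"ndarray: {t_arr * 1000:.2f} ms, list: {t_list * 1000:.2f} ms")

# Both approaches compute the same values.
same = (my_arr * 2).tolist() == [x * 2 for x in my_list]
print(same)
```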
2. Basic NumPy usage
2.1. Creating an np.ndarray
Note: a NumPy array can only hold elements of a single data type
# Method 1: np.array()
## 1-D
a = np.array([1,2,3])
a.shape
a.dtype # e.g. int32; other dtypes include bool, string, float
a.ndim
## 2-D
a = np.array([[0,1,2],[3,4,5]])
# Method 2: use a creation function (arange, linspace, ones, zeros, eye, diag, random)
a = np.arange(10)
a = np.linspace(0,1,6, endpoint=False)
a = np.ones((3,3))
a = np.zeros((3,3))
a = np.eye(3)
a = np.diag(np.array([1,2,3,4]))
a = np.triu(np.ones((3,3)),1)
# Method 3: Random values
np.random.seed(1234) # set the seed first so the values below are reproducible
a = np.random.rand(4) # uniform in [0,1)
a = np.random.randn(4) # Gaussian
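The single-type rule is easiest to see by passing mixed values to np.array(). A minimal sketch, with variable names chosen only for illustration: NumPy picks one dtype that can represent every element, so integers mixed with a float become floats, and numbers mixed with a string become strings.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])      # the integers are upcast to float
texty = np.array([1, "two", 3])    # every element becomes a string

print(ints.dtype)    # an integer dtype (the exact width depends on the platform)
print(mixed.dtype)   # float64
print(texty.dtype)   # a Unicode string dtype

# The dtype can also be requested explicitly.
as_float = np.array([1, 2, 3], dtype=np.float64)
print(as_float.dtype)  # float64
```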
2.2. Indexing and Slicing
- A slice creates a view on the original array (changes made through the view affect the original array)
# 1-D
a = np.arange(10)
a[5], a[-1] # Index: returns 5 and 9
a[5:8] = 12 # Slice: the elements at indices 5, 6 and 7 are set to 12
a[5:8].copy() # Copy of the slice, not a view
# 2-D
a = np.ones((3,3))
a[2] # third row (index 2)
a[2].copy() # copy of the row, not a view
a[0][2] # a single element, same as a[0, 2]
a[:2] # first two rows
a[:2, 1:] = 0
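The difference between a view and a copy can be checked directly. The following is a small sketch, using np.shares_memory to confirm whether two arrays share the same underlying data; the variable names are illustrative.

```python
import numpy as np

a = np.arange(10)
view = a[5:8]            # a view: shares memory with a
view[:] = 12             # writes through to a
print(a)                 # [ 0  1  2  3  4 12 12 12  8  9]

b = np.arange(10)
copied = b[5:8].copy()   # an independent copy
copied[:] = 12           # b is left untouched
print(b)                 # [0 1 2 3 4 5 6 7 8 9]

print(np.shares_memory(a, view))    # True
print(np.shares_memory(b, copied))  # False
```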
Boolean indexing
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.random.randn(7, 4)
data[names == 'Bob'] # select the rows of data where names equals 'Bob'
data[~(names == 'Bob')] # rows where names is not 'Bob'
data[(names == 'Bob') | (names == 'Will')] # rows where names is 'Bob' or 'Will'
data[data<0] = 0 # set every negative value to 0
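Because the data above is random, the effect of a mask is hard to read off. Here is a small deterministic sketch of the same idea, with a shortened names array and made-up data. It also shows that selecting with a boolean mask returns a copy, while assigning through a mask changes the array in place.

```python
import numpy as np

names = np.array(["Bob", "Joe", "Will", "Bob"])
data = np.arange(8).reshape(4, 2)   # rows: [0 1], [2 3], [4 5], [6 7]

mask = names == "Bob"               # [ True False False  True]
bob_rows = data[mask]               # rows 0 and 3
other_rows = data[~mask]            # rows 1 and 2

# Boolean selection returns a copy, so this does not modify data.
bob_rows[:] = -1

# Assigning through a mask does modify data in place.
data[data > 5] = 0                  # the 6 and 7 in the last row become 0
print(data)
```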
2.3. Universal Functions
A universal function (ufunc) is a function that performs element-wise operations on the data in ndarrays.
a = np.arange(10)
b = np.arange(2,12)
# unary (one input array)
a + 1
a*2
np.sqrt(a)
np.exp(a)
np.sin(a)
# binary
a>b # returns a boolean ndarray
np.array_equal(a,b) # True only if both arrays have the same shape and elements
np.maximum(a, b) # element-wise maximum of each pair of values
np.logical_or(a,b) # element-wise OR; in non-boolean arrays, nonzero values count as True
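To make the element-wise behaviour concrete, here is a short sketch on two small arrays with the results written out. The arrays are shortened versions of a and b above. Note that np.logical_or also accepts non-boolean input and treats zero as False.

```python
import numpy as np

a = np.arange(5)        # [0 1 2 3 4]
b = np.arange(2, 7)     # [2 3 4 5 6]

doubled = a * 2                          # [0 2 4 6 8]
larger = np.maximum(a, b)                # [2 3 4 5 6]
is_greater = a > b                       # [False False False False False]

# Zero counts as False, any nonzero value counts as True.
either = np.logical_or(a, np.zeros(5))   # [False  True  True  True  True]

# array_equal compares whole arrays and returns a single bool.
equal = np.array_equal(a + 2, b)         # True
print(doubled, larger, is_greater, either, equal)
```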
2.4. Array-oriented programming
- Problem 1
We want to evaluate the function `sqrt(x^2 + y^2)` across a regular grid of values.
The np.meshgrid function takes two 1D arrays and produces two 2D matrices corresponding to all pairs of (x, y) in the two arrays:
points = np.arange(-5, 5, 0.01) # 1000 equally spaced points
xs, ys = np.meshgrid(points, points)
z = np.sqrt(xs ** 2 + ys ** 2)
import matplotlib.pyplot as plt
%matplotlib inline
plt.imshow(z, cmap=plt.cm.gray); plt.colorbar()
plt.title(r"Image plot of $\sqrt{x^2 + y^2}$ for a grid of values") # raw string so that \s is not treated as an escape sequence
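To see what np.meshgrid actually returns, a tiny grid is easier to inspect than the 1000 x 1000 one. A minimal sketch, with the input arrays x and y chosen arbitrarily for illustration:

```python
import numpy as np

x = np.array([0, 1, 2])
y = np.array([10, 20])

xs, ys = np.meshgrid(x, y)
# Both outputs have shape (len(y), len(x)) = (2, 3):
# xs = [[ 0  1  2], [ 0  1  2]]   x repeated on every row
# ys = [[10 10 10], [20 20 20]]   y repeated down every column

z = np.sqrt(xs ** 2 + ys ** 2)    # the function evaluated at every (x, y) pair
print(xs.shape, ys.shape, z.shape)
```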
- Problem 2
We have two arrays (x, y) and one boolean array. We want to take the value from x where the boolean is True and from y where it is False. np.where() does this:
xarr = np.array([1.1, 1.2, 1.3, 1.4, 1.5])
yarr = np.array([2.1, 2.2, 2.3, 2.4, 2.5])
cond = np.array([True, False, True, True, False])
result = np.where(cond, xarr, yarr) # array([1.1, 2.2, 1.3, 1.4, 2.5])
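The second and third arguments of np.where can be arrays or scalars. With scalars, np.where can be used for replacement; for example, in a randomly generated array we can replace every value greater than 0 with 2 and every other value with -2: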
arr = np.random.randn(4, 4)
np.where(arr > 0, 2, -2) # values > 0 become 2, all others become -2
np.where(arr > 0, 2, arr) # values > 0 become 2, all others are left unchanged
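As a sketch of what np.where computes, the example below compares it with an equivalent (but slower) list comprehension, then repeats the scalar replacement on a small fixed array so the result can be checked by eye. The fixed values in arr are made up for illustration.

```python
import numpy as np

xarr = np.array([1.1, 1.2, 1.3, 1.4, 1.5])
yarr = np.array([2.1, 2.2, 2.3, 2.4, 2.5])
cond = np.array([True, False, True, True, False])

result = np.where(cond, xarr, yarr)
# Pure-Python equivalent: pick x where the condition holds, otherwise y.
slow = [x if c else y for x, y, c in zip(xarr, yarr, cond)]

arr = np.array([[1.5, -0.3],
                [-2.0, 0.7]])
signs = np.where(arr > 0, 2, -2)      # [[ 2 -2], [-2  2]]
capped = np.where(arr > 0, 2, arr)    # [[ 2.  -0.3], [-2.   2. ]]
print(result, signs, capped)
```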
2.5. Mathematical Operations
a = np.random.randn(5, 4)
np.mean(a)
np.mean(a, axis = 1)
np.sum(a)
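a.cumsum() # cumulative sum over the flattened array
a.sort() # sorts in place along the last axis
a.argmax() # index of the maximum (in the flattened array)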
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
np.unique(names)
sorted(set(names)) # pure-Python equivalent of np.unique(names)
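The axis argument and the flattened-index results are easier to follow on a small fixed array than on random data. A minimal sketch follows; the array values are made up, and np.unravel_index and np.sort are used here only to illustrate how to get a (row, column) position and a sorted copy.

```python
import numpy as np

a = np.array([[1, 5, 3],
              [4, 2, 6]])

total_mean = a.mean()         # 3.5, over all elements
col_means = a.mean(axis=0)    # [2.5 3.5 4.5], one value per column
row_means = a.mean(axis=1)    # [3. 4.], one value per row
running = a.cumsum()          # [ 1  6  9 13 15 21], over the flattened array

flat_idx = a.argmax()                              # 5, index into the flattened array
row, col = np.unravel_index(flat_idx, a.shape)     # row 1, column 2

sorted_copy = np.sort(a, axis=1)   # returns a sorted copy; a.sort() would sort a in place
print(total_mean, col_means, row_means, running, (row, col))
```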