Python NumPy: Basic Tutorial

1. Why learn NumPy?


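  • NumPy can run complex computations on an entire array without the explicit loops a list would need
  • Its ndarray provides fast, array-based numerical operations
  • It is a memory-efficient container that provides fast numerical operations
  • It is a prerequisite for learning pandas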

A timing comparison showing NumPy outperforming a list:

import numpy as np
my_arr = np.arange(1000000)
my_list = list(range(1000000))

# %time is an IPython/Jupyter magic command; the wall times are from the author's run
%time for _ in range(10): my_arr2 = my_arr * 2                  # Wall time: 25 ms
%time for _ in range(10): my_list2 = [x * 2 for x in my_list]   # Wall time: 933 ms
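
The memory-efficiency claim can be checked in the same spirit. The sketch below is an addition to the original timing test: it assumes CPython, fixes the dtype to int64 so the array size is predictable, and uses illustrative variable names (`arr_bytes`, `list_bytes`). The list figure is only a rough estimate.

```python
import sys
import numpy as np

n = 1_000_000
my_arr = np.arange(n, dtype=np.int64)   # dtype fixed so the size is predictable
my_list = list(range(n))

# ndarray: one contiguous buffer, 8 bytes per int64 element
arr_bytes = my_arr.nbytes

# list: an array of pointers plus one Python int object per element
# (rough estimate, CPython-specific)
list_bytes = sys.getsizeof(my_list) + sum(sys.getsizeof(x) for x in my_list)

print(arr_bytes, list_bytes)
```

On CPython the list estimate comes out several times larger than the array buffer, because every element is a separate Python object.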

2. Basic NumPy usage


2.1. Creating an np.ndarray

Note: a NumPy array holds elements of a single data type.
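
A small sketch of what the single-type rule means in practice: when the inputs are mixed, NumPy picks one common dtype for the whole array, and values stored into an existing array are cast to that array's dtype. The variable names here are illustrative only.

```python
import numpy as np

mixed_num = np.array([1, 2.5])   # int and float: everything becomes float
mixed_str = np.array([1, 'a'])   # int and str: everything becomes a string
print(mixed_num.dtype)           # float64
print(mixed_str.dtype.kind)      # 'U' (Unicode string)

ints = np.array([1, 2, 3])       # integer dtype
ints[0] = 2.9                    # cast to the array's integer dtype: fraction dropped
print(ints[0])                   # 2
```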

# Method 1: np.array()
## 1-D
a = np.array([1,2,3])
a.shape  # (3,)
a.dtype  # an integer dtype such as int64 (platform-dependent); other dtypes include bool, str, float
a.ndim   # 1

## 2-D
a = np.array([[0,1,2],[3,4,5]])

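# Method 2: create with functions (arange, linspace, ones, zeros, eye, diag, triu)
a = np.arange(10)                        # integers 0 through 9
a = np.linspace(0,1,6, endpoint=False)   # 6 evenly spaced points in [0, 1), stop excluded
a = np.ones((3,3))                       # 3x3 array of ones
a = np.zeros((3,3))                      # 3x3 array of zeros
a = np.eye(3)                            # 3x3 identity matrix
a = np.diag(np.array([1,2,3,4]))         # 4x4 matrix with 1, 2, 3, 4 on the diagonal
a = np.triu(np.ones((3,3)),1)            # upper triangle strictly above the diagonal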

# Method 3: Random values
a = np.random.rand(4)    # uniform in [0, 1)
a = np.random.randn(4)   # standard Gaussian (mean 0, variance 1)
np.random.seed(1234)     # fix the seed so later random draws are reproducible

2.2. Indexing and Slicing

  • A slice creates a view on the original array (changes made through it affect the original array)
# 1-D
a = np.arange(10)
a[5], a[-1]       # Index: 5, 9
a[5:8] = 12       # Slice: elements at indices 5, 6 and 7 are set to 12
a[5:8].copy()     # Independent copy of the slice, not a view

# 2-D
a = np.ones((3,3))
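a[2]           # third row (index 2)
a[2].copy()    # independent copy of that row, not a view
a[0][2]        # single element; equivalent to a[0, 2]

a[:2]          # first two rows
a[:2, 1:] = 0  # first two rows, columns 1 onward, set to 0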
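
To make the view-versus-copy distinction concrete, here is a minimal sketch. It uses `np.shares_memory` to check whether two arrays overlap in memory; the names `view` and `copy` are illustrative.

```python
import numpy as np

a = np.arange(10)

view = a[5:8]            # basic slicing returns a view
view[:] = 12             # writes through to a
print(a)                 # [ 0  1  2  3  4 12 12 12  8  9]

copy = a[5:8].copy()     # explicit copy owns its own data
copy[:] = -1             # a is not affected
print(a[5:8])            # [12 12 12]

print(np.shares_memory(a, view))   # True
print(np.shares_memory(a, copy))   # False
```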

Boolean indexing

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.random.randn(7, 4)

data[names == 'Bob']                          # rows of data where names equals 'Bob'
data[~(names == 'Bob')]                       # rows where names is not 'Bob'
data[(names == 'Bob') | (names == 'Will')]    # rows where names is 'Bob' or 'Will'
data[data<0] = 0                              # set every negative element to 0
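
Because the example above uses random data, the effect of the mask is hard to see. The sketch below repeats the idea with a small deterministic array (an assumption made for readability) and also shows that selecting with a boolean mask returns a copy, while assigning through the mask changes the original.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob'])
data = np.arange(8).reshape(4, 2)   # row i is [2*i, 2*i + 1]

mask = names == 'Bob'               # [ True False False  True]
bob_rows = data[mask]               # rows 0 and 3, returned as a copy
bob_rows[:] = -1                    # does not touch data
print(data[0])                      # [0 1]

data[mask] = 0                      # assignment through the mask modifies data
print(data)
```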

2.3. Universal Functions

A universal function (ufunc) is a function that performs element-wise operations on the data in ndarrays.

a = np.arange(10)  
b = np.arange(2,12)

# unary
a + 1
a*2 
np.sqrt(a)
np.exp(a)
np.sin(a)

# binary
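a>b                     # element-wise comparison, returns a boolean ndarray
np.array_equal(a,b)     # single bool: True only if shape and all elements match (not element-wise)
np.maximum(a, b)        # element-wise maximum of each pair of values
np.logical_or(a,b)      # element-wise OR; non-boolean inputs are treated by truthiness (nonzero is True)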
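
A short worked example of the binary operations, using small hand-picked arrays (chosen here for illustration) so the results can be read off directly:

```python
import numpy as np

a = np.array([0, 1, 2])
b = np.array([0, 0, 5])

gt = a > b                    # [False  True False]
mx = np.maximum(a, b)         # [0 1 5]
either = np.logical_or(a, b)  # [False  True  True], nonzero counts as True
same = np.array_equal(a, b)   # False, a single bool for the whole arrays

print(gt, mx, either, same)
```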

2.4. Array-oriented programming

  • Problem 1

We wish to evaluate the function `sqrt(x^2 + y^2)` across a regular grid of values.

The np.meshgrid function takes two 1D arrays and produces two 2D matrices corresponding to all pairs of (x, y) in the two arrays:

points = np.arange(-5, 5, 0.01) # 1000 equally spaced points
xs, ys = np.meshgrid(points, points)
z = np.sqrt(xs ** 2 + ys ** 2)

import matplotlib.pyplot as plt
%matplotlib inline

plt.imshow(z, cmap=plt.cm.gray); plt.colorbar()
plt.title(r"Image plot of $\sqrt{x^2 + y^2}$ for a grid of values")  # raw string so \s is not read as an escape
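
To see what np.meshgrid actually returns, it helps to try it on a tiny grid. The sketch below uses two hand-picked 1D arrays (an assumption for illustration, not the grid from the plot above):

```python
import numpy as np

x = np.array([0.0, 3.0])
y = np.array([0.0, 4.0])

xs, ys = np.meshgrid(x, y)   # both have shape (len(y), len(x))
print(xs)                    # [[0. 3.]
                             #  [0. 3.]]  x repeated along rows
print(ys)                    # [[0. 0.]
                             #  [4. 4.]]  y repeated along columns

z = np.sqrt(xs ** 2 + ys ** 2)
print(z)                     # [[0. 3.]
                             #  [4. 5.]]
```

Each entry z[i, j] is the function evaluated at the pair (x[j], y[i]), with no explicit loop.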

  • Problem 2

We have two arrays (x and y) and one boolean array. We want to take the value from x where the boolean is True and the value from y where it is False. np.where() does this:

xarr = np.array([1.1, 1.2, 1.3, 1.4, 1.5])
yarr = np.array([2.1, 2.2, 2.3, 2.4, 2.5])
cond = np.array([True, False, True, True, False])
result = np.where(cond, xarr, yarr)        # array([1.1, 2.2, 1.3, 1.4, 2.5])

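The second and third arguments of np.where can be arrays or scalars. With scalars, np.where can be used for replacement. For example, in a randomly generated array we can replace the values greater than 0 with 2 and all other values with -2: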

arr = np.random.randn(4, 4)
np.where(arr > 0, 2, -2)     # 2 where the value is greater than 0, otherwise -2
np.where(arr > 0, 2, arr)    # 2 where the value is greater than 0, otherwise unchanged
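
The same two calls on a fixed array (chosen here for illustration) make the result easy to verify by eye. Note that an exact 0 fails the `arr > 0` test and therefore takes the "otherwise" branch.

```python
import numpy as np

arr = np.array([[-1.5, 0.0],
                [ 2.5, -3.0]])

signs = np.where(arr > 0, 2, -2)      # scalar for both branches
print(signs)                          # [[-2 -2]
                                      #  [ 2 -2]]

clipped = np.where(arr > 0, 2, arr)   # scalar for True, original value for False
print(clipped)                        # [[-1.5  0. ]
                                      #  [ 2.  -3. ]]
```

np.where returns a new array in both cases; `arr` itself is left unchanged.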

2.5. Mathematical Operations

a = np.random.randn(5, 4)  
np.mean(a)
np.mean(a, axis = 1)
np.sum(a)
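a.cumsum()  # cumulative sum over the flattened array
a.sort()    # sorts in place along the last axis (each row)
a.argmax()  # index of the maximum in the flattened array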

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
np.unique(names)      # sorted unique values as an ndarray
sorted(set(names))    # pure-Python equivalent, returns a list
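
Since the block above runs on random data, here is a sketch of the same operations on a small fixed array (picked for illustration) to show what the axis argument and the flattened index mean. `np.unravel_index` converts the flat index from argmax back into a (row, column) pair.

```python
import numpy as np

a = np.array([[3, 1, 2],
              [6, 5, 4]])

total = a.sum()               # 21, over all elements
row_means = a.mean(axis=1)    # [2. 5.], one mean per row
col_sums = a.sum(axis=0)      # [9 6 6], one sum per column
running = a.cumsum()          # [ 3  4  6 12 17 21], over the flattened array

flat_idx = a.argmax()                          # 3, position of 6 in the flattened array
row, col = np.unravel_index(flat_idx, a.shape) # (1, 0)

a.sort()                      # in place, each row sorted independently
print(a)                      # [[1 2 3]
                              #  [4 5 6]]
```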