Andrew Ng, Neural Networks and Deep Learning: Week 3 Programming Assignment
Since CSDN's markdown editor is very hard to use, this article has been moved here.
Note
These are my personal programming assignments for the third week of the course neural-networks-deep-learning; the copyright belongs to deeplearning.ai.
Planar data classification with one hidden layer
1 Packages
Let’s first import all the packages that you will need during this assignment.
- numpy is the fundamental package for scientific computing with Python.
- sklearn provides simple and efficient tools for data mining and data analysis.
- matplotlib is a library for plotting graphs in Python.
- testCases_v2 provides some test examples to assess the correctness of your functions.
- planar_utils provides various useful functions used in this assignment.
```python
# Package imports
import numpy as np
import matplotlib.pyplot as plt
import sklearn
import sklearn.datasets
import sklearn.linear_model
from testCases_v2 import *
from planar_utils import plot_decision_boundary, sigmoid, load_planar_dataset, load_extra_datasets

%matplotlib inline
```
You can get the support code from here.
2 Dataset
First, let’s get the dataset you will work on. The following code will load a “flower” 2-class dataset into variables X and Y.
```python
def load_planar_dataset():
    # Generate two interleaved "petal" classes
    np.random.seed(1)
    m = 400                              # total number of examples
    N = int(m / 2)                       # number of points per class (two classes)
    D = 2                                # dimensionality of the data
    X = np.zeros((m, D))                 # data matrix where each row is a single example
    Y = np.zeros((m, 1), dtype='uint8')  # labels vector (0 or 1)
    a = 4                                # maximum ray of the flower

    for j in range(2):
        ix = range(N * j, N * (j + 1))   # indices of the examples in class j
        t = np.linspace(j * 3.12, (j + 1) * 3.12, N) + np.random.randn(N) * 0.2  # theta
        r = a * np.sin(4 * t) + np.random.randn(N) * 0.2                         # radius
        X[ix] = np.c_[r * np.sin(t), r * np.cos(t)]  # np.c_ stacks the two columns into an (m, 2) matrix
        Y[ix] = j

    X = X.T  # (2, m)
    Y = Y.T  # (1, m)
    return X, Y
```
Visualize the dataset using matplotlib. The data looks like a “flower” with some red (label y=0) and some blue (y=1) points. Your goal is to build a model to fit this data.
```python
X, Y = load_planar_dataset()
plt.scatter(X[0, :], X[1, :], c=np.squeeze(Y), s=40, cmap=plt.cm.Spectral)
plt.show()
```
You have:
- a numpy-array (matrix) X that contains your features (x1, x2)
- a numpy-array (vector) Y that contains your labels (red:0, blue:1).
Let's first get a better sense of what our data looks like.
Exercise: How many training examples do you have? In addition, what is the shape of the variables X and Y?
Hint: How do you get the shape of a numpy array? (help)
```python
### START CODE HERE ### (≈ 3 lines of code)
shape_X = X.shape
shape_Y = Y.shape
m = X.shape[1]  # training set size
### END CODE HERE ###

print('The shape of X is: ' + str(shape_X))
print('The shape of Y is: ' + str(shape_Y))
print('I have m = %d training examples!' % (m))
```
The shape of X is: (2, 400)
The shape of Y is: (1, 400)
I have m = 400 training examples!
3 Simple Logistic Regression
Before building a full neural network, let's first see how logistic regression performs on this problem. You can use sklearn's built-in functions to do that. Run the code below to train a logistic regression classifier on the dataset.
```python
# Train the logistic regression classifier
clf = sklearn.linear_model.LogisticRegressionCV()
clf.fit(X.T, np.squeeze(Y.T))
```
You can now plot the decision boundary of this model. Run the code below.
```python
# Plot the decision boundary for logistic regression
plot_decision_boundary(lambda x: clf.predict(x), X, Y)
plt.title("Logistic Regression")

# Print accuracy
LR_predictions = clf.predict(X.T)
print('Accuracy of logistic regression: %d ' % float((np.dot(Y, LR_predictions) +
      np.dot(1 - Y, 1 - LR_predictions)) / float(Y.size) * 100) +
      '% ' + "(percentage of correctly labelled datapoints)")
```
Accuracy of logistic regression: 47 % (percentage of correctly labelled datapoints)
Interpretation: the dataset is not linearly separable, so logistic regression does not perform well. Hopefully a neural network will do better.
plot_decision_boundary:
```python
def plot_decision_boundary(model, X, y):
    # Set min and max values and give it some padding
    x_min, x_max = X[0, :].min() - 1, X[0, :].max() + 1
    y_min, y_max = X[1, :].min() - 1, X[1, :].max() + 1
    h = 0.01
    # Generate a grid of points with distance h between them
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    # Predict the function value for the whole grid
    Z = model(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    # Plot the contour and training examples
    plt.contourf(xx, yy, Z, cmap=plt.cm.Spectral)
    plt.ylabel('x2')
    plt.xlabel('x1')
    plt.scatter(X[0, :], X[1, :], c=np.squeeze(y), cmap=plt.cm.Spectral)
```
4 Neural Network model
Logistic regression did not work well on the “flower dataset”. You are going to train a Neural Network with a single hidden layer.
Mathematically:
For one example $x^{(i)}$:

$$z^{[1](i)} = W^{[1]} x^{(i)} + b^{[1](i)}\tag{1}$$

$$a^{[1](i)} = \tanh(z^{[1](i)})\tag{2}$$

$$z^{[2](i)} = W^{[2]} a^{[1](i)} + b^{[2](i)}\tag{3}$$

$$\hat{y}^{(i)} = a^{[2](i)} = \sigma(z^{[2](i)})\tag{4}$$

$$y^{(i)}_{prediction} = \begin{cases} 1 & \mbox{if } a^{[2](i)} > 0.5 \\ 0 & \mbox{otherwise} \end{cases}\tag{5}$$
Given the predictions on all the examples, you can also compute the cost J as follows:
$$J = -\frac{1}{m} \sum_{i=1}^{m} \left( y^{(i)} \log\left(a^{[2](i)}\right) + (1 - y^{(i)}) \log\left(1 - a^{[2](i)}\right) \right)\tag{6}$$
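To make equations (1)–(6) concrete, here is a minimal numpy sketch of the forward pass and the cost, vectorized over all $m$ examples at once (each column stands for one example, replacing the per-example index $(i)$). This is an illustrative sketch under assumed shapes, not the graded implementation; `forward_and_cost` is a made-up name, and you could reuse the `sigmoid` imported from planar_utils instead of the local one.

```python
import numpy as np

def sigmoid(z):
    # Plain sigmoid; planar_utils ships an equivalent helper
    return 1 / (1 + np.exp(-z))

def forward_and_cost(X, Y, W1, b1, W2, b2):
    """Sketch of equations (1)-(5) and the cost (6) for X of shape (n_x, m)."""
    Z1 = np.dot(W1, X) + b1   # (n_h, m), equation (1)
    A1 = np.tanh(Z1)          # (n_h, m), equation (2)
    Z2 = np.dot(W2, A1) + b2  # (1, m),   equation (3)
    A2 = sigmoid(Z2)          # (1, m),   equation (4)

    m = Y.shape[1]
    # Equation (6): cross-entropy cost averaged over the m examples
    cost = -np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m

    predictions = (A2 > 0.5).astype(int)  # equation (5)
    return A2, cost, predictions
```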
Reminder:
The general methodology to build a Neural Network is to:
- Define the neural network structure (# of input units, # of hidden units, etc.).
- Initialize the model’s parameters
- Loop:
- Implement forward propagation
- Compute loss
- Implement backward propagation to get the gradients
- Update parameters (gradient descent)
You often build helper functions to compute steps 1-3 and then merge them into one function called nn_model(). Once you've built nn_model() and learnt the right parameters, you can make predictions on new data. A sketch of this overall structure follows.
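As a preview of how the pieces fit together, here is a hedged skeleton of nn_model(). It assumes the helper functions the assignment has you build (`initialize_parameters`, `forward_propagation`, `compute_cost`, `backward_propagation`, `update_parameters`) with plausible signatures; the exact signatures in your notebook may differ slightly, so treat this as a structural sketch rather than the graded solution.

```python
def nn_model(X, Y, n_h, num_iterations=10000, print_cost=False):
    # Step 1: define the network structure and initialize the parameters
    n_x, _, n_y = layer_sizes(X, Y)
    parameters = initialize_parameters(n_x, n_h, n_y)

    # Steps 2-4: the gradient descent loop
    for i in range(num_iterations):
        A2, cache = forward_propagation(X, parameters)         # forward pass
        cost = compute_cost(A2, Y)                             # cross-entropy cost
        grads = backward_propagation(parameters, cache, X, Y)  # gradients
        parameters = update_parameters(parameters, grads)      # gradient descent step

        if print_cost and i % 1000 == 0:
            print("Cost after iteration %i: %f" % (i, cost))

    return parameters
```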
4.1 Defining the neural network structure
Exercise: Define three variables:
- n_x: the size of the input layer
- n_h: the size of the hidden layer (set this to 4)
- n_y: the size of the output layer
Hint: Use the shapes of X and Y to find n_x and n_y. Also, hard-code the hidden layer size to be 4.
```python
# GRADED FUNCTION: layer_sizes

def layer_sizes(X, Y):
    """
    Arguments:
    X -- input dataset of shape (input size, number of examples)
    Y -- labels of shape (output size, number of examples)

    Returns:
    n_x -- the size of the input layer
    n_h -- the size of the hidden layer
    n_y -- the size of the output layer
    """
    ### START CODE HERE ### (≈ 3 lines of code)
    n_x = X.shape[0]  # size of input layer
    n_h = 4
    n_y = Y.shape[0]  # size of output layer
    ### END CODE HERE ###
    return (n_x, n_h, n_y)
```
```python
X_assess, Y_assess = layer_sizes_test_case()
(n_x, n_h, n_y) = layer_sizes(X_assess, Y_assess)
print("The size of the input layer is: n_x = " + str(n_x))
print("The size of the hidden layer is: n_h = " + str(n_h))
print("The size of the output layer is: n_y = " + str(n_y))
```
The size of the input layer is: n_x = 5
The size of the hidden layer is: n_h = 4
The size of the output layer is: n_y = 2
(These values come from the arrays in layer_sizes_test_case, not from the flower dataset, where n_x = 2 and n_y = 1.)
4.2 Initialize the model’s parameters
Exercise: Implement the function initialize_parameters().
Instructions:
- Make sure your parameters’ sizes are right. Refer to the neural network figure above if needed.
- You will initialize the weights matrices with random values. Use `np.random.randn(a,b) * 0.01` to randomly initialize a matrix of shape (a, b).
- You will initialize the bias vectors as zeros. Use `np.zeros((a,b))` to initialize a matrix of shape (a, b) with zeros.
```python
# GRADED FUNCTION: initialize_parameters

def initialize_parameters(n_x, n_h, n_y):
    """
    Argument:
    n_x -- size of the input layer
    n_h -- size of the hidden layer
    n_y -- size of the output layer

    Returns:
    parameters -- python dictionary containing the parameters:
                    W1 -- weight matrix of shape (n_h, n_x)
                    b1 -- bias vector of shape (n_h, 1)
                    W2 -- weight matrix of shape (n_y, n_h)
                    b2 -- bias vector of shape (n_y, 1)
    """
    np.random.seed(2)  # seed fixed so that the output matches the expected values

    ### START CODE HERE ### (≈ 4 lines of code)
    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))
    ### END CODE HERE ###

    assert (W1.shape == (n_h, n_x))
    assert (b1.shape == (n_h, 1))
    assert (W2.shape == (n_y, n_h))
    assert (b2.shape == (n_y, 1))

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}

    return parameters
```
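As with layer_sizes above, you can sanity-check the initialization against the test case from testCases_v2 (the course support files include an initialize_parameters_test_case helper; if your copy does not, pick small sizes by hand):

```python
n_x, n_h, n_y = initialize_parameters_test_case()
parameters = initialize_parameters(n_x, n_h, n_y)
print("W1 = " + str(parameters["W1"]))
print("b1 = " + str(parameters["b1"]))
print("W2 = " + str(parameters["W2"]))
print("b2 = " + str(parameters["b2"]))
```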