
week2_Part1&2_LR with a Neural Network mindset_yhd

My study approach so far: 1. Watch each week's video lectures once or twice. 2. Take notes.

3. Do each week's programming assignments — this is where the real value lies. Once you have mastered them, be sure to type them out yourself, so you can use the techniques fluently later.

If you need the full set of assignment notebooks and data, leave a comment or add me on WeChat: yuhaidong112.

Binary classification

  • Probability that the predicted output is the positive class (+1):
    $$\hat{y} = P(y=1 \mid x)\tag{1}$$
    $$P(y \mid x) = \hat{y}^{\,y}\,(1-\hat{y})^{(1-y)}\tag{2}$$
    $$sigmoid(z) = \sigma(z) = \frac{1}{1 + e^{-z}}\tag{3}$$

  • Forward propagation for a single sample:
    $$a = \hat{y} = \sigma(w^T x + b)\tag{1}$$
    $$a' = a(1-a)\tag{2}$$
    $$\mathcal{L}(a, y) = -y \log(a) - (1-y) \log(1-a)\tag{3}$$

  • Backward propagation for a single sample:
    $$\mathrm{d}a = \frac{\partial \mathcal{L}}{\partial a} = -\frac{y}{a} + \frac{1-y}{1-a}\tag{1}$$
    $$\mathrm{d}z = \frac{\partial \mathcal{L}}{\partial a} \times \frac{\partial a}{\partial z} = a - y\tag{2}$$
    $$\mathrm{d}w = \mathrm{d}z \times \frac{\partial z}{\partial w} = (a-y)\,x\tag{3}$$
    $$\mathrm{d}b = \mathrm{d}z \times \frac{\partial z}{\partial b} = a - y\tag{4}$$

  • Cost function and backward propagation over m samples:
    $$J = \frac{1}{m} \sum_{i=1}^m \mathcal{L}(a^{(i)}, y^{(i)}) = -\frac{1}{m} \sum_{i=1}^m \left( y^{(i)} \log(a^{(i)}) + (1-y^{(i)}) \log(1-a^{(i)}) \right)\tag{1}$$
    $$\mathrm{d}w = \frac{\partial J}{\partial w} = \frac{1}{m} \sum_{i=1}^m \left[ x^{(i)} (a^{(i)} - y^{(i)}) \right] = \frac{1}{m} X (A-Y)^T\tag{2}$$
    $$\mathrm{d}b = \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^m (a^{(i)} - y^{(i)})\tag{3}$$

  • Gradient descent (a vectorized sketch of these updates follows below):
    $$w = w - \alpha\,\mathrm{d}w\tag{1}$$
    $$b = b - \alpha\,\mathrm{d}b\tag{2}$$
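To make these formulas concrete, here is a minimal numpy sketch of one vectorized gradient-descent step for logistic regression. It assumes X has shape (n_x, m) with one training sample per column and Y has shape (1, m); the function and variable names (gradient_descent_step, w, b, alpha) are chosen here to mirror the equations, not taken from the assignment code.

import numpy as np

def sigmoid(z):
    return 1.0 / (1 + np.exp(-z))

def gradient_descent_step(w, b, X, Y, alpha):
    """One vectorized update. w: (n_x, 1), b: scalar, X: (n_x, m), Y: (1, m)."""
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)                               # forward pass, shape (1, m)
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m   # cost J, eq (1)
    dw = np.dot(X, (A - Y).T) / m                                 # eq (2): (1/m) X (A-Y)^T
    db = np.sum(A - Y) / m                                        # eq (3)
    w = w - alpha * dw                                            # gradient-descent updates
    b = b - alpha * db
    return w, b, cost

# toy check: 2 features, 3 samples, zero initialization
w, b = np.zeros((2, 1)), 0.0
X = np.array([[1.0, 2.0, -1.0], [0.5, -1.5, 2.0]])
Y = np.array([[1, 0, 1]])
w, b, cost = gradient_descent_step(w, b, X, Y, alpha=0.1)
print(round(cost, 4))  # 0.6931, i.e. log(2), as expected when every prediction starts at 0.5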

Part 1: Python Basics with Numpy (optional assignment)

Building basic functions with numpy

Numpy is the main package for scientific computing in Python. It is maintained by a large community (www.numpy.org). In this exercise you will learn several key numpy functions such as np.exp, np.log, and np.reshape. You will need to know how to use these functions for future assignments.

1.1 - sigmoid function, np.exp()

Exercise: Build a function that returns the sigmoid of a real number x. Use math.exp(x) for the exponential function.

Reminder:

$sigmoid(x) = \frac{1}{1 + e^{-x}}$ is sometimes also known as the logistic function. It is a non-linear function used not only in Machine Learning (Logistic Regression), but also in Deep Learning.


To refer to a function belonging to a specific package you could call it using package_name.function(). Run the code below to see an example with math.exp().

# GRADED FUNCTION: basic_sigmoid
import math

def basic_sigmoid(x):
    """
    Compute sigmoid of x.

    Arguments:
    x -- A scalar

    Returns:
    s -- sigmoid(x)
    """
    s = 1.0 / (1 + math.exp(-x))
    return s

basic_sigmoid(3)  # 0.9525741268224334

Actually, we rarely use the “math” library in deep learning because the inputs of the functions are real numbers. In deep learning we mostly use matrices and vectors. This is why numpy is more useful.

One reason why we use “numpy” instead of “math” in Deep Learning

x = [1, 2, 3]
basic_sigmoid(x)  # you will see this give an error when you run it, because x is a vector.

In fact, if $x = (x_1, x_2, ..., x_n)$ is a row vector then np.exp(x) will apply the exponential function to every element of x. The output will thus be: $np.exp(x) = (e^{x_1}, e^{x_2}, ..., e^{x_n})$

import numpy as np

# example of np.exp()
x = np.array([1, 2, 3])
print(np.exp(x))  # [ 2.71828183  7.3890561  20.08553692]

Furthermore, if x is a vector, then a Python operation such as $s = x + 3$ or $s = \frac{1}{x}$ will output s as a vector of the same size as x.

# example of vector operation
x = np.array([1, 2, 3])
print(x + 3)  # [4 5 6]

Exercise: Implement the sigmoid function using numpy.

Instructions: x could now be either a real number, a vector, or a matrix. The data structures we use in numpy to represent these shapes (vectors, matrices…) are called numpy arrays. You don’t need to know more for now.
$$\text{For } x \in \mathbb{R}^n, \quad sigmoid(x) = sigmoid\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} \frac{1}{1+e^{-x_1}} \\ \frac{1}{1+e^{-x_2}} \\ \vdots \\ \frac{1}{1+e^{-x_n}} \end{pmatrix}$$
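A minimal numpy implementation for this exercise, written as a sketch that mirrors the basic_sigmoid pattern above (the output values in the comment are computed directly from the formula):

# GRADED FUNCTION: sigmoid
import numpy as np

def sigmoid(x):
    """
    Compute the sigmoid of x.

    Arguments:
    x -- A scalar or numpy array of any size

    Returns:
    s -- sigmoid(x), same shape as x
    """
    s = 1.0 / (1 + np.exp(-x))   # np.exp broadcasts element-wise over arrays
    return s

print(sigmoid(np.array([1, 2, 3])))  # [0.73105858 0.88079708 0.95257413]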
