Deep Learning: Implementing Handwritten Digit Recognition with a Multilayer Perceptron (MLP)
For deep learning I have been reading Neural Networks and Deep Learning, a book my advisor recommended, and it really is excellent. The code in this post also comes from it. I have recently been learning PyTorch, and I still think it is worth typing the code out by hand while studying, just as I did when learning machine learning. I also hope that working through this code deepens my understanding of the principle of backpropagation.
In the code below, the trickiest part is the mini_batch handling, so we need a clear picture of it. In the previous post we derived the four backpropagation formulas, and at the end of that post we explained how to use them.
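As a quick reminder, here are the four formulas in the book's notation ($\delta^l$ is the error vector of layer $l$, $\odot$ is the elementwise Hadamard product, and $L$ is the final layer):

$$\text{BP1:}\quad \delta^L = \nabla_a C \odot \sigma'(z^L)$$
$$\text{BP2:}\quad \delta^l = \left((w^{l+1})^T \delta^{l+1}\right) \odot \sigma'(z^l)$$
$$\text{BP3:}\quad \frac{\partial C}{\partial b^l_j} = \delta^l_j$$
$$\text{BP4:}\quad \frac{\partial C}{\partial w^l_{jk}} = a^{l-1}_k \, \delta^l_j$$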
So keep this workflow firmly in mind; it consists of five steps:
1. Input the training set
2. Feed the input forward through the network
3. Compute the error produced at the output layer
4. Backpropagate the error
5. Update the parameters with gradient descent
In the code below we will use these numbers (1 to 5) to indicate which step the current piece of code implements.
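As a warm-up, here is a compressed, self-contained sketch of the five steps on a single training example, using a toy 2-3-1 network (the toy sizes, the fixed target y and the learning rate 0.5 are made up for illustration; the real MNIST network is built step by step below):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((3, 2)), rng.standard_normal((3, 1))
W2, b2 = rng.standard_normal((1, 3)), rng.standard_normal((1, 1))
x, y = rng.standard_normal((2, 1)), np.array([[1.0]])   # 1. input a training example
a1 = sigmoid(W1 @ x + b1)                                # 2. feed forward
a2 = sigmoid(W2 @ a1 + b2)
delta2 = (a2 - y) * a2 * (1 - a2)                        # 3. output-layer error (quadratic cost)
delta1 = (W2.T @ delta2) * a1 * (1 - a1)                 # 4. backpropagate the error
eta = 0.5                                                # 5. gradient-descent update
W2 -= eta * (delta2 @ a1.T)
b2 -= eta * delta2
W1 -= eta * (delta1 @ x.T)
b1 -= eta * delta1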
import random
import mnist_loader
import numpy as np
class Network(object):
    def __init__(self, sizes):
        """The list ``sizes`` contains the number of neurons in the
        respective layers of the network.  For example, if the list
        was [2, 3, 1] then it would be a three-layer network, with the
        first layer containing 2 neurons, the second layer 3 neurons,
        and the third layer 1 neuron.  The biases and weights for the
        network are initialized randomly, using a Gaussian
        distribution with mean 0, and variance 1.  Note that the first
        layer is assumed to be an input layer, and by convention we
        won't set any biases for those neurons, since biases are only
        ever used in computing the outputs from later layers."""
        self.num_layers = len(sizes)
        self.sizes = sizes
        self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
        self.weights = [np.random.randn(y, x)
                        for x, y in zip(sizes[:-1], sizes[1:])]
The feedforward method:
    def feedforward(self, a):
        """Return the output of the network if ``a`` is input."""
        for b, w in zip(self.biases, self.weights):
            a = sigmoid(np.dot(w, a)+b)
        return a
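To get a feel for the shapes involved, here is a quick check (the [2, 3, 1] sizes are only an illustration, not the MNIST network built at the end of the post):

net = Network([2, 3, 1])
print([w.shape for w in net.weights])                  # [(3, 2), (1, 3)]
print([b.shape for b in net.biases])                   # [(3, 1), (1, 1)]
print(net.feedforward(np.random.randn(2, 1)).shape)    # (1, 1)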
The stochastic gradient descent method:
    def SGD(self, training_data, epochs, mini_batch_size, eta,
            test_data=None):
        """Train the neural network using mini-batch stochastic
        gradient descent.  The ``training_data`` is a list of tuples
        ``(x, y)`` representing the training inputs and the desired
        outputs.  The other non-optional parameters are
        self-explanatory.  If ``test_data`` is provided then the
        network will be evaluated against the test data after each
        epoch, and partial progress printed out.  This is useful for
        tracking progress, but slows things down substantially."""
        training_data = list(training_data)
        n = len(training_data)
        if test_data:
            test_data = list(test_data)
            n_test = len(test_data)
Shuffle the data, then slice it into chunks of size mini_batch_size. This is where the update_mini_batch method is called, which holds the core of backpropagation; the evaluate method is also used here, and it in turn calls feedforward. (A small demo of the slicing idiom follows right after this method.)
        for j in range(epochs):
            random.shuffle(training_data)
            mini_batches = [
                training_data[k:k+mini_batch_size]
                for k in range(0, n, mini_batch_size)]
            for mini_batch in mini_batches:
                self.update_mini_batch(mini_batch, eta)
            if test_data:
                print("Epoch {} : {} / {}".format(j, self.evaluate(test_data), n_test))
            else:
                print("Epoch {} complete".format(j))
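The slicing idiom above is easy to check on a toy list (a standalone illustration, not part of the network code). Note that the last mini-batch may be smaller than mini_batch_size, which is fine because update_mini_batch divides by len(mini_batch):

data = list(range(10))
random.shuffle(data)
batches = [data[k:k+3] for k in range(0, len(data), 3)]
print([len(b) for b in batches])   # [3, 3, 3, 1]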
The heart of this code is the update_mini_batch method:
    def update_mini_batch(self, mini_batch, eta):
        """Update the network's weights and biases by applying
        gradient descent using backpropagation to a single mini batch.
        The ``mini_batch`` is a list of tuples ``(x, y)``, and ``eta``
        is the learning rate."""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        # Accumulate the gradient of each training example (backprop
        # carries out steps 1 to 4 for a single example).
        for x, y in mini_batch:
            delta_nabla_b, delta_nabla_w = self.backprop(x, y)
            nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
            nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
The code below is step 5: update the parameters with gradient descent. (The update rule is written out right after this method.)
        self.weights = [w-(eta/len(mini_batch))*nw
                        for w, nw in zip(self.weights, nabla_w)]
        self.biases = [b-(eta/len(mini_batch))*nb
                       for b, nb in zip(self.biases, nabla_b)]
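The two list comprehensions above implement the averaged gradient-descent update over a mini-batch of size $m$:

$$w \rightarrow w - \frac{\eta}{m} \sum_x \frac{\partial C_x}{\partial w}, \qquad b \rightarrow b - \frac{\eta}{m} \sum_x \frac{\partial C_x}{\partial b}$$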
    def backprop(self, x, y):
        """Return a tuple ``(nabla_b, nabla_w)`` representing the
        gradient for the cost function C_x.  ``nabla_b`` and
        ``nabla_w`` are layer-by-layer lists of numpy arrays, similar
        to ``self.biases`` and ``self.weights``."""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        # Step 2: feed forward, storing all the z vectors and
        # activations layer by layer.
        activation = x
        activations = [x]
        zs = []
        for b, w in zip(self.biases, self.weights):
            z = np.dot(w, activation)+b
            zs.append(z)
            activation = sigmoid(z)
            activations.append(activation)
The code below is step 3: compute the error produced at the output layer (backpropagation formula 1).
        delta = self.cost_derivative(activations[-1], y) * \
            sigmoid_prime(zs[-1])
The line below is backpropagation formula 3.
        nabla_b[-1] = delta
The line below is backpropagation formula 4.
        nabla_w[-1] = np.dot(delta, activations[-2].transpose())
        # Step 4: propagate the error backwards through the remaining
        # layers (formula 2) and collect the gradients (formulas 3 and 4).
        for l in range(2, self.num_layers):
            z = zs[-l]
            sp = sigmoid_prime(z)
            delta = np.dot(self.weights[-l+1].transpose(), delta) * sp
            nabla_b[-l] = delta
            nabla_w[-l] = np.dot(delta, activations[-l-1].transpose())
        return (nabla_b, nabla_w)
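A handy way to convince yourself that backprop is implemented correctly is a numerical gradient check. Here is a minimal sketch (the helper numerical_grad_b and the toy [2, 3, 1] sizes are my own illustration, not part of the book's code); it compares the analytic gradient of one output-layer bias with a central finite-difference estimate of the quadratic cost C_x = ½‖a − y‖²:

def numerical_grad_b(net, x, y, layer, i, eps=1e-5):
    """Finite-difference estimate of dC_x/db for a single bias."""
    def cost():
        a = net.feedforward(x)
        return 0.5 * np.sum((a - y) ** 2)
    original = net.biases[layer][i, 0]
    net.biases[layer][i, 0] = original + eps
    c_plus = cost()
    net.biases[layer][i, 0] = original - eps
    c_minus = cost()
    net.biases[layer][i, 0] = original
    return (c_plus - c_minus) / (2 * eps)

net = Network([2, 3, 1])
x, y = np.random.randn(2, 1), np.array([[1.0]])
nabla_b, nabla_w = net.backprop(x, y)
print(nabla_b[1][0, 0], numerical_grad_b(net, x, y, layer=1, i=0))  # the two numbers should agree closely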
    def evaluate(self, test_data):
        """Return the number of test inputs for which the neural
        network outputs the correct result. Note that the neural
        network's output is assumed to be the index of whichever
        neuron in the final layer has the highest activation."""
        test_results = [(np.argmax(self.feedforward(x)), y)
                        for (x, y) in test_data]
        return sum(int(x == y) for (x, y) in test_results)
    def cost_derivative(self, output_activations, y):
        """Return the vector of partial derivatives \partial C_x /
        \partial a for the output activations."""
        return (output_activations-y)
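cost_derivative is simply the gradient of the quadratic cost used throughout this post, $C_x = \frac{1}{2}\lVert a - y\rVert^2$, with respect to the output activations:

$$\frac{\partial C_x}{\partial a} = a - y$$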
def sigmoid(z):
    """The sigmoid function."""
    return 1.0/(1.0+np.exp(-z))
def sigmoid_prime(z):
    """Derivative of the sigmoid function."""
    return sigmoid(z)*(1-sigmoid(z))
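A quick sanity check of the derivative (just an illustration, not part of the original script): at z = 0 we have σ(0) = 0.5, so σ'(0) should be 0.25, and a finite-difference estimate agrees:

print(sigmoid_prime(0.0))                          # 0.25
print((sigmoid(1e-6) - sigmoid(-1e-6)) / 2e-6)     # ≈ 0.25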
if __name__ == '__main__':
    training_data, validation_data, test_data = mnist_loader.load_data_wrapper()
    net = Network([784, 30, 10])
    net.SGD(training_data, 30, 10, 100.0, test_data=test_data)