Notes on A Beginner's Guide To Understanding Convolutional Neural Networks, Part One
Original article: https://adeshpande3.github.io/adeshpande3.github.io/A-Beginner‘s-Guide-To-Understanding-Convolutional-Neural-Networks/
These notes use the article to build an initial understanding of Convolutional Neural Networks (CNNs).
Image Classification
Image classification is the task of taking an input image and outputting a class (a dog, a cat, etc.) or a probability of classes that best describes the image.
Inputs and Outputs
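When a computer sees an image, it sees an array of pixel values, e.g. a 32*32*3 array for a 32*32 image with three RGB (red, green, blue) channels.
/**** Supplement ****/
Single-channel image: commonly called a grayscale image. Each pixel has a single value for its color, between 0 and 255 (0 is black, 255 is white, and the values in between are different shades of gray).
Three-channel image (RGB): each pixel has three values, and the red, green, and blue channel values vary and combine to produce all kinds of colors. A three-channel grayscale image is one in which all three channels hold the same value.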
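As a small illustration of these representations, here is a sketch using plain Python lists. The 2*2 image and the helper names `shape` and `to_three_channel_gray` are made up for this example, and averaging the three channels is only one simple way to obtain a gray level.

```python
# A tiny 2*2 RGB image stored as height * width * channels, values in 0-255.
rgb_image = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [255, 255, 255]],
]


def shape(image):
    """Return (height, width, channels) of a nested-list image."""
    return (len(image), len(image[0]), len(image[0][0]))


def to_three_channel_gray(image):
    """Replace each pixel by its channel average, repeated in all three channels.

    Assumption: a plain average of R, G, B is used as the gray level.
    """
    result = []
    for row in image:
        new_row = []
        for r, g, b in row:
            gray = (r + g + b) // 3
            new_row.append([gray, gray, gray])
        result.append(new_row)
    return result


gray_image = to_three_channel_gray(rgb_image)
print(shape(rgb_image))   # (2, 2, 3)
print(gray_image[0][0])   # [85, 85, 85]
```

The result still has three channels, but every pixel carries the same value in each of them, which is what makes it a three-channel grayscale image.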
Biological Connection
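Some neurons respond only to edges of a particular orientation: some fire only for vertical edges, others only for horizontal ones, and so on. These neurons are organized in a columnar architecture (in the visual cortex, where together they produce visual perception), and this idea is the basis of convolutional neural networks.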
First Layer - Math Part(Convolutional Layer aka conv layer)
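The filter (also called a neuron or kernel) is an array of numbers called weights or parameters. The filter convolves over the input: at each position its weights are multiplied elementwise with the pixel values underneath and summed into a single number, and then it moves to the right by the stride, here 1 unit.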
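The sliding operation can be sketched for a single channel in plain Python. The function name `convolve2d` and the 4*4 image and 2*2 kernel values are invented for illustration. As is usual for CNNs, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
def convolve2d(image, kernel, stride=1):
    """Slide a square kernel over a 2D image without padding.

    Each output value is the sum of elementwise products between the
    kernel weights and the image window currently under the kernel.
    """
    k = len(kernel)
    out_h = (len(image) - k) // stride + 1
    out_w = (len(image[0]) - k) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(k):
                for dj in range(k):
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


image = [
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1],
]
kernel = [
    [1, 0],
    [0, 1],
]
activation_map = convolve2d(image, kernel)
print(activation_map)  # [[2, 4, 6], [0, 2, 4], [6, 0, 2]]
```

With stride 1, the 2*2 kernel fits into the 4*4 image at 3 positions in each direction, so the activation map is 3*3.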
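The depth of the filter has to be the same as the depth of the input, so for a 32*32*3 input a 5*5 filter is really 5*5*3. Sliding one such filter with stride 1 and no padding gives a 28*28 activation map (32 - 5 + 1 = 28), so two 5*5*3 filters give an output of 28*28*2.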
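The 28*28*2 figure can be checked with a short helper. This is a sketch: the function names are hypothetical, and the `padding` parameter is an addition not discussed in these notes (it defaults to 0, matching the example).

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Output size along one spatial dimension of a conv layer."""
    return (input_size + 2 * padding - filter_size) // stride + 1


def conv_output_shape(input_shape, filter_size, num_filters, stride=1, padding=0):
    """Output shape (height, width, depth) of a conv layer.

    The output depth equals the number of filters, because each filter
    spans the full input depth and produces one activation map.
    """
    height, width, _depth = input_shape
    return (
        conv_output_size(height, filter_size, stride, padding),
        conv_output_size(width, filter_size, stride, padding),
        num_filters,
    )


print(conv_output_shape((32, 32, 3), filter_size=5, num_filters=2))  # (28, 28, 2)
```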
First Layer - High Level Perspective
Each of these filters can be thought of as a feature identifier (straight edges, colors, curves, etc.).
E.g. a curve detector
The filter has a pixel structure with higher numerical values along the area that forms the shape of the curve.
So we take this image as an example.
(As can be seen, the first image matches the filter well and the second matches poorly.)
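To make the high match and low match concrete, here is a toy sketch. The 4*4 filter and patches below are invented values, not the ones from the original article, and `filter_response` is a hypothetical helper name.

```python
def filter_response(patch, weights):
    """Sum of elementwise products between an image patch and a filter."""
    return sum(
        p * w
        for patch_row, weight_row in zip(patch, weights)
        for p, w in zip(patch_row, weight_row)
    )


# Hypothetical curve detector: larger weights trace the shape of a curve.
curve_filter = [
    [0, 0, 30, 0],
    [0, 30, 0, 0],
    [0, 30, 0, 0],
    [0, 0, 30, 0],
]

# Patch whose bright pixels follow the same curve.
matching_patch = [
    [0, 0, 50, 0],
    [0, 50, 0, 0],
    [0, 50, 0, 0],
    [0, 0, 50, 0],
]

# Patch whose bright pixels sit only where the filter weights are 0.
non_matching_patch = [
    [50, 0, 0, 50],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [50, 0, 0, 50],
]

print(filter_response(matching_patch, curve_filter))      # 6000
print(filter_response(non_matching_patch, curve_filter))  # 0
```

A large response means the patch contains the feature the filter looks for. A response near zero means it does not.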
Going Deeper Through the Network
A classic CNN architecture would look like this:
Input -> Conv -> ReLU -> Conv -> ReLU -> Pool -> ReLU -> Conv -> ReLU -> Pool -> Fully Connected Layer
(ReLU: activation function; Pool: pooling layer)
Other layers are interspersed between these conv layers. They provide nonlinearities (ReLU) and what the article calls preservation of dimension (attributed here to Pool, though pooling itself downsamples the spatial size), which help to improve the robustness of the network and control overfitting.
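A minimal sketch of what these two layer types do to an activation map, assuming 2*2 max pooling with stride 2 (the notes do not say which kind of pooling is used, and the input values are made up):

```python
def relu(feature_map):
    """Apply ReLU elementwise: negative values become 0."""
    return [[max(0, value) for value in row] for row in feature_map]


def max_pool(feature_map, size=2, stride=2):
    """Keep the maximum of each size*size window (assumed max pooling)."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            max(
                feature_map[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


feature_map = [
    [1, -2, 3, 0],
    [-1, 5, -3, 2],
    [0, -4, -1, -2],
    [7, 1, -6, 4],
]
activated = relu(feature_map)
pooled = max_pool(activated)
print(activated)  # [[1, 0, 3, 0], [0, 5, 0, 2], [0, 0, 0, 0], [7, 1, 0, 4]]
print(pooled)     # [[5, 3], [7, 4]]
```

ReLU keeps the shape and only clips negative values, while pooling shrinks the 4*4 map to 2*2 and keeps the strongest response in each window.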
As you go through more and more conv layers, (i) you get activation maps that represent more and more complex features; (ii) the filters begin to have a larger and larger receptive field.
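The growth of the receptive field can be computed layer by layer. The sketch below assumes a hypothetical stack of 5*5 conv layers with stride 1 and 2*2 pooling layers with stride 2. These sizes are illustrative and are not given in the notes.

```python
def receptive_fields(layers):
    """Receptive field size (in input pixels) after each layer.

    layers: list of (kernel_size, stride) pairs, in order.
    `jump` is the distance in input pixels between neighbouring outputs
    of the current layer; each layer widens the receptive field by
    (kernel_size - 1) * jump.
    """
    size, jump = 1, 1
    sizes = []
    for kernel_size, stride in layers:
        size += (kernel_size - 1) * jump
        jump *= stride
        sizes.append(size)
    return sizes


# Hypothetical stack: Conv5 -> Conv5 -> Pool2 -> Conv5 -> Pool2
layers = [(5, 1), (5, 1), (2, 2), (5, 1), (2, 2)]
print(receptive_fields(layers))  # [5, 9, 10, 18, 20]
```

ReLU layers are left out because an elementwise operation does not change the receptive field.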
Fully Connected Layer(FC)
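The fully connected layer acts as the classifier for the whole network, and it can be implemented as a convolution.
Because fully connected layers are parameter-redundant (the FC layers alone can account for around 80% of the network's parameters), recent work has replaced them with global average pooling (GAP), which usually gives good predictive performance.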
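A rough parameter count shows why GAP is attractive. The sizes below (a 7*7*512 final feature map and 10 classes) are hypothetical and chosen only for illustration. `dense_params` and `global_average_pool` are helper names invented for this sketch.

```python
def dense_params(num_inputs, num_outputs):
    """Weights plus biases of a fully connected layer."""
    return num_inputs * num_outputs + num_outputs


def global_average_pool(feature_maps):
    """Reduce each channel (a 2D list) to its mean value."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]


height, width, channels, num_classes = 7, 7, 512, 10

# Flatten the whole feature map and feed it to a fully connected classifier.
fc_on_flattened = dense_params(height * width * channels, num_classes)

# Average each channel first (GAP has no parameters), then classify.
fc_after_gap = dense_params(channels, num_classes)

print(fc_on_flattened)  # 250890
print(fc_after_gap)     # 5130

# GAP on a toy input with two 2*2 channels.
print(global_average_pool([[[1, 3], [5, 7]], [[0, 0], [0, 4]]]))  # [4.0, 1.0]
```

In this setup the classifier after GAP needs about 49 times fewer parameters, because each 7*7 channel is reduced to a single number before classification.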