PyTorch Model to Caffe (2): Parsing a PyTorch *.pth Model

阿新 • Published: 2020-12-17

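PyTorch Model to Caffe (2): Parsing a PyTorch *.pth Model
- 1. Saving and loading PyTorch models
  - a. Saving and loading the weights only
  - b. Saving and loading the network together with its weights
- 2. PyTorch model structure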

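1. Saving and loading PyTorch models

a. Saving and loading the weights only

# Save the model (weights only)
torch.save(model_object.state_dict(), './weights.pth')
# Load the model (build the network first, then import the weights)
model = AlexNet(**kwargs)
model.load_state_dict(torch.load('./weights.pth'))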
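
The weights-only workflow above can be checked end to end with a minimal sketch. The tiny network and the helper names `make_model` and `roundtrip_state_dict` are assumptions made for illustration; they stand in for AlexNet so that nothing has to be downloaded.

```python
import os
import tempfile

import torch
import torch.nn as nn


def make_model():
    # Hypothetical tiny network, used only for illustration.
    return nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))


def roundtrip_state_dict(model):
    """Save only the weights, rebuild the network in code, then load the weights back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "weights.pth")
        torch.save(model.state_dict(), path)
        restored = make_model()  # the architecture is not stored in the file
        restored.load_state_dict(torch.load(path, map_location="cpu"))
    return restored


source = make_model()
restored = roundtrip_state_dict(source)

# Compare every tensor of the original and the restored model.
same = all(
    torch.equal(a, b)
    for a, b in zip(source.state_dict().values(), restored.state_dict().values())
)
print(same)
```

Because the file holds only tensors, the loading side has to construct the same architecture itself before calling `load_state_dict`.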

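b. Saving and loading the network together with its weights

# Save the whole model object (structure and weights, pickled together)
torch.save(model_object, './model.pth')
# Load it back directly; the model's class definition must still be importable.
# Recent PyTorch releases may also need torch.load(..., weights_only=False) here.
model = torch.load('./model.pth')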

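2. PyTorch model structure

Model files written by PyTorch conventionally use the .pth or .pt extension.

1). Viewing the overall network structure with summary

First install torchsummary: pip install torchsummary
Taking AlexNet as the example, load the pretrained model and print its structure.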
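
Since the extension does not say which of the two saving styles produced a file, it can help to inspect the loaded object. The following is a rough sketch; `describe_checkpoint` is a hypothetical helper, not a PyTorch API, and the weights are round-tripped through an in-memory buffer rather than a real file.

```python
import io
from collections.abc import Mapping

import torch
import torch.nn as nn


def describe_checkpoint(obj):
    """Classify an object such as the one returned by torch.load (hypothetical helper)."""
    if isinstance(obj, nn.Module):
        return "full model"
    if isinstance(obj, Mapping) and obj and all(torch.is_tensor(v) for v in obj.values()):
        return "state_dict"
    return "other"


net = nn.Linear(2, 1)

# Save the weights into a buffer and load them back, as if reading weights.pth.
buffer = io.BytesIO()
torch.save(net.state_dict(), buffer)
buffer.seek(0)
loaded = torch.load(buffer, map_location="cpu")

kind_of_weights = describe_checkpoint(loaded)
kind_of_model = describe_checkpoint(net)
print(kind_of_weights, kind_of_model)
```

Training checkpoints that wrap the weights in a larger dictionary (for example next to an epoch counter) would fall into the "other" branch of this sketch and need their own handling.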

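import torch
from torchvision.models.alexnet import alexnet
from torchsummary import summary

if __name__ == '__main__':
    # Load AlexNet with ImageNet-pretrained weights
    # (newer torchvision releases spell this alexnet(weights="DEFAULT")).
    net = alexnet(pretrained=True)
    print(type(net))               # <class 'torchvision.models.alexnet.AlexNet'>
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = net.to(device)
    # torchsummary assumes CUDA unless told otherwise, so pass the device name explicitly.
    summary(model, (3, 227, 227), device=device.type)
"""
# Network structure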
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 64, 56, 56]          23,296
              ReLU-2           [-1, 64, 56, 56]               0
         MaxPool2d-3           [-1, 64, 27, 27]               0
            Conv2d-4          [-1, 192, 27, 27]         307,392
              ReLU-5          [-1, 192, 27, 27]               0
         MaxPool2d-6          [-1, 192, 13, 13]               0
            Conv2d-7          [-1, 384, 13, 13]         663,936
              ReLU-8          [-1, 384, 13, 13]               0
            Conv2d-9          [-1, 256, 13, 13]         884,992
             ReLU-10          [-1, 256, 13, 13]               0
           Conv2d-11          [-1, 256, 13, 13]         590,080
             ReLU-12          [-1, 256, 13, 13]               0
        MaxPool2d-13            [-1, 256, 6, 6]               0
AdaptiveAvgPool2d-14            [-1, 256, 6, 6]               0
          Dropout-15                 [-1, 9216]               0
           Linear-16                 [-1, 4096]      37,752,832
             ReLU-17                 [-1, 4096]               0
          Dropout-18                 [-1, 4096]               0
           Linear-19                 [-1, 4096]      16,781,312
             ReLU-20                 [-1, 4096]               0
           Linear-21                 [-1, 1000]       4,097,000
================================================================
Total params: 61,100,840
Trainable params: 61,100,840
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.59
Forward/backward pass size (MB): 8.49
Params size (MB): 233.08
Estimated Total Size (MB): 242.16
----------------------------------------------------------------
"""

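The output shapes and parameter counts in the summary can be reproduced by hand. The sketch below uses the standard output-size formula for convolution and pooling layers (floor mode, dilation 1) together with the layer settings that AlexNet prints; the helper names are made up for illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a Conv2d or MaxPool2d layer (floor mode, dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_params(c_in, c_out, kernel):
    """Weights plus biases of a square-kernel Conv2d layer."""
    return c_out * c_in * kernel * kernel + c_out


def linear_params(n_in, n_out):
    """Weights plus biases of a Linear layer."""
    return n_in * n_out + n_out


# Spatial size through the feature extractor for a 227x227 input.
after_conv1 = conv_out(227, kernel=11, stride=4, padding=2)  # Conv2d-1
size = conv_out(after_conv1, kernel=3, stride=2)             # MaxPool2d-3
size = conv_out(size, kernel=5, padding=2)                   # Conv2d-4
size = conv_out(size, kernel=3, stride=2)                    # MaxPool2d-6
for _ in range(3):                                           # Conv2d-7, -9, -11
    size = conv_out(size, kernel=3, padding=1)
final_size = conv_out(size, kernel=3, stride=2)              # MaxPool2d-13

# Parameter count, layer by layer.
total = (
    conv_params(3, 64, 11)
    + conv_params(64, 192, 5)
    + conv_params(192, 384, 3)
    + conv_params(384, 256, 3)
    + conv_params(256, 256, 3)
    + linear_params(256 * final_size * final_size, 4096)
    + linear_params(4096, 4096)
    + linear_params(4096, 1000)
)
print(after_conv1, final_size, total)  # 56 6 61100840
```

The three printed numbers match the 56x56 output of Conv2d-1, the 6x6 output of MaxPool2d-13 and the "Total params" line of the summary.
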
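2). Parsing the weight values with net.state_dict()

net.state_dict() returns an ordered dictionary: each key is a parameter name such as features.0.weight, and each value is the corresponding weight or bias tensor.

Only layers that own learnable parameters (or persistent buffers) appear in state_dict. Convolution and linear layers are included; pooling, ReLU and dropout layers are not. Batch-normalization layers do appear, because they carry a weight, a bias and running statistics.

from torchvision.models.alexnet import alexnet

if __name__ == '__main__':
    net = alexnet(pretrained=True)
    state_dict = net.state_dict()
    print(type(state_dict))  # <class 'collections.OrderedDict'>
    # Layers without parameters or buffers (pooling, ReLU, dropout) contribute no entries.
    # items() yields (key, tensor) pairs; the key is the parameter name.
    for param_name, tensor in state_dict.items():
        print(param_name, '\t', tensor.size())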
"""
features.0.weight        torch.Size([64, 3, 11, 11])
features.0.bias          torch.Size([64])
features.3.weight        torch.Size([192, 64, 5, 5])
features.3.bias          torch.Size([192])
features.6.weight        torch.Size([384, 192, 3, 3])
features.6.bias          torch.Size([384])
features.8.weight        torch.Size([256, 384, 3, 3])
features.8.bias          torch.Size([256])
features.10.weight       torch.Size([256, 256, 3, 3])
features.10.bias         torch.Size([256])
classifier.1.weight      torch.Size([4096, 9216])
classifier.1.bias        torch.Size([4096])
classifier.4.weight      torch.Size([4096, 4096])
classifier.4.bias        torch.Size([4096])
classifier.6.weight      torch.Size([1000, 4096])
classifier.6.bias        torch.Size([1000])
"""

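For a conversion to Caffe, the tensors in the state_dict usually have to leave PyTorch as plain arrays. A minimal sketch, assuming NumPy is installed and using a hypothetical toy network in place of AlexNet:

```python
import torch.nn as nn

# Hypothetical toy network: conv -> batch norm -> ReLU -> pooling.
net = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(2),
)

# Convert every entry to a NumPy array, keyed by its state_dict name.
arrays = {
    name: tensor.detach().cpu().numpy()
    for name, tensor in net.state_dict().items()
}

for name, array in arrays.items():
    print(name, array.shape, array.dtype)
```

The batch-normalization layer contributes five entries (weight, bias, running_mean, running_var, num_batches_tracked), while the ReLU and pooling layers contribute none.
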
3). Getting layer names and weights with net.named_parameters()

from torchvision.models.alexnet import alexnet

if __name__ == '__main__':
    net = alexnet(pretrained=True)
    # named_parameters() yields (name, parameter) pairs for the network's parameters
    for layer_name, param in net.named_parameters():
        print(layer_name, '   ', param.size())
"""
features.0.weight     torch.Size([64, 3, 11, 11])
features.0.bias     torch.Size([64])
features.3.weight     torch.Size([192, 64, 5, 5])
features.3.bias     torch.Size([192])
features.6.weight     torch.Size([384, 192, 3, 3])
features.6.bias     torch.Size([384])
features.8.weight     torch.Size([256, 384, 3, 3])
features.8.bias     torch.Size([256])
features.10.weight     torch.Size([256, 256, 3, 3])
features.10.bias     torch.Size([256])
classifier.1.weight     torch.Size([4096, 9216])
classifier.1.bias     torch.Size([4096])
classifier.4.weight     torch.Size([4096, 4096])
classifier.4.bias     torch.Size([4096])
classifier.6.weight     torch.Size([1000, 4096])
classifier.6.bias     torch.Size([1000])
"""

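For AlexNet the two listings are identical, but that is not true in general: named_parameters() covers learnable parameters only, whereas state_dict() also includes buffers. The sketch below illustrates the difference on a hypothetical toy network with a batch-normalization layer and counts the trainable values the same way torchsummary reports them.

```python
import torch.nn as nn

# Hypothetical toy network, used only for illustration.
net = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3),  # 8*3*3*3 weights + 8 biases = 224
    nn.BatchNorm2d(8),               # 8 scales + 8 shifts = 16, plus buffers
    nn.ReLU(),
)

param_names = [name for name, _ in net.named_parameters()]
buffer_names = [name for name, _ in net.named_buffers()]
trainable = sum(p.numel() for p in net.parameters() if p.requires_grad)

print(param_names)
print(buffer_names)
print(trainable)  # 240
```

The state_dict keys of this network are the parameter names and the buffer names taken together.
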
4). Walking every module with net.named_modules()

from torchvision.models.alexnet import alexnet

if __name__ == '__main__':
    net = alexnet(pretrained=True)
    # named_modules() yields the root network first (its name is the empty string),
    # then each container and each layer nested inside it.
    for name, layer in net.named_modules():
        print(name, '-->', layer)
"""
 --> AlexNet(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
    (1): ReLU(inplace=True)
    (2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (4): ReLU(inplace=True)
    (5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (7): ReLU(inplace=True)
    (8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (9): ReLU(inplace=True)
    (10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
  (classifier): Sequential(
    (0): Dropout(p=0.5, inplace=False)
    (1): Linear(in_features=9216, out_features=4096, bias=True)
    (2): ReLU(inplace=True)
    (3): Dropout(p=0.5, inplace=False)
    (4): Linear(in_features=4096, out_features=4096, bias=True)
    (5): ReLU(inplace=True)
    (6): Linear(in_features=4096, out_features=1000, bias=True)
  )
)
features --> Sequential(
  (0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
  (1): ReLU(inplace=True)
  (2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
  (3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (4): ReLU(inplace=True)
  (5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
  (6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (7): ReLU(inplace=True)
  (8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (9): ReLU(inplace=True)
  (10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (11): ReLU(inplace=True)
  (12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
)
features.0 --> Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
features.1 --> ReLU(inplace=True)
features.2 --> MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
features.3 --> Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
features.4 --> ReLU(inplace=True)
features.5 --> MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
features.6 --> Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
features.7 --> ReLU(inplace=True)
features.8 --> Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
features.9 --> ReLU(inplace=True)
features.10 --> Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
features.11 --> ReLU(inplace=True)
features.12 --> MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
avgpool --> AdaptiveAvgPool2d(output_size=(6, 6))
classifier --> Sequential(
  (0): Dropout(p=0.5, inplace=False)
  (1): Linear(in_features=9216, out_features=4096, bias=True)
  (2): ReLU(inplace=True)
  (3): Dropout(p=0.5, inplace=False)
  (4): Linear(in_features=4096, out_features=4096, bias=True)
  (5): ReLU(inplace=True)
  (6): Linear(in_features=4096, out_features=1000, bias=True)
)
classifier.0 --> Dropout(p=0.5, inplace=False)
classifier.1 --> Linear(in_features=9216, out_features=4096, bias=True)
classifier.2 --> ReLU(inplace=True)
classifier.3 --> Dropout(p=0.5, inplace=False)
classifier.4 --> Linear(in_features=4096, out_features=4096, bias=True)
classifier.5 --> ReLU(inplace=True)
classifier.6 --> Linear(in_features=4096, out_features=1000, bias=True)
"""

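Because named_modules() also returns the containers, a converter typically keeps only the leaf layers and reads their hyperparameters. The following is a sketch under assumptions: `leaf_layers` is a hypothetical helper, the toy network copies the first three AlexNet layers, and the dictionary keys num_output and pad are borrowed from Caffe's convolution parameters purely as labels.

```python
import torch.nn as nn

# Hypothetical toy network mirroring the first three AlexNet layers.
net = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2),
)


def leaf_layers(model):
    """Yield (name, module) for modules that have no child modules."""
    for name, module in model.named_modules():
        if len(list(module.children())) == 0:
            yield name, module


records = []
for name, layer in leaf_layers(net):
    record = {"name": name, "type": type(layer).__name__}
    if isinstance(layer, nn.Conv2d):
        # Conv2d stores kernel_size, stride and padding as (height, width) tuples.
        record.update(
            num_output=layer.out_channels,
            kernel_size=layer.kernel_size,
            stride=layer.stride,
            pad=layer.padding,
        )
    records.append(record)

for record in records:
    print(record)
```

The root Sequential is skipped because it has children, so only the three real layers are recorded.
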