
torch.nn.parallel.DistributedDataParallel: A Summary

Tags: pytorch

Add to the config (argument parser):

parser.add_argument('--local_rank', type=int, default=-1)
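torch.distributed.launch passes --local_rank to each spawned process automatically, so the option only needs a default (here -1, which can also serve as a flag for non-distributed runs).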

Add in the training script:

import torch.distributed as dist
from torch.utils.data.distributed import DistributedSampler

Whenever the code performs a write operation (logging, saving checkpoints, etc.), check local_rank first so that only one process writes.
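A minimal sketch of such a guard, following the self.opt convention used in the snippets below; the checkpoint filename is just a placeholder:

if self.opt.local_rank == 0:
    # only rank 0 touches the filesystem / console
    torch.save(self.netD.state_dict(), "netD.pth")  # placeholder path
    print("checkpoint saved")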

Initialization

dist.init_process_group(backend='nccl') 
torch.cuda.set_device(self.opt.local_rank)
torch.autograd.set_detect_anomaly(True)  # anomaly detection for debugging; comment out for real training runs
self.device = torch.device("cuda", self.opt.local_rank) if torch.cuda.is_available() else torch.device("cpu")
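With the default env:// initialization, init_process_group reads the MASTER_ADDR, MASTER_PORT, RANK and WORLD_SIZE environment variables that torch.distributed.launch sets for every process, so no extra arguments are needed here.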

Model setup (if the model contains BatchNorm layers, an extra SyncBatchNorm conversion is needed; remember to pass the GPU index for every model you wrap):

self.netD = torch.nn.SyncBatchNorm.convert_sync_batchnorm(self.netD)
self.netD = torch.nn.parallel.DistributedDataParallel(
    self.netD, device_ids=[self.opt.local_rank], output_device=self.opt.local_rank,
    find_unused_parameters=True)
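Note that the model is expected to already sit on the local GPU (e.g. self.netD.to(self.device)) before it is wrapped; DistributedDataParallel does not move it. Also, find_unused_parameters=True is only needed when some parameters receive no gradient in a given forward pass, and it adds per-iteration overhead, so drop it if every parameter is always used.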

DataLoader setup (do not set shuffle=True, because DistributedSampler shuffles on its own; the test set can be left as an ordinary DataLoader):

train_dataset = self.dataset(
    self.opt.data_path, train_filenames, self.opt.data_height, self.opt.data_width,
    self.opt.data_frame_ids, 4, is_train=True, img_ext=img_ext)
train_sampler = torch.utils.data.distributed.DistributedSampler(train_dataset)
self.train_loader = torch.utils.data.DataLoader(
    train_dataset, self.opt.batch_size,  # shuffle=True,
    num_workers=self.opt.data_workers, pin_memory=True, drop_last=True, sampler=train_sampler)
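One detail worth adding to the epoch loop: DistributedSampler only reshuffles when it is told the current epoch, so call set_epoch() every epoch. A minimal sketch, assuming a hypothetical self.opt.num_epochs option:

for epoch in range(self.opt.num_epochs):
    # without this call every epoch sees the same sample order
    train_sampler.set_epoch(epoch)
    for inputs in self.train_loader:
        ...  # forward / backward as usual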
        

Training

export CUDA_VISIBLE_DEVICES=0,1
python -m torch.distributed.launch --nproc_per_node=2 train_ablation_multi.py

nproc_per_node is the number of GPUs to use on this machine.
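On newer PyTorch versions the same launch can be done with torchrun --nproc_per_node=2 train_ablation_multi.py; note that torchrun exposes the local rank through the LOCAL_RANK environment variable rather than the --local_rank argument, so the argparse option above would need adjusting.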

References:

https://www.cnblogs.com/JunzhaoLiang/archive/2004/01/13/13535952.html

https://www.cnblogs.com/yh-blog/p/12877922.html