
Getting Started with PyTorch (Part 5)

Data Parallelism (DataParallel)

In this tutorial, we will learn how to use multiple GPUs for data-parallel computation with DataParallel.

Using a GPU in PyTorch is very easy. You can put a model on a GPU like this:

device = torch.device("cuda:0")
model.to(device)

Then you can copy all of your tensors to the GPU:

mytensor = my_tensor.to(device)

Note that my_tensor.to(device) returns a new copy of my_tensor on the GPU rather than rewriting my_tensor in place. You need to assign the result to a new tensor and use that tensor on the GPU.
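For instance, calling .to(device) without keeping the returned value leaves the original tensor where it was. A small sketch illustrating the difference (assuming torch has been imported and device is the torch.device defined in the next section):

cpu_tensor = torch.randn(3)
cpu_tensor.to(device)                 # returns a GPU copy that is immediately discarded
print(cpu_tensor.device)              # still cpu
gpu_tensor = cpu_tensor.to(device)    # keep the copy instead
print(gpu_tensor.device)              # cuda:0 (when a GPU is available)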

By default, PyTorch only uses one GPU. You can run your operations on multiple GPUs by making your model compute in parallel with DataParallel:

model = nn.DataParallel(model)
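If you only want to use a subset of your GPUs, nn.DataParallel also accepts a device_ids argument; a short sketch (the ids are an example, adjust them to your machine):

# Restrict DataParallel to the first two GPUs.
model = nn.DataParallel(model, device_ids=[0, 1])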

The details are explained below.

Imports and Parameters

Import PyTorch modules and define the parameters:

import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

# Parameters and DataLoaders
input_size = 5
output_size = 2
batch_size = 30
data_size = 100

Device

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

Dummy Dataset

Make a dummy (random) dataset. You just need to implement __getitem__:

class RandomDataset(Dataset):

    def __init__(self, size, length):
        self.len = length
        self.data = torch.randn(length, size)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return self.len

rand_loader = DataLoader(dataset=RandomDataset(input_size, data_size),
                         batch_size=batch_size, shuffle=True)
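As a quick sanity check (an addition, not part of the original tutorial), the two methods above are exactly what makes the dataset indexable and sized:

dataset = RandomDataset(input_size, data_size)
print(len(dataset))        # 100, via __len__
print(dataset[0].size())   # torch.Size([5]), via __getitem__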

Simple Model

For the demo, our model just takes an input, performs a linear operation, and gives an output. However, you can use DataParallel with any model (CNN, RNN, Capsule Net, etc.).

class Model(nn.Module):

    # Our model: a single linear layer
    def __init__(self, input_size, output_size):
        super(Model, self).__init__()
        self.fc = nn.Linear(input_size, output_size)

    def forward(self, input):
        output = self.fc(input)
        print("\tIn Model: input size", input.size(),
              "output size", output.size())
        return output

Create a Model and DataParallel

This is the core part of the tutorial.

First, we create a model instance and check whether we have multiple GPUs. If we do, we wrap the model with nn.DataParallel; then we put the model on the GPU with model.to(device).

model = Model(input_size, output_size)
if torch.cuda.device_count() > 1:
    print("Let's use", torch.cuda.device_count(), "GPUs!")
    model = nn.DataParallel(model)
model.to(device)
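One side note (an addition, not from the original tutorial): nn.DataParallel puts the original module behind a .module attribute, which matters when you want to reach the underlying model, e.g. for saving. The file name here is a hypothetical example:

# The wrapper stores the original module under .module.
underlying = model.module if isinstance(model, nn.DataParallel) else model
torch.save(underlying.state_dict(), "model.pt")  # state_dict without the wrapper's prefix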

Run the Model

Now we can see the sizes of the input and output tensors:

for data in rand_loader:
    input = data.to(device)
    output = model(input)
    print("Outside: input size", input.size(),
          "output_size", output.size())

Out:

        In Model: input size torch.Size([30, 5]) output size torch.Size([30, 2])
Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
        In Model: input size torch.Size([30, 5]) output size torch.Size([30, 2])
Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
        In Model: input size torch.Size([30, 5]) output size torch.Size([30, 2])
Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
        In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
Outside: input size torch.Size([10, 5]) output_size torch.Size([10, 2])


Results

If you have no GPU or only one GPU, then when we batch 30 inputs and 30 outputs, the model gets 30 and outputs 30, as expected. But if you have multiple GPUs, you get results like the following:

2 GPUs:

# on 2 GPUs
Let's use 2 GPUs!
    In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
    In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
    In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
    In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([5, 5]) output size torch.Size([5, 2])
    In Model: input size torch.Size([5, 5]) output size torch.Size([5, 2])
Outside: input size torch.Size([10, 5]) output_size torch.Size([10, 2])

3 GPUs:

Let's use 3 GPUs!
    In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
    In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
    In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
    In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
    In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
    In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
    In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
Outside: input size torch.Size([10, 5]) output_size torch.Size([10, 2])

8 GPUs:

Let's use 8 GPUs!
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
    In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
    In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
    In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
    In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
    In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
    In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
Outside: input size torch.Size([10, 5]) output_size torch.Size([10, 2])
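The uneven sizes in the last batches are not arbitrary: DataParallel scatters each batch along dimension 0 with torch.chunk-style splitting, so each chunk has up to ceil(len/n) rows and the last one may be smaller. A minimal sketch (an illustration, not code from the tutorial) that reproduces the split sizes above:

batch = torch.randn(30, 5)
for n_gpus in (2, 3, 8):
    chunks = torch.chunk(batch, n_gpus, dim=0)
    print(n_gpus, [c.size(0) for c in chunks])
# 2 [15, 15]
# 3 [10, 10, 10]
# 8 [4, 4, 4, 4, 4, 4, 4, 2]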

Summary:

DataParallel splits your data automatically and sends job orders to multiple models on several GPUs. After each model finishes its job, DataParallel collects and merges the results before returning them to you.
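Because the split and gather are transparent, a training step with the wrapped model looks the same as single-GPU code. A minimal sketch, assuming an MSE loss against dummy random targets (neither the loss nor the targets come from the original tutorial):

import torch.optim as optim

criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

for data in rand_loader:
    input = data.to(device)
    target = torch.randn(input.size(0), output_size).to(device)  # dummy targets
    optimizer.zero_grad()
    output = model(input)                 # scattered across GPUs, gathered back
    loss = criterion(output, target)      # computed on the default device
    loss.backward()                       # gradients are reduced onto the original module
    optimizer.step()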