Getting Started with PyTorch (5)
Data Parallelism (DataParallel)
In this tutorial, we will learn how to use multiple GPUs for data-parallel computation.
Using a GPU in PyTorch is very simple. You can put your model on a GPU:
device = torch.device("cuda:0")
model.to(device)
Then you can copy all of your tensors to the GPU:
mytensor = my_tensor.to(device)
Note that my_tensor.to(device) returns a new copy of my_tensor on the GPU instead of rewriting my_tensor in place. You need to assign it to a new tensor and then use that tensor on the GPU.
However, PyTorch will only use one GPU by default. You can easily run your operations on multiple GPUs by wrapping your model with DataParallel:
model = nn.DataParallel(model)
The details are explained below.
Imports and Parameters
Import PyTorch modules and define the parameters:
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

# Parameters and DataLoaders
input_size = 5
output_size = 2

batch_size = 30
data_size = 100
Device
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
Dummy Dataset
To make a dummy (random) dataset, you only need to implement __getitem__:
class RandomDataset(Dataset):

    def __init__(self, size, length):
        self.len = length
        self.data = torch.randn(length, size)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return self.len

rand_loader = DataLoader(dataset=RandomDataset(input_size, data_size),
                         batch_size=batch_size, shuffle=True)
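As a quick sanity check on the loader above: with data_size = 100 and batch_size = 30, the DataLoader yields four batches of sizes 30, 30, 30, and 10. A minimal pure-Python sketch of that arithmetic (no torch required):

```python
# Batch sizes a DataLoader produces for a dataset of `data_size` samples
# with the given `batch_size` (drop_last=False, the default): full batches
# until the data runs out, then one smaller remainder batch.
def batch_sizes(data_size, batch_size):
    return [min(batch_size, data_size - start)
            for start in range(0, data_size, batch_size)]

print(batch_sizes(100, 30))  # [30, 30, 30, 10]
```

This remainder batch of 10 is exactly what shows up as the last iteration in the outputs below.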
Simple Model
In this demo, our model just takes an input, performs a linear operation, and gives an output. However, you can use DataParallel on any model (CNN, RNN, Capsule Net, etc.).
class Model(nn.Module):
# Our model
def __init__(self, input_size,output_size):
super(Model,self).__init__()
self.fc =nn.Linear(input_size, output_size)
def forward(self, input):
output = self.fc(input)
print("\tIn Model:input size", input.size(), "output size", output.size())
return output
Create Model and DataParallel
This is the core part of this tutorial.
First, create a model instance and check whether you have multiple GPUs. If you do, you can wrap the model with nn.DataParallel; then put the model on the GPU with model.to(device).
model = Model(input_size, output_size)
if torch.cuda.device_count() > 1:
print("Let's use",torch.cuda.device_count(), "GPUs!")
model = nn.DataParallel(model)
model.to(device)
Run the Model
Now we can see the sizes of the input and output tensors:
for data in rand_loader:
input = data.to(device)
output = model(input)
print("Outside: inputsize", input.size(), "Output_size", output.size())
Out:
In Model: input size torch.Size([30, 5]) output size torch.Size([30, 2])
Outside: input size torch.Size([30, 5]) output size torch.Size([30, 2])
In Model: input size torch.Size([30, 5]) output size torch.Size([30, 2])
Outside: input size torch.Size([30, 5]) output size torch.Size([30, 2])
In Model: input size torch.Size([30, 5]) output size torch.Size([30, 2])
Outside: input size torch.Size([30, 5]) output size torch.Size([30, 2])
In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
Outside: input size torch.Size([10, 5]) output size torch.Size([10, 2])
Results
If you have no GPU or only one GPU, then when we batch 30 inputs and 30 outputs, the model gets 30 inputs and produces 30 outputs, as expected. But if you have multiple GPUs, the results look like this:
2 GPUs:
# on 2 GPUs
Let's use 2 GPUs!
In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
Outside: input size torch.Size([30, 5]) output size torch.Size([30, 2])
In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
Outside: input size torch.Size([30, 5]) output size torch.Size([30, 2])
In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
Outside: input size torch.Size([30, 5]) output size torch.Size([30, 2])
In Model: input size torch.Size([5, 5]) output size torch.Size([5, 2])
In Model: input size torch.Size([5, 5]) output size torch.Size([5, 2])
Outside: input size torch.Size([10, 5]) output size torch.Size([10, 2])
3 GPUs:
Let's use 3 GPUs!
In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
Outside: input size torch.Size([30, 5]) output size torch.Size([30, 2])
In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
Outside: input size torch.Size([30, 5]) output size torch.Size([30, 2])
In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
Outside: input size torch.Size([30, 5]) output size torch.Size([30, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
Outside: input size torch.Size([10, 5]) output size torch.Size([10, 2])
8 GPUs:
Let's use 8 GPUs!
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
Outside: input size torch.Size([30, 5]) output size torch.Size([30, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
Outside: input size torch.Size([30, 5]) output size torch.Size([30, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
Outside: input size torch.Size([30, 5]) output size torch.Size([30, 2])
In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
Outside: input size torch.Size([10, 5]) output size torch.Size([10, 2])
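The per-GPU sizes in these logs follow the chunking rule DataParallel uses when it scatters a batch along dimension 0: each replica gets at most ceil(batch / n_gpus) samples, so the last device may get a smaller chunk, and some devices may get no work at all when the batch is small. A pure-Python sketch of that rule (an assumption that matches the outputs above, not the actual torch implementation):

```python
import math

# Sizes of the chunks a batch of `batch` samples is split into across
# `n_gpus` devices: each chunk holds at most ceil(batch / n_gpus) samples,
# and trailing devices may receive a smaller chunk or nothing at all.
def scatter_sizes(batch, n_gpus):
    chunk = math.ceil(batch / n_gpus)
    return [min(chunk, batch - start) for start in range(0, batch, chunk)]

print(scatter_sizes(30, 2))  # [15, 15]
print(scatter_sizes(30, 8))  # [4, 4, 4, 4, 4, 4, 4, 2]
print(scatter_sizes(10, 3))  # [4, 4, 2]
print(scatter_sizes(10, 8))  # [2, 2, 2, 2, 2] -- only 5 of the 8 GPUs get work
```

Note how the last case explains the 8-GPU log above: the final batch of 10 produces only five "In Model" lines, each with 2 samples.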
Summary:
DataParallel splits your data automatically and sends job orders to multiple models on several GPUs. After each model finishes its job, DataParallel collects and merges the results before returning them to you.