PyTorch Learning (10) --- Reading the Neural Style Code
Overview
I have written a walkthrough of the Torch version of the neural style code before; see Torch7 Learning (7): Neural-Style Code Analysis. That framework, however, was built on the traditional layer paradigm, whereas everything now follows the computation-graph paradigm. The PyTorch version is written quite differently, and above all it is much simpler! Compare the two and you will be amazed.
The neural style code on the official PyTorch website
Most of it is not particularly interesting; the core code is what matters:
class ContentLoss(nn.Module):

    def __init__(self, target, weight):
        super(ContentLoss, self).__init__()
        # we 'detach' the target content from the tree used
        # to dynamically compute the gradient: this is a stated value,
        # not a variable. Otherwise the forward method of the criterion
        # will throw an error.
        self.target = target.detach() * weight
        self.weight = weight
        self.criterion = nn.MSELoss()

    def forward(self, input):
        self.loss = self.criterion(input * self.weight, self.target)
        self.output = input
        return self.output

    def backward(self, retain_graph=True):
        self.loss.backward(retain_graph=retain_graph)
        return self.loss
class GramMatrix(nn.Module):

    def forward(self, input):
        a, b, c, d = input.size()  # a=batch size(=1)
        # b=number of feature maps
        # (c,d)=dimensions of a feature map (N=c*d)

        features = input.view(a * b, c * d)  # resize F_XL into \hat F_XL

        G = torch.mm(features, features.t())  # compute the gram product

        # we 'normalize' the values of the gram matrix
        # by dividing by the number of elements in each feature map.
        return G.div(a * b * c * d)
class StyleLoss(nn.Module):

    def __init__(self, target, weight):
        super(StyleLoss, self).__init__()
        self.target = target.detach() * weight
        self.weight = weight
        self.gram = GramMatrix()
        self.criterion = nn.MSELoss()

    def forward(self, input):
        self.output = input.clone()
        self.G = self.gram(input)
        self.G.mul_(self.weight)
        self.loss = self.criterion(self.G, self.target)
        return self.output

    def backward(self, retain_graph=True):
        self.loss.backward(retain_graph=retain_graph)
        return self.loss
In the code above, a few things look rather odd:
1. In PyTorch Introductory Learning (8): Implementing Custom Layers (even writing backward for non-differentiable operations), we saw that to extend the framework with a custom layer you only need to override nn.Module's forward; inside forward you call xxxFunction.apply to invoke a custom autograd class, and it is on that autograd class that forward and backward are overridden (see the sketch after this list). So why does this custom module define a backward of its own?
2. What is retain_graph for?
3. The Gram matrix only needs a forward; no backpropagation code has to be written at all. That is genuinely impressive!
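For reference, here is a minimal sketch of that standard route from point 1 (my own toy example, not from the tutorial; DoubleOp and DoubleModule are made-up names): the module's forward merely dispatches to the autograd Function, and it is the Function that overrides forward and backward.

import torch
import torch.nn as nn
from torch.autograd import Function, Variable

class DoubleOp(Function):
    # the "real" forward/backward live on the autograd Function
    @staticmethod
    def forward(ctx, input):
        return input * 2

    @staticmethod
    def backward(ctx, grad_output):
        # d(2x)/dx = 2
        return grad_output * 2

class DoubleModule(nn.Module):
    # the module only calls xxxFunction.apply
    def forward(self, input):
        return DoubleOp.apply(input)

x = Variable(torch.ones(3), requires_grad=True)
DoubleModule()(x).sum().backward()
print(x.grad)  # a Variable of twos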
Compare this with the Torch version of Gram:
local Gram, parent = torch.class('nn.GramMatrix', 'nn.Module')

function Gram:__init()
    parent.__init(self)
end

function Gram:updateOutput(input)
    assert(input:dim() == 3)
    local C, H, W = input:size(1), input:size(2), input:size(3)
    local x_flat = input:view(C, H * W)
    self.output:resize(C, C)
    self.output:mm(x_flat, x_flat:t())
    return self.output
end

function Gram:updateGradInput(input, gradOutput)
    assert(input:dim() == 3 and input:size(1))
    local C, H, W = input:size(1), input:size(2), input:size(3)
    local x_flat = input:view(C, H * W)
    self.gradInput:resize(C, H * W):mm(gradOutput, x_flat)
    self.gradInput:addmm(gradOutput:t(), x_flat)
    self.gradInput = self.gradInput:view(C, H, W)
    return self.gradInput
end
-- Define an nn Module to compute style loss in-place
local StyleLoss, parent = torch.class('nn.StyleLoss', 'nn.Module')

function StyleLoss:__init(strength, normalize)
    parent.__init(self)
    self.normalize = normalize or false
    self.strength = strength
    self.target = torch.Tensor()
    self.mode = 'none'
    self.loss = 0

    self.gram = nn.GramMatrix()
    self.blend_weight = nil
    self.G = nil
    self.crit = nn.MSECriterion()
end

function StyleLoss:updateOutput(input)
    self.G = self.gram:forward(input)
    self.G:div(input:nElement())
    if self.mode == 'capture' then
        if self.blend_weight == nil then
            self.target:resizeAs(self.G):copy(self.G)
        elseif self.target:nElement() == 0 then
            self.target:resizeAs(self.G):copy(self.G):mul(self.blend_weight)
        else
            self.target:add(self.blend_weight, self.G)
        end
    elseif self.mode == 'loss' then
        self.loss = self.strength * self.crit:forward(self.G, self.target)
    end
    self.output = input
    return self.output
end

function StyleLoss:updateGradInput(input, gradOutput)
    if self.mode == 'loss' then
        local dG = self.crit:backward(self.G, self.target)
        dG:div(input:nElement())
        self.gradInput = self.gram:backward(input, dG)
        if self.normalize then
            self.gradInput:div(torch.norm(self.gradInput, 1) + 1e-8)
        end
        self.gradInput:mul(self.strength)
        self.gradInput:add(gradOutput)
    else
        self.gradInput = gradOutput
    end
    return self.gradInput
end
The Torch version is at least twice as much code, and considerably harder to write! The main difference is that for a custom layer in Torch you have to write updateGradInput yourself, and the backward pass of Gram in particular is not easy to derive by hand. Another point is the self.gradInput:add(gradOutput) in StyleLoss's backward, which also takes some effort to understand; it is explained in the post on writing code that adds supervision on hidden layers (feature matching) together with the optim package. With an autodiff framework, by contrast, as long as your forward computes entirely with Variables, a correct graph is built and the backward pass comes out correct. It is that crazy: you do not write any backpropagation at all!
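To see that autograd really derives Gram's backward for us, here is a quick numerical check (my own sketch, not part of the tutorial): gradcheck compares autograd's gradients against finite differences, and needs double precision to pass.

import torch
from torch.autograd import Variable, gradcheck

gram = GramMatrix()  # the forward-only module defined above
x = Variable(torch.randn(1, 2, 3, 3).double(), requires_grad=True)
# succeeds iff the automatically derived backward matches finite differences
print(gradcheck(gram, (x,), eps=1e-6, atol=1e-4))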
What is the backward of Neural Style's custom modules?
This "backward" is actually just an ordinary method! It does not override the framework's internal backward. All it really does is call the criterion's backward and then return the loss, so that the caller can read off the corresponding loss value. You can see this further down in the code.
Remarkably, the custom layers can simply be collected into a list!
# desired depth layers to compute style/content losses :
content_layers_default = ['conv_4']
style_layers_default = ['conv_1', 'conv_2', 'conv_3', 'conv_4', 'conv_5']

def get_style_model_and_losses(cnn, style_img, content_img,
                               style_weight=1000, content_weight=1,
                               content_layers=content_layers_default,
                               style_layers=style_layers_default):
    cnn = copy.deepcopy(cnn)

    # just in order to have an iterable access to or list of content/style
    # losses (these lists hold the custom loss layers)
    content_losses = []
    style_losses = []

    model = nn.Sequential()  # the new Sequential module network
    gram = GramMatrix()  # we need a gram module in order to compute style targets

    # move these modules to the GPU if possible:
    if use_cuda:
        model = model.cuda()
        gram = gram.cuda()

    i = 1
    for layer in list(cnn):
        if isinstance(layer, nn.Conv2d):
            name = "conv_" + str(i)
            model.add_module(name, layer)

            if name in content_layers:
                # add content loss:
                target = model(content_img).clone()
                content_loss = ContentLoss(target, content_weight)
                model.add_module("content_loss_" + str(i), content_loss)
                content_losses.append(content_loss)

            if name in style_layers:
                # add style loss:
                target_feature = model(style_img).clone()
                target_feature_gram = gram(target_feature)
                style_loss = StyleLoss(target_feature_gram, style_weight)
                model.add_module("style_loss_" + str(i), style_loss)
                style_losses.append(style_loss)

        if isinstance(layer, nn.ReLU):
            name = "relu_" + str(i)
            model.add_module(name, layer)

            if name in content_layers:
                # add content loss:
                target = model(content_img).clone()
                content_loss = ContentLoss(target, content_weight)
                model.add_module("content_loss_" + str(i), content_loss)
                content_losses.append(content_loss)

            if name in style_layers:
                # add style loss:
                target_feature = model(style_img).clone()
                target_feature_gram = gram(target_feature)
                style_loss = StyleLoss(target_feature_gram, style_weight)
                model.add_module("style_loss_" + str(i), style_loss)
                style_losses.append(style_loss)

            i += 1

        if isinstance(layer, nn.MaxPool2d):
            name = "pool_" + str(i)
            model.add_module(name, layer)  # ***

    # return the whole model together with the lists of loss layers
    return model, style_losses, content_losses
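A minimal usage sketch (my own, assuming cnn, style_img, and content_img were prepared as in the earlier part of the tutorial): printing the returned Sequential shows the conv_/relu_/pool_ layers interleaved with the inserted loss modules, and the two lists are direct handles to those modules.

model, style_losses, content_losses = get_style_model_and_losses(
    cnn, style_img, content_img)
print(model)  # conv_1, style_loss_1, relu_1, ... in insertion order
print(len(style_losses), len(content_losses))  # 5 and 1 with the defaults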
After that:

# A Parameter is a Variable that requires grad!
# Here we iteratively optimize the input x itself, so we wrap x as a
# parameter and hand it to optim.
def get_input_param_optimizer(input_img):
    # this line to show that input is a parameter that requires a gradient
    input_param = nn.Parameter(input_img.data)
    optimizer = optim.LBFGS([input_param])
    return input_param, optimizer
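A one-line check of the comment above (my own sketch): nn.Parameter is just a Variable whose requires_grad defaults to True, which is what allows L-BFGS to update the image itself.

import torch
import torch.nn as nn
from torch.autograd import Variable

p = nn.Parameter(torch.zeros(1))
print(isinstance(p, Variable), p.requires_grad)  # True True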
def run_style_transfer(cnn, content_img, style_img, input_img, num_steps=300,
                       style_weight=1000, content_weight=1):
    """Run the style transfer."""
    print('Building the style transfer model..')
    model, style_losses, content_losses = get_style_model_and_losses(cnn,
        style_img, content_img, style_weight, content_weight)
    input_param, optimizer = get_input_param_optimizer(input_img)

    print('Optimizing..')
    run = [0]
    while run[0] <= num_steps:

        def closure():
            # correct the values of updated input image
            input_param.data.clamp_(0, 1)

            # first, zero the gradients
            optimizer.zero_grad()
            model(input_param)
            style_score = 0
            content_score = 0

            # this is the interesting part: we call the "fake" backward directly.
            # it invokes the real backward of the loss attached to the hidden
            # layer, and its main job is to return the corresponding loss value.
            # a very clever way to write it.
            for sl in style_losses:
                style_score += sl.backward()
            # the content and style layers can even be backward-ed separately!
            for cl in content_losses:
                content_score += cl.backward()

            run[0] += 1
            if run[0] % 50 == 0:
                print("run {}:".format(run))
                print('Style Loss : {:4f} Content Loss: {:4f}'.format(
                    style_score.data[0], content_score.data[0]))
                print()

            return style_score + content_score

        optimizer.step(closure)

    # a last correction...
    input_param.data.clamp_(0, 1)

    return input_param.data
output = run_style_transfer(cnn, content_img, style_img, input_img)
plt.figure()
imshow(output, title='Output Image')
plt.ioff()
plt.show()
Very convenient! The following code shows that the custom layers can even run backward separately, without affecting each other at all!
for sl in style_losses:
    style_score += sl.backward()
for cl in content_losses:
    content_score += cl.backward()
The reason is actually simple: backward computes the gradients of the corresponding graph nodes, and as long as those gradients are not zeroed they keep accumulating, so separate backward calls simply add up. Note, however, that to save memory PyTorch clears the gradient of every node except the leaf nodes (the Variables you created yourself) as soon as it has been consumed, so you normally cannot inspect the gradient of an intermediate layer at all, unless you add a hook! And to be able to walk backward through the same graph more than once, you need retain_graph!
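Both effects are easy to verify in isolation (my own sketch, not from the tutorial): gradients on a leaf accumulate across backward calls, while an intermediate node's grad stays None and can only be observed through a hook.

import torch
from torch.autograd import Variable

x = Variable(torch.ones(1), requires_grad=True)

# 1) leaf gradients accumulate until you zero them
(x * 2).backward()
(x * 3).backward()
print(x.grad)  # 5 = 2 + 3, accumulated across the two calls

# 2) intermediate gradients are freed; a hook lets you peek at them
x.grad.data.zero_()
y = x * 2
y.register_hook(lambda g: print('grad at y:', g))  # prints 3
(y * 3).backward()
print(y.grad)  # None: non-leaf grads are not kept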
Using retain_graph and detach
retain_graph keeps the graph used for the backward computation alive. When retain_graph is true, you can run several separate backward passes through the same graph without the previous pass freeing the buffers out from under you. That is exactly why we have:
def backward(self, retain_graph=True):
    self.loss.backward(retain_graph=retain_graph)
    return self.loss
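The effect is easy to reproduce (my own sketch): without retain_graph, the graph's intermediate buffers are freed during the first backward, so a second pass through the same graph raises a RuntimeError. In the style-transfer model all the loss modules share the trunk of the network, so each per-loss backward has to keep the graph alive for the next one.

import torch
from torch.autograd import Variable

x = Variable(torch.ones(1), requires_grad=True)
y = x * x
y.backward(retain_graph=True)  # graph kept alive
y.backward()                   # fine: reuses the retained graph, then frees it
# y.backward()                 # a third call would raise a RuntimeError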
The other important point here is detach. The target is obtained by feeding the style image through the network, so it is a Variable with its own computation graph. detach() "cuts off" that node and turns it into a leaf, as if it were a Variable we created ourselves: target.grad_fn becomes None, and gradients that reach it do not propagate any further back.
def __init__(self, target, weight):
    super(StyleLoss, self).__init__()
    self.target = target.detach() * weight
    self.weight = weight
    self.gram = GramMatrix()
    self.criterion = nn.MSELoss()
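The effect of detach() can be seen directly (my own sketch): the result of a computation carries a grad_fn, while its detached version is a fresh leaf whose grad_fn is None, so gradients stop there.

import torch
from torch.autograd import Variable

x = Variable(torch.ones(2, 2), requires_grad=True)
y = x * 2
print(y.grad_fn)           # a Mul backward node: y is part of a graph
print(y.detach().grad_fn)  # None: cut off, now a leaf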
Summary
- With an autodiff framework you do not write backpropagation, provided the custom module's forward operates entirely on Variables; otherwise a correct graph cannot be built!
- To save memory, PyTorch does not keep the backward graph around, which also means the grads of intermediate nodes cannot be retrieved, unless you set retain_graph to true. This is also what makes "backward-ing several losses separately" possible.
- Note the use of detach, which cuts the computation graph and turns the node into a leaf.
- This way of writing a backward method is worth borrowing.