[pytorch][zero_grad] PyTorch error: 'builtin_function_or_method' object has no attribute 'named_modules'
阿新 • Published: 2021-02-01
Tags: pytorch, python, debug, neural networks
While zeroing a model's gradients for Grad-CAM in PyTorch, calling zero_grad raised the error above.
It turned out that the torch.nn.ReLU function had become a builtin_function_or_method object in eval state. The source of zero_grad is as follows:
def zero_grad(self, set_to_none: bool = False) -> None:
    if getattr(self, '_is_replica', False):
        warnings.warn(
            "Calling .zero_grad() from a module created with nn.DataParallel() has no effect. "
            "The parameters are copied (in a differentiable manner) from the original module. "
            "This means they are not leaf nodes in autograd and so don't accumulate gradients. "
            "If you need gradients in your forward method, consider using autograd.grad instead.")

    for p in self.parameters():
        if p.grad is not None:
            if set_to_none:
                p.grad = None
            else:
                if p.grad.grad_fn is not None:
                    p.grad.detach_()
                else:
                    p.grad.requires_grad_(False)
                p.grad.zero_()
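The two branches of that loop (zero in place by default, or drop the buffer with set_to_none=True) can be sketched in plain Python with a tiny stand-in class instead of real Parameters; the autograd details (detach_, requires_grad_) are omitted here:

```python
class Param:
    """Stand-in for a torch Parameter: only carries a .grad field."""
    def __init__(self, grad=None):
        self.grad = grad

def zero_grad_sketch(params, set_to_none=False):
    # Mirrors the loop in Module.zero_grad above.
    for p in params:
        if p.grad is not None:
            if set_to_none:
                p.grad = None                 # drop the gradient buffer
            else:
                p.grad = [0.0] * len(p.grad)  # stands in for p.grad.zero_()

a, b = Param([1.0, 2.0]), Param([3.0])
zero_grad_sketch([a, b])
print(a.grad, b.grad)    # [0.0, 0.0] [0.0]
zero_grad_sketch([a, b], set_to_none=True)
print(a.grad, b.grad)    # None None
```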
As you can see, it mainly calls module.parameters() to get an iterator over all parameters and zeros them one by one. Let's look at the code of parameters():
def parameters(self, recurse: bool = True) -> Iterator[Parameter]:
    for name, param in self.named_parameters(recurse=recurse):
        yield param
This in turn delegates to self.named_parameters() for the iterator, so let's look at that function:
def named_parameters(self, prefix: str = '', recurse: bool = True) -> Iterator[Tuple[str, Tensor]]:
    gen = self._named_members(
        lambda module: module._parameters.items(),
        prefix=prefix, recurse=recurse)
    for elem in gen:
        yield elem
Next, _named_members:
def _named_members(self, get_members_fn, prefix='', recurse=True):
    memo = set()
    modules = self.named_modules(prefix=prefix) if recurse else [(prefix, self)]
    for module_prefix, module in modules:
        members = get_members_fn(module)
        for k, v in members:
            if v is None or v in memo:
                continue
            memo.add(v)
            name = module_prefix + ('.' if module_prefix else '') + k
            yield name, v
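The memo set in _named_members is what makes shared (tied) parameters appear only once. A minimal pure-Python sketch of that loop, with dicts and strings standing in for modules and tensors (all names here are hypothetical):

```python
def named_members_sketch(modules, get_members_fn):
    """Mimics Module._named_members: dedup members via a memo set."""
    memo = set()
    for module_prefix, module in modules:
        for k, v in get_members_fn(module):
            if v is None or v in memo:
                continue  # skip missing and already-seen members
            memo.add(v)
            name = module_prefix + ('.' if module_prefix else '') + k
            yield name, v

shared = "tied_weight"  # the same object registered under two modules
modules = [
    ("encoder", {"weight": shared}),
    ("decoder", {"weight": shared, "bias": "decoder_bias"}),
]
names = [n for n, _ in named_members_sketch(modules, lambda m: m.items())]
print(names)  # ['encoder.weight', 'decoder.bias'] -- the tied weight is yielded once
```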
Next, named_modules:
def named_modules(self, memo: Optional[Set['Module']] = None, prefix: str = ''):
    if memo is None:
        memo = set()
    if self not in memo:
        memo.add(self)
        yield prefix, self
        for name, module in self._modules.items():
            if module is None:
                continue
            submodule_prefix = prefix + ('.' if prefix else '') + name
            for m in module.named_modules(memo, submodule_prefix):
                yield m
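The failure can be reproduced without torch by putting a built-in function where a submodule is expected; FakeModule here is a hypothetical stand-in, not a PyTorch class:

```python
class FakeModule:
    """Minimal stand-in for nn.Module: just _modules and named_modules."""
    def __init__(self):
        self._modules = {}

    def named_modules(self, memo=None, prefix=''):
        if memo is None:
            memo = set()
        if self not in memo:
            memo.add(self)
            yield prefix, self
            for name, module in self._modules.items():
                if module is None:
                    continue
                submodule_prefix = prefix + ('.' if prefix else '') + name
                # Recurses blindly: anything stored in _modules must itself
                # support named_modules, or this line raises AttributeError.
                yield from module.named_modules(memo, submodule_prefix)

root = FakeModule()
root._modules['relu'] = abs  # a builtin stands in for the broken ReLU
try:
    list(root.named_modules())
except AttributeError as e:
    print(e)  # 'builtin_function_or_method' object has no attribute 'named_modules'
```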
There it is: named_modules is a recursive function that walks all of its submodules and yields each one. When the recursion reaches a builtin_function_or_method, that object has no _modules attribute (and no named_modules method), hence the error. Why ReLU ends up in this state in eval mode I am still not sure, but by mimicking this function we can write our own gradient zeroing:
def my_grad_zero(my_model):
    """@Qian2333"""
    try:
        order_dict = my_model._modules
    except AttributeError:
        # Hit a builtin_function_or_method (or anything that is not a
        # Module): nothing to recurse into, nothing to zero.
        return
    if len(order_dict) == 0:
        # Leaf module: clear its gradients directly (bias as well as weight).
        for attr in ('weight', 'bias'):
            param = getattr(my_model, attr, None)
            if param is not None:
                param.grad = None
        return
    for name in order_dict:
        my_grad_zero(order_dict[name])
Calling this function instead of zero_grad resolves the problem.
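As a quick sanity check that does not need torch, the traversal can be exercised on stand-in objects (Param, Leaf, and Container are hypothetical mocks, not PyTorch classes; the function is repeated so the sketch runs standalone):

```python
def my_grad_zero(my_model):
    """Same traversal as above, repeated for a self-contained run."""
    try:
        order_dict = my_model._modules
    except AttributeError:
        return  # builtin_function_or_method: nothing to zero
    if len(order_dict) == 0:
        for attr in ('weight', 'bias'):
            param = getattr(my_model, attr, None)
            if param is not None:
                param.grad = None
        return
    for name in order_dict:
        my_grad_zero(order_dict[name])

class Param:
    def __init__(self):
        self.grad = 'stale-gradient'

class Leaf:
    """Mock leaf module: empty _modules plus weight/bias parameters."""
    def __init__(self):
        self._modules = {}
        self.weight = Param()
        self.bias = Param()

class Container:
    """Mock container: submodules only, including a broken builtin entry."""
    def __init__(self):
        self._modules = {'fc': Leaf(), 'act': abs}  # abs mimics the broken ReLU

model = Container()
my_grad_zero(model)  # does not raise on the builtin entry
print(model._modules['fc'].weight.grad)  # None
```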