Python memory leaks
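I. Python has automatic garbage collection (the interpreter frees an object's memory as soon as its reference count drops to zero). Memory leaks therefore usually come from a leaking extension library or from reference cycles (a third case is objects left in a global container and never removed).
The former needs no discussion; an example of the latter follows (the memory of Obj('B') and Obj('c') is not reclaimed). It seems that memory caught in a reference cycle is also reclaimed by the interpreter itself (mark-and-sweep garbage collection), just sooner or later, which would mean we don't need to spend effort deliberately avoiding reference cycles when coding. The exception, covered in the gc section below, is a cycle whose objects define __del__(). I'll look into the details over the next couple of days (http://stackoverflow.com/questions/4484167/details-how-python-garbage-collection-works).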
[dongsong@localhost python_study]$ cat leak_test2.py
#encoding=utf-8
class Obj:
    def __init__(self,name='A'):
        self.name = name
        print '%s inited' % self.name
    def __del__(self):
        print '%s deleted' % self.name

if __name__ == '__main__':
    a = Obj('A')
    b = Obj('B')
    c = Obj('c')
    # b and c reference each other, forming a reference cycle
    c.attrObj = b
    b.attrObj = c
[dongsong@localhost python_study]$ vpython leak_test2.py
A inited
B inited
c inited
A deleted
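The same effect can be observed without relying on print output. This is a minimal sketch for Python 3 on CPython (the script above is Python 2); the class name Node and the weak reference used as a liveness probe are my own choices for illustration, not part of the original script. On Python 3.4 and later (PEP 442) the collector can reclaim such a cycle even when the class defines __del__(), so the sketch only shows that reference counting alone does not free the cycle and that an explicit collection does.

```python
import gc
import weakref


class Node(object):
    """Hypothetical stand-in for Obj in the script above."""

    def __init__(self, name):
        self.name = name
        self.attrObj = None


gc.disable()  # keep the automatic collector out of the way for the demo

b = Node('B')
c = Node('c')
c.attrObj = b  # b and c now reference each other: a reference cycle
b.attrObj = c

probe = weakref.ref(b)  # a weak reference does not keep b alive
del b, c                # drop our names; the cycle still holds both objects

# Reference counting alone cannot free the cycle.
alive_after_del = probe() is not None

# Run the cycle collector explicitly; it returns the number of
# unreachable objects it found.
found = gc.collect()
alive_after_collect = probe() is not None

gc.enable()
print(alive_after_del, alive_after_collect, found)
```

The exact count returned by gc.collect() depends on the interpreter version, so it is better treated as "at least two" than as a fixed number.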
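II. The objgraph module
This module can find the fastest-growing object types and the most numerous object types, draw a graph of everything an object refers to and of everything that refers to an object, and look up an object by its address.
Even so, hunting for a memory leak with it still feels a bit like looking for a needle in a haystack: you have to pick out the suspicious objects yourself from the logs of fastest-growing and most numerous types. Usually these are common types such as list/dict/tuple, which are hard to trace; if the top entries are unusual custom types, the cause is much easier to pin down.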
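To make "fastest-growing types" concrete, here is a rough stdlib-only sketch of that kind of bookkeeping: count the objects tracked by the collector by type name before and after some work, then report the difference. It is not objgraph's implementation. The helper names type_counts and growth, the class Leaky and the list _registry are all hypothetical, and the code targets Python 3.

```python
import gc
from collections import Counter


def type_counts():
    """Count the objects currently tracked by the collector, by type name."""
    return Counter(type(o).__name__ for o in gc.get_objects())


def growth(before, after, limit=3):
    """Return (type name, increase) pairs for the types that grew the most."""
    delta = Counter()
    for name in after:
        diff = after[name] - before[name]  # missing keys count as 0
        if diff > 0:
            delta[name] = diff
    return delta.most_common(limit)


class Leaky(object):
    """Hypothetical custom type that we 'leak' on purpose."""


_registry = []  # a global container that is never cleaned up

before = type_counts()
_registry.extend(Leaky() for _ in range(1000))
after = type_counts()

report = growth(before, after)
for name, diff in report:
    print('%-20s +%d' % (name, diff))
```

A custom type such as Leaky stands out immediately in a report like this, whereas a leak made of plain lists or dicts would be buried among the interpreter's own objects, which is the difficulty described above.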
1. show_refs(), show_backrefs(), show_most_common_types(), show_growth()
[dongsong@localhost python_study]$ !cat
cat objgraph1.py
#encoding=utf-8
import objgraph

if __name__ == '__main__':
    x = []
    y = [x, [x], dict(x=x)]
    objgraph.show_refs([y], filename='/tmp/sample-graph.png') # draw every object referenced from [y]
    objgraph.show_backrefs([x], filename='/tmp/sample-backref-graph.png') # draw every reference to x
    #objgraph.show_most_common_types() # counts for all the common object types; too much data to be of much use
    objgraph.show_growth(limit=4) # print the types whose counts grew since program start or since the previous show_growth() call (sorted by increase)
[dongsong@localhost python_study]$ !vpython
vpython objgraph1.py
Graph written to /tmp/tmpuSFr9A.dot (5 nodes)
Image generated as /tmp/sample-graph.png
Graph written to /tmp/tmpAn6niV.dot (7 nodes)
Image generated as /tmp/sample-backref-graph.png
tuple                          3393     +3393
wrapper_descriptor              945      +945
function                        830      +830
builtin_function_or_method      622      +622
[Figure: sample-graph.png]
[Figure: sample-backref-graph.png]
2. show_chain()
[dongsong@localhost python_study]$ cat objgraph2.py
#encoding=utf-8
import objgraph, inspect, random
class MyBigFatObject(object):
    pass

def computate_something(_cache = {}):
    # the mutable default argument outlives every call, so these two objects stay alive
    _cache[42] = dict(foo=MyBigFatObject(),bar=MyBigFatObject())
    x = MyBigFatObject() # local only: released when the function returns

if __name__ == '__main__':
    objgraph.show_growth(limit=3)
    computate_something()
    objgraph.show_growth(limit=3)
    # draw the chain of references leading from a module to one surviving MyBigFatObject
    objgraph.show_chain(
        objgraph.find_backref_chain(random.choice(objgraph.by_type('MyBigFatObject')),
                                    inspect.ismodule),
        filename = '/tmp/chain.png')
    #roots = objgraph.get_leaking_objects()
    #print 'len(roots)=%d' % len(roots)
    #objgraph.show_most_common_types(objects = roots)
    #objgraph.show_refs(roots[:3], refcounts=True, filename='/tmp/roots.png')
[dongsong@localhost python_study]$ !vpython
vpython objgraph2.py
tuple 3400 +3400
wrapper_descriptor 945 +945
function 831 +831
wrapper_descriptor 956 +11
tuple 3406 +6
member_descriptor 165 +4
Graph written to /tmp/tmpklkHqC.dot (7 nodes)
Image generated as /tmp/chain.png
[Figure: chain.png]
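When objgraph is not installed, a single step of such a back-reference chain can be inspected with the standard gc module. This is only a sketch for Python 3 on CPython: it reuses the mutable-default-argument cache from the script above, while the weak reference and the variable names local_probe, cache, leaked and holders are assumptions added for illustration. gc.get_referrers() returns the objects that refer directly to its argument.

```python
import gc
import weakref


class MyBigFatObject(object):
    pass


def computate_something(_cache={}):
    # The default dict is created once and lives as long as the function,
    # so everything stored in it survives every call.
    _cache[42] = dict(foo=MyBigFatObject(), bar=MyBigFatObject())
    x = MyBigFatObject()   # purely local: freed when the function returns
    return weakref.ref(x)  # weak reference, used only to observe x


local_probe = computate_something()

# The cache is reachable through the function's default arguments.
cache = computate_something.__defaults__[0]
leaked = cache[42]['foo']

# One step back along the chain: which dicts refer to the leaked object?
holders = [r for r in gc.get_referrers(leaked) if isinstance(r, dict)]
held_by_cache_entry = any(r is cache[42] for r in holders)

print(local_probe() is None)  # the local object is already gone
print(held_by_cache_entry)    # the dict stored in the cache holds the leak
```

Repeating the get_referrers() step on each holder walks further back towards the module, which is roughly the walk that find_backref_chain() automates.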
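III. The gc module
This module can identify the objects the garbage collector finds unreachable and the ones it cannot free (uncollectable), which gives it strengths of its own compared with objgraph.
gc.collect() forces a garbage collection and returns the number of unreachable objects it found.
gc.garbage is the list of unreachable objects that are also uncollectable (objects that have a __del__() destructor and are caught in a reference cycle). If DEBUG_SAVEALL is set, then all unreachable objects will be added to this list rather than freed.
Warning: if you turn automatic garbage collection off with gc.disable() and then never call gc.collect() yourself, cyclic garbage piles up and you will watch memory being eaten away....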
[dongsong@bogon python_study]$ cat gc_test.py
#encoding=utf-8
import gc
class MyObj:
    def __init__(self, name):
        self.name = name
        print "%s inited" % self.name
    def __del__(self):
        print "%s deleted" % self.name

if __name__ == '__main__':
    gc.disable()
    gc.set_debug(gc.DEBUG_COLLECTABLE | gc.DEBUG_UNCOLLECTABLE | gc.DEBUG_INSTANCES | gc.DEBUG_OBJECTS | gc.DEBUG_SAVEALL)
    a = MyObj('a')
    b = MyObj('b')
    c = MyObj('c')
    # a and b reference each other; c is not part of any cycle
    a.attr = b
    b.attr = a
    a = None
    b = None
    c = None
    if gc.isenabled():
        print 'automatic collection is enabled'
    else:
        print 'automatic collection is disabled'
    rt = gc.collect()
    print "%d unreachable" % rt
    garbages = gc.garbage
    print "\n%d garbages:" % len(garbages)
    for garbage in garbages:
        if isinstance(garbage, MyObj):
            print "obj-->%s name-->%s attrrMyObj-->%s" % (garbage, garbage.name, garbage.attr)
        else:
            print str(garbage)
[dongsong@bogon python_study]$ vpython gc_test.py
a inited
b inited
c inited
c deleted
automatic collection is disabled
gc: uncollectable <MyObj instance at 0x7f3ebd455b48>
gc: uncollectable <MyObj instance at 0x7f3ebd455b90>
gc: uncollectable <dict 0x261c4b0>
gc: uncollectable <dict 0x261bdf0>
4 unreachable
4 garbages:
obj--><__main__.MyObj instance at 0x7f3ebd455b48> name-->a attrrMyObj--><__main__.MyObj instance at 0x7f3ebd455b90>
obj--><__main__.MyObj instance at 0x7f3ebd455b90> name-->b attrrMyObj--><__main__.MyObj instance at 0x7f3ebd455b48>
{'name': 'a', 'attr': <__main__.MyObj instance at 0x7f3ebd455b90>}
{'name': 'b', 'attr': <__main__.MyObj instance at 0x7f3ebd455b48>}
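The run above is from Python 2, where a cycle whose objects define __del__() is uncollectable. On Python 3.4 and later (PEP 442) the collector can normally finalize and free such a cycle. The sketch below is a Python 3 variant of the same experiment, assuming CPython; the class name Tracked and the list deleted are hypothetical helpers used to record what happens instead of printing from the destructor.

```python
import gc

deleted = []  # records which finalizers have run


class Tracked(object):
    def __init__(self, name):
        self.name = name
        self.attr = None

    def __del__(self):
        deleted.append(self.name)


gc.disable()       # no automatic collection during the demo
gc.collect()       # start from a clean state
del gc.garbage[:]  # forget anything saved by earlier collections

a = Tracked('a')
b = Tracked('b')
a.attr = b         # a and b form a reference cycle
b.attr = a
a = None
b = None

before = list(deleted)      # nothing finalized yet: refcounts never hit zero
unreachable = gc.collect()  # the cycle collector finds the pair
after = sorted(deleted)     # with PEP 442 both finalizers have now run
leftover = [g for g in gc.garbage if isinstance(g, Tracked)]

gc.enable()
print(before, unreachable, after, len(leftover))
```

If leftover is empty on your interpreter, the cycle was freed rather than parked in gc.garbage, which is the practical difference from the Python 2 output shown above.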
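IV. The pdb module
The commands are much the same as gdb's (except that you don't have to prefix an expression with p to print it, and the debugging interface behaves like Python's interactive mode).
h(elp) help
c(ontinue) continue execution
n(ext) next statement
s(tep) step (into the function being called)
b(reak) set a breakpoint
l(ist) show source code
bt call stack
Enter repeat the previous command
....
I like to drop pdb.set_trace() at the spot I want to debug and take it from there.... (there are plenty of other ways to start the debugger as well)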
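As a small illustration of that habit, the sketch below puts pdb.set_trace() behind a flag so the function runs normally unless debugging is requested. The function name average and its debug parameter are hypothetical; only pdb.set_trace() itself comes from the text above.

```python
import pdb


def average(values, debug=False):
    """Return the arithmetic mean of values (hypothetical example function)."""
    total = sum(values)
    if debug:
        # Execution pauses here and the (Pdb) prompt appears; try
        # "p total", "n", "bt", then "c" to continue.
        pdb.set_trace()
    return total / float(len(values))


# Runs straight through; call average([1, 2, 3], debug=True) to stop in pdb.
print(average([1, 2, 3]))
```

Guarding the call this way means a forgotten set_trace() cannot halt a non-interactive run, at the cost of one extra parameter.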
V. Django memory leaks
Django isn't known to leak memory. If you find your Django processes are allocating more and more memory, with no sign of releasing it, check to make sure your DEBUG setting is set to False. If DEBUG is True, then Django saves a copy of every SQL statement it has executed.
To fix the problem, set DEBUG to False.
If you need to clear the query list manually at any point in your functions, just call reset_queries(), like this:
from django import db
db.reset_queries()