
Parsing pcap files with scapy: a summary


References

http://blog.csdn.net/meanong/article/details/53942116

https://github.com/invernizzi/scapy-http

Contents:

  1. Problems with scapy and their solutions

    1.1 Memory leak

    1.2 Solution

    1.3 No HTTP protocol parsing

    1.4 Solution

  2. A pcap parsing example

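1. Problems with scapy and their solutions

1.1 Memory leak

  Reading the data in a pcap file with rdpcap() leaks memory; after a handful of files, memory may fill up.

  The cause is that, in the scapy version examined here, the library opens the file when it reads a pcap but never closes it, and the caller has no way to close it either, so the resources held by the open file are never released.

  The source of rdpcap() is as follows: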

def rdpcap(filename, count=-1):
    """Read a pcap file and return a packet list
    count: read only <count> packets"""
    return PcapReader(filename).read_all(count=count)

  The read_all() method of the PcapReader class is inherited from its parent class RawPcapReader. Its source is:

def read_all(self,count=-1):
    """return a list of all packets in the pcap file
    """
    res=[]
    while count != 0:
        count -= 1
        p = self.read_packet()
        if p is None:
            break
        res.append(p)
    return res

  Looking further at its __init__() method shows that it contains the code that opens the file, but nothing that ever closes it.

def __init__(self, filename):
    self.filename = filename
    try:
        self.f = gzip.open(filename,"rb")
        magic = self.f.read(4)
    except IOError:
        self.f = open(filename,"rb")
        magic = self.f.read(4)
    if magic == "\xa1\xb2\xc3\xd4": #big endian
        self.endian = ">"
    elif  magic == "\xd4\xc3\xb2\xa1": #little endian
        self.endian = "<"
    else:
        raise Scapy_Exception("Not a pcap capture file (bad magic)")
    hdr = self.f.read(20)
    if len(hdr)<20:
        raise Scapy_Exception("Invalid pcap file (too short)")
    vermaj,vermin,tz,sig,snaplen,linktype = struct.unpack(self.endian+"HHIIII",hdr)

    self.linktype = linktype
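
  The constructor above stores the open handle on the instance, so nothing releases it unless the caller does. For comparison, here is a minimal standard-library sketch (not scapy's code) that reads the same 24-byte pcap global header inside a with block, so the handle is closed as soon as the header has been read. The name read_pcap_header is hypothetical, and gzip-compressed captures are not handled.

```python
import struct

# Magic bytes as they appear at the start of the file, mapped to the
# struct byte-order prefix used for the rest of the header.
PCAP_MAGIC = {
    b"\xa1\xb2\xc3\xd4": ">",  # big endian
    b"\xd4\xc3\xb2\xa1": "<",  # little endian
}


def read_pcap_header(path):
    """Return (endian, version_major, version_minor, snaplen, linktype)."""
    # The with block closes the file even if a read fails.
    with open(path, "rb") as f:
        magic = f.read(4)
        hdr = f.read(20)
    endian = PCAP_MAGIC.get(magic)
    if endian is None:
        raise ValueError("Not a pcap capture file (bad magic)")
    if len(hdr) < 20:
        raise ValueError("Invalid pcap file (too short)")
    vermaj, vermin, tz, sig, snaplen, linktype = struct.unpack(
        endian + "HHIIII", hdr
    )
    return endian, vermaj, vermin, snaplen, linktype
```

  Calling read_pcap_header() on a capture returns the byte order and link type without leaving a file descriptor behind.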

  

1.2 Solution

1.2.1 Modify the source

  The relevant source all lives in scapy/utils.py. Only the body of rdpcap() needs to change; the modified version is:

def rdpcap(filename, count=-1):
    """Read a pcap or pcapng file and return a packet list
    count: read only <count> packets
    """
    pcap = PcapReader(filename)
    data = pcap.read_all(count=count)
    pcap.close()
    return data

  Closing the pcap instance also closes the file it opened.
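
  If you would rather not patch the installed library, the same fix can live in your own code. The sketch below is one way to do it, assuming only that the reader object has read_all() and close() as shown above. The helper name read_pcap and the reader_cls parameter are hypothetical; the parameter exists so the helper can be exercised without scapy installed. contextlib.closing also closes the reader if read_all() raises, which the patched rdpcap() above does not.

```python
from contextlib import closing


def read_pcap(path, count=-1, reader_cls=None):
    """Read up to `count` packets and always close the reader."""
    if reader_cls is None:
        # Imported lazily; assumes scapy is installed when no class is given.
        from scapy.utils import PcapReader as reader_cls
    # closing() calls reader.close() on exit, including on exceptions.
    with closing(reader_cls(path)) as reader:
        return reader.read_all(count=count)
```

  With scapy installed, read_pcap('capture.pcap') behaves like rdpcap() but releases the file before returning; the file name here is only a placeholder.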

1.2.2 Read the pcap file without rdpcap()

  

from scapy.all import PcapReader

pr = PcapReader('E:/HTTP/Code/data/group1.pcap')
while True:
    packet = pr.read_packet()
    if packet is None:  # end of file in the scapy version used here
        break
    # TODO: process the packet here
pr.close()

  Here we instantiate a PcapReader object directly, read the packets one at a time, and close the instance at the end.
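
  The read-one-then-close pattern can be wrapped in a generator so the caller cannot forget the close() call. This is a sketch under assumptions, not part of scapy: iter_packets and reader_cls are hypothetical names, and the loop accepts both end-of-file conventions, the None return value shown above and the EOFError that newer scapy releases raise instead.

```python
def iter_packets(path, reader_cls=None):
    """Yield packets one at a time and close the reader when done."""
    if reader_cls is None:
        # Imported lazily; assumes scapy is installed when no class is given.
        from scapy.utils import PcapReader as reader_cls
    reader = reader_cls(path)
    try:
        while True:
            try:
                packet = reader.read_packet()
            except EOFError:
                # Newer releases signal end of file with an exception.
                break
            if packet is None:
                # Older releases return None at end of file.
                break
            yield packet
    finally:
        # Runs when the file is exhausted, on error, or when the
        # generator itself is closed or discarded.
        reader.close()
```

  Only one packet is held at a time, so memory use stays flat however large the capture is.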

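1.3 No HTTP protocol parsing

  A look at scapy/layers shows that scapy has no HTTP support. I tried dissecting some packets that carry HTTP, and the payload was always parsed as Raw.

  For example: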

  

<Ether  dst=00:60:97:de:54:36 src=00:00:0c:04:41:bc type=0x800 |<IP  version=4L ihl=5L tos=0x0 len=260 id=38843 flags=DF frag=0L ttl=63 proto=tcp chksum=0xce48 src=172.16.113.84 dst=206.161.232.233 options=[] |<TCP  sport=13002 dport=http seq=138591001 ack=2983383564L dataofs=5L reserved=0L flags=PA window=32120 chksum=0xa1c5 urgptr=0 options=[] |<Raw  load='GET /autoplus/autoplus.gif HTTP/1.0\r\nReferer: http://www.bostonian.com/\r\nUser-Agent: Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4u)\r\nHost: www.bostonian.com\r\nAccept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*\r\n\r\n' |>>>>
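
  The Raw load is still just the text of the HTTP request, so it can be split by hand when no HTTP layer is available. The following is a rough standard-library sketch, not how scapy or scapy-http does it: parse_http_request is a hypothetical helper, it assumes the whole request head sits in a single packet, and it raises ValueError when the first line is not a three-part request line.

```python
def parse_http_request(load):
    """Split raw HTTP request bytes into (method, path, version, headers)."""
    # Everything before the first blank line is the request head.
    head = load.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    # Request line, e.g. "GET /index.html HTTP/1.0".
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:  # skip lines that are not "Name: value"
            headers[name.strip()] = value.strip()
    return method, path, version, headers
```

  Applied to the load shown above, this yields the method GET, the path /autoplus/autoplus.gif and a dictionary holding the Host, Referer, User-Agent and Accept headers.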

  

1.4 Solution

  Someone has written an extension for scapy that adds this; the project is at https://github.com/invernizzi/scapy-http

  It can be installed with pip:

  

pip install scapy-http

  Then import the http layer in your program, and HTTP will be parsed:

import scapy.all as scapy
from scapy.layers import http

  

2. A pcap parsing example

# -*- coding:utf-8 -*-
import scapy.all as scapy
from scapy.layers import http

# Read every packet from the pcap file
packages = scapy.rdpcap('E:/HTTP/Code/data/group5.pcap')
print(packages)

  Printing the packet list directly outputs a summary of the packet types it contains:

<group5.pcap: TCP:1323 UDP:0 ICMP:0 Other:0>

  

for p in packages:
    print(repr(p))

  This prints the details of every packet:

<Ether  dst=00:60:97:de:54:36 src=00:00:0c:04:41:bc type=0x800 |<IP  version=4L ihl=5L tos=0x0 len=290 id=42742 flags=DF frag=0L ttl=63 proto=tcp chksum=0xb263 src=172.16.113.84 dst=167.8.29.15 options=[] |<TCP  sport=31009 dport=http seq=1946692237 ack=672469499 dataofs=5L reserved=0L flags=PA window=32120 chksum=0xf91c urgptr=0 options=[] |<HTTP  |<HTTPRequest  Method=u'GET' Path=u'/leadpage/credit/credrib.gif' Http-Version=u'HTTP/1.0' Host=u'www.usatoday.com' User-Agent=u'Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4u)' Accept=u'image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*' Referer=u'http://www.usatoday.com/leadpage/credit/credit.htm' Headers=u'Host: www.usatoday.com\r\nReferer: http://www.usatoday.com/leadpage/credit/credit.htm\r\nAccept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*\r\nUser-Agent: Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4u)' |>>>>>
<Ether  dst=00:60:97:de:54:36 src=00:00:0c:04:41:bc type=0x800 |<IP  version=4L ihl=5L tos=0x0 len=281 id=42838 flags=DF frag=0L ttl=63 proto=tcp chksum=0xb20c src=172.16.113.84 dst=167.8.29.15 options=[] |<TCP  sport=31011 dport=http seq=4290014843L ack=3755501580L dataofs=5L reserved=0L flags=PA window=32120 chksum=0xe139 urgptr=0 options=[] |<HTTP  |<HTTPRequest  Method=u'GET' Path=u'/feedback/buyus.htm' Http-Version=u'HTTP/1.0' Host=u'www.usatoday.com' User-Agent=u'Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4u)' Accept=u'image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*' Referer=u'http://www.usatoday.com/leadpage/credit/credit.htm' Headers=u'Host: www.usatoday.com\r\nReferer: http://www.usatoday.com/leadpage/credit/credit.htm\r\nAccept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*\r\nUser-Agent: Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4u)' |>>>>>
<Ether  dst=00:60:97:de:54:36 src=00:00:0c:04:41:bc type=0x800 |<IP  version=4L ihl=5L tos=0x0 len=273 id=42847 flags=DF frag=0L ttl=63 proto=tcp chksum=0xb20b src=172.16.113.84 dst=167.8.29.15 options=[] |<TCP  sport=31018 dport=http seq=1370418497 ack=2248894642L dataofs=5L reserved=0L flags=PA window=32120 chksum=0xf812 urgptr=0 options=[] |<HTTP  |<HTTPRequest  Method=u'GET' Path=u'/inetart/scribe.gif' Http-Version=u'HTTP/1.0' Host=u'www.usatoday.com' User-Agent=u'Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4u)' Accept=u'image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*' Referer=u'http://www.usatoday.com/feedback/buyus.htm' Headers=u'Host: www.usatoday.com\r\nReferer: http://www.usatoday.com/feedback/buyus.htm\r\nAccept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*\r\nUser-Agent: Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4u)' |>>>>>

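  Each packet p has the following structure:

    link layer (Ether) --> network layer (IP) --> transport layer (TCP) --> application layer (HTTP)

  The data of each layer can be retrieved by that layer's protocol name, and an individual field of the layer by its field name: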

for p in packages:
    print(repr(p))
    print(p['Ether'].name)
    print(p['Ether'].dst)
    print(p['Ether'].src)

    print(p['IP'].name)
    print(p['IP'].dst)
    print(p['IP'].src)

    print(p['TCP'].name)
    print(p['TCP'].sport)
    print(p['TCP'].dport)

  The output is:

<Ether  dst=00:60:97:de:54:36 src=00:00:0c:04:41:bc type=0x800 |<IP  version=4L ihl=5L tos=0x0 len=204 id=7838 flags=DF frag=0L ttl=63 proto=tcp chksum=0x7ceb src=172.16.113.84 dst=207.46.179.15 options=[] |<TCP  sport=1751 dport=http seq=2094829820 ack=3501118724L dataofs=5L reserved=0L flags=PA window=32120 chksum=0x91d6 urgptr=0 options=[] |<HTTP  |<HTTPRequest  Method=u'GET' Path=u'/' Http-Version=u'HTTP/1.0' Host=u'home.microsoft.com' User-Agent=u'Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4u)' Accept=u'image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*' Headers=u'Host: home.microsoft.com\r\nAccept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*\r\nUser-Agent: Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4u)' |>>>>>
Ethernet
00:60:97:de:54:36
00:00:0c:04:41:bc
IP
207.46.179.15
172.16.113.84
TCP
1751
80
<Ether  dst=00:60:97:de:54:36 src=00:00:0c:04:41:bc type=0x800 |<IP  version=4L ihl=5L tos=0x0 len=256 id=7878 flags=DF frag=0L ttl=63 proto=tcp chksum=0x7f6b src=172.16.113.84 dst=207.46.176.51 options=[] |<TCP  sport=1814 dport=http seq=1495749711 ack=1496378949 dataofs=5L reserved=0L flags=PA window=32120 chksum=0xfd2b urgptr=0 options=[] |<HTTP  |<HTTPRequest  Method=u'GET' Path=u'/mps_id_sharing/redirect.asp?home.microsoft.com/Default.asp' Http-Version=u'HTTP/1.0' Host=u'msid.msn.com' User-Agent=u'Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4u)' Accept=u'image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*' Headers=u'Host: msid.msn.com\r\nAccept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*\r\nUser-Agent: Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4u)' |>>>>>
Ethernet
00:60:97:de:54:36
00:00:0c:04:41:bc
IP
207.46.176.51
172.16.113.84
TCP
1814
80
<Ether  dst=00:60:97:de:54:36 src=00:00:0c:04:41:bc type=0x800 |<IP  version=4L ihl=5L tos=0x0 len=256 id=7888 flags=DF frag=0L ttl=63 proto=tcp chksum=0x7c85 src=172.16.113.84 dst=207.46.179.15 options=[] |<TCP  sport=1876 dport=http seq=3929480773L ack=300743146 dataofs=5L reserved=0L flags=PA window=32120 chksum=0x7f03 urgptr=0 options=[] |<HTTP  |<HTTPRequest  Method=u'GET' Path=u'/Default.asp?newguid=1e5bf4633b9f11d2a26600805fb7e334' Http-Version=u'HTTP/1.0' Host=u'home.microsoft.com' User-Agent=u'Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4u)' Accept=u'image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*' Headers=u'Host: home.microsoft.com\r\nAccept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*\r\nUser-Agent: Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4u)' |>>>>>
Ethernet
00:60:97:de:54:36
00:00:0c:04:41:bc
IP
207.46.179.15
172.16.113.84
TCP
1876
80
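
  To see what those per-layer lookups correspond to on the wire, the same fields can be pulled out of the raw frame bytes with the standard library. This is an illustrative sketch only, not scapy's dissector: decode_headers is a hypothetical helper, and it assumes an untagged Ethernet frame carrying IPv4 and TCP that is long enough to hold all three headers.

```python
import socket
import struct


def mac(raw):
    """Format 6 raw bytes as a colon-separated MAC address."""
    return ":".join("%02x" % b for b in raw)


def decode_headers(frame):
    """Return a dict of selected Ether/IP/TCP fields from raw frame bytes."""
    # Ethernet header: destination MAC, source MAC, EtherType (14 bytes).
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    info = {"Ether": {"dst": mac(dst), "src": mac(src), "type": ethertype}}
    if ethertype != 0x0800:  # not IPv4
        return info
    ip = frame[14:]
    ihl = (ip[0] & 0x0F) * 4  # IPv4 header length in bytes
    proto = ip[9]
    info["IP"] = {
        "src": socket.inet_ntoa(ip[12:16]),
        "dst": socket.inet_ntoa(ip[16:20]),
        "proto": proto,
    }
    if proto != 6:  # not TCP
        return info
    # TCP header starts right after the IP header: source and destination port.
    sport, dport = struct.unpack("!HH", ip[ihl:ihl + 4])
    info["TCP"] = {"sport": sport, "dport": dport}
    return info
```

  Each key of the returned dictionary plays the role of one layer name in p['Ether'], p['IP'] and p['TCP'].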

  

  Another important attribute is payload, which returns the data of the next layer up. For example:

for p in packages:
    print(repr(p))

    print(p.name)
    print(p.payload.name)
    print(p.payload.payload.name)

  This prints the protocol names used by the first three layers:

<Ether  dst=00:60:97:de:54:36 src=00:00:0c:04:41:bc type=0x800 |<IP  version=4L ihl=5L tos=0x0 len=204 id=7838 flags=DF frag=0L ttl=63 proto=tcp chksum=0x7ceb src=172.16.113.84 dst=207.46.179.15 options=[] |<TCP  sport=1751 dport=http seq=2094829820 ack=3501118724L dataofs=5L reserved=0L flags=PA window=32120 chksum=0x91d6 urgptr=0 options=[] |<HTTP  |<HTTPRequest  Method=u'GET' Path=u'/' Http-Version=u'HTTP/1.0' Host=u'home.microsoft.com' User-Agent=u'Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4u)' Accept=u'image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*' Headers=u'Host: home.microsoft.com\r\nAccept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*\r\nUser-Agent: Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4u)' |>>>>>
Ethernet
IP
TCP
<Ether  dst=00:60:97:de:54:36 src=00:00:0c:04:41:bc type=0x800 |<IP  version=4L ihl=5L tos=0x0 len=256 id=7878 flags=DF frag=0L ttl=63 proto=tcp chksum=0x7f6b src=172.16.113.84 dst=207.46.176.51 options=[] |<TCP  sport=1814 dport=http seq=1495749711 ack=1496378949 dataofs=5L reserved=0L flags=PA window=32120 chksum=0xfd2b urgptr=0 options=[] |<HTTP  |<HTTPRequest  Method=u'GET' Path=u'/mps_id_sharing/redirect.asp?home.microsoft.com/Default.asp' Http-Version=u'HTTP/1.0' Host=u'msid.msn.com' User-Agent=u'Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4u)' Accept=u'image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*' Headers=u'Host: msid.msn.com\r\nAccept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*\r\nUser-Agent: Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4u)' |>>>>>
Ethernet
IP
TCP
<Ether  dst=00:60:97:de:54:36 src=00:00:0c:04:41:bc type=0x800 |<IP  version=4L ihl=5L tos=0x0 len=256 id=7888 flags=DF frag=0L ttl=63 proto=tcp chksum=0x7c85 src=172.16.113.84 dst=207.46.179.15 options=[] |<TCP  sport=1876 dport=http seq=3929480773L ack=300743146 dataofs=5L reserved=0L flags=PA window=32120 chksum=0x7f03 urgptr=0 options=[] |<HTTP  |<HTTPRequest  Method=u'GET' Path=u'/Default.asp?newguid=1e5bf4633b9f11d2a26600805fb7e334' Http-Version=u'HTTP/1.0' Host=u'home.microsoft.com' User-Agent=u'Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4u)' Accept=u'image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*' Headers=u'Host: home.microsoft.com\r\nAccept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*\r\nUser-Agent: Mozilla/3.01 (X11; I; SunOS 4.1.4 sun4u)' |>>>>>
Ethernet
IP
TCP
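
  Chaining .payload by hand only reaches a fixed depth. A small loop can walk the whole chain instead. The sketch below assumes, as in scapy, that every layer exposes name and payload and that the chain ends in an object whose class is called NoPayload. layer_names is a hypothetical helper and works on any objects with that shape.

```python
def layer_names(packet):
    """Return the protocol name of every layer, outermost first."""
    names = []
    layer = packet
    # Stop at the end-of-chain marker (or at None for plain objects).
    while layer is not None and type(layer).__name__ != "NoPayload":
        names.append(layer.name)
        layer = getattr(layer, "payload", None)
    return names
```

  For the packets above, the first three entries would be the Ethernet, IP and TCP names already printed, followed by whatever names the HTTP layers report.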

  
