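[Repost] BitTorrent Protocol Specification
BitTorrent Protocol Specification (BT Protocol Collection), Part 1
BitTorrent is a protocol for distributing files. It identifies content by URL and is designed to integrate seamlessly with the web. Its advantage over plain HTTP is that when many downloaders fetch the same file concurrently, each downloader also uploads to the others, so the file source can serve a large number of users with only a modest increase in its own load. (Translator's note: most of the load is spread across the whole system, so the machine that provides the original file sees only a small increase.)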
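A BT file distribution system consists of the following entities:
An ordinary web server
A static "metainfo" file
A tracker server
The end users' web browsers
The end users' downloaders
Ideally, many end users are downloading the same file.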
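To start serving a file, a host goes through the following steps:
· Run a tracker server (or use one that is already running).
· Run a web server such as Apache (or use one that is already running).
· On the web server, associate the extension .torrent with the MIME type application/x-bittorrent (or have done so already).
· Create a "metainfo" (.torrent) file from the tracker's URL and the file to be shared.
· Publish the metainfo file on the web server.
· Add a link to the metainfo file on some web page.
· Run a downloader that already has the complete file (called the 'origin' or 'seed').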
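To start downloading, an end user does the following:
· Install BT (or have done so already).
· Visit the web server that offers the .torrent file.
· Click the link to the .torrent file (translator's note: BT then pops up a dialog box).
· Choose where to save the downloaded file, or select a partial download to resume.
· Wait for the download to complete.
· Exit the BT program (until the user does so, BT keeps uploading the file to other people).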
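The connectivity between the parts is as follows:
The web site serves a static file as usual and kicks off the BT helper application (the client) on the client machine.
The tracker receives information from all downloaders and returns a random list of peers to each of them. This exchange takes place over HTTP or HTTPS.
Downloaders periodically check in with the tracker so that it knows their progress, and they upload to and download from each other over direct connections. These connections use the BitTorrent peer protocol, which runs over TCP.
The origin only uploads and never downloads, because it already has the complete file. The origin is required to get the whole file into the network.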
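Both the metainfo file and tracker responses use a simple, efficient, and extensible format called bencoding. Bencoded messages are nested dictionaries and lists that can contain strings and integers. The format is extensible because unexpected dictionary keys are ignored, so additional optional keys can be added later.
Bencoding works as follows:
Strings are encoded as the length of the string in base ten, then a colon, then the string itself. For example, 4:spam corresponds to 'spam'.
Integers are encoded as an 'i', then the value in base ten, then an 'e'. For example, i3e corresponds to 3 and i-3e corresponds to -3. Integers have no size limit. i-0e is invalid. Any encoding with a leading zero, such as i03e, is invalid, except i0e, which of course corresponds to 0.
Lists are encoded as an 'l', then the elements (each also bencoded), then an 'e'. For example, l4:spam4:eggse corresponds to ['spam', 'eggs'].
Dictionaries are encoded as a 'd', then alternating keys and their corresponding values, then an 'e'. For example, d3:cow3:moo4:spam4:eggse corresponds to {'cow': 'moo', 'spam': 'eggs'}, and d4:spaml1:a1:bee corresponds to {'spam': ['a', 'b']}. Keys must be strings and must appear in sorted order (sorted as raw strings, not alphanumerically).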
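The four bencoding rules above can be sketched as a small encoder. This is a minimal illustration, not code from any BitTorrent client: the function names are hypothetical, and lists and dictionaries are assumed to receive values that have already been bencoded.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper names chosen for this sketch.

// Strings: <length in base ten>:<bytes>
std::string bencode_string(const std::string& s) {
    return std::to_string(s.size()) + ":" + s;
}

// Integers: i<base ten value>e. std::to_string never emits a leading
// zero or "-0", so the output obeys the validity rules above.
std::string bencode_int(std::int64_t v) {
    return "i" + std::to_string(v) + "e";
}

// Lists: l<already-bencoded elements>e
std::string bencode_list(const std::vector<std::string>& encoded_items) {
    std::string out = "l";
    for (const std::string& item : encoded_items) out += item;
    return out + "e";
}

// Dictionaries: d<key><value>...e with keys in sorted order.
// std::map<std::string, ...> iterates keys in ascending byte-wise order,
// which gives the "sorted as raw strings" ordering.
std::string bencode_dict(const std::map<std::string, std::string>& encoded_values) {
    std::string out = "d";
    for (const auto& kv : encoded_values) {
        out += bencode_string(kv.first) + kv.second;
    }
    return out + "e";
}
```

Building the dictionary from a std::map keeps the key ordering rule out of the caller's hands, which is the main design choice in this sketch.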
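A metainfo file is a bencoded dictionary with the following keys:
announce: the URL of the tracker.
info: a dictionary with the following keys:
name:
A string that suggests the name to save the file as. It is purely advisory; the file can be saved under any other name.
piece length:
For transfer, the file is split into pieces of equal length, except possibly the last one, which may be shorter. This value is the number of bytes in each piece. The piece length is almost always a power of two, most commonly 256 K (BT prior to version 3.2 used 1 M as the default).
pieces:
A string whose length is a multiple of 20. It is subdivided into 20-byte strings, each of which is the SHA1 hash of the piece at the corresponding index.
In addition, there is either a length key or a files key, but never both. If length is present, the download is a single file; if files is present, the download is a set of files in a directory.
In the single-file case, length is the length of the file in bytes.
For the purposes of the other keys, the multi-file case is treated as a single file formed by concatenating the files in the order in which they appear in the files list. Each entry in the files list is itself a dictionary with the following keys:
length: the length of the file in bytes.
path: a list of subdirectory names, the last of which is the actual file name (an empty list is not allowed).
name: in the single-file case, name is the name of the file; in the multi-file case, it is the name of the directory.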
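Tracker queries. The tracker receives information through HTTP GET parameters and responds to the downloader with a bencoded message. Note that although the current tracker implementation has its own web server, the tracker could run just as well as, for example, an Apache module.
Tracker GET requests have the following keys:
info_hash:
The 20-byte SHA1 hash of the info value from the metainfo file. This string will almost certainly have to be escaped (translator's note: some byte values cannot appear literally in a URL and must be percent-encoded).
peer_id:
The downloader's ID, a string of length 20. Each downloader generates this ID at random before starting a new download. This string will usually have to be escaped as well.
ip:
An optional parameter giving the IP address (or DNS name) of this peer. It is generally used for the origin when it runs on the same machine as the tracker.
port:
The port this peer is listening on. A downloader usually tries to listen on port 6881; if that port is taken it tries the following ports up to 6889, and gives up if all of them are taken.
uploaded:
The total amount uploaded so far, in base ten.
downloaded:
The total amount downloaded so far, in base ten.
left:
The number of bytes this peer still has to download, in base ten. Note that this value cannot be computed from the file length and the amount downloaded, because the download may be a resume, and because some downloaded data may have failed its integrity check and had to be downloaded again.
event:
An optional key whose value is started, completed, or stopped (or empty, which is the same as not being present). If the key is not present, the request is one of the announcements made at regular intervals. started is sent when a download first begins, and completed is sent when the download finishes. stopped is sent when the downloader ceases downloading.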
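Because info_hash and peer_id are raw 20-byte strings, they have to be escaped before they are placed in the GET query. The sketch below shows one way to do that. The function name is hypothetical, and the choice of which characters to leave unescaped (the RFC 3986 "unreserved" set) is an assumption of this example; the text above only says that escaping is needed.

```cpp
#include <string>

// Hypothetical helper: percent-encode every byte that is not an
// RFC 3986 unreserved character (an assumption of this sketch).
std::string url_escape(const std::string& raw) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : raw) {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '_' || c == '.' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];      // high nibble
            out += hex[c & 0x0F];    // low nibble
        }
    }
    return out;
}
```

A query would then be assembled as, for example, "?info_hash=" + url_escape(hash) + "&peer_id=" + url_escape(id), followed by the numeric keys in base ten.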
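The tracker response is a bencoded dictionary. If the response has a key failure reason, it maps to a string that explains why the query failed, and no other keys are required. Otherwise it must have two keys: interval, the number of seconds the downloader should wait between regular requests; and peers, a list of dictionaries, each with the keys peer id, ip, and port, which give the peer's self-selected ID, its IP address or DNS name, and its port number. Note that downloaders may send requests outside the regular schedule if an event happens or if they need more peers.
(The downloader sends its query to the tracker with an HTTP GET request, and the tracker responds with a list of peers.)
If you want to extend the metainfo file or tracker queries, please coordinate with Bram Cohen to make sure that all extensions are compatible.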
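The BT peer protocol runs over TCP. It performs well without any socket options being set. (Translator's note: the BT peer protocol is the protocol that peers use to exchange information with each other.)
Peer connections are symmetrical: the messages sent in the two directions look the same, and data can flow in either direction.
As soon as a peer finishes downloading a piece and has checked its integrity, it announces to all of its peers that it has that piece.
Each end of a connection holds two bits of state: choked or not, and interested or not. Choking is a notification that no data will be sent until unchoking happens. The reasoning and techniques behind choking are explained later.
Data transfer takes place whenever one side is interested and the other side is not choking. (In other words, a peer that wants data from another peer must first mark the connection between them as interested, which simply means sending a message. The other peer then checks whether it should send data to the requester: if it has unchoked the requester it may send data, otherwise it may not.) Interest state must be kept up to date at all times. This takes some care to implement properly, but it lets downloaders know immediately which peers will start downloading once they are unchoked.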
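The peer protocol consists of a handshake followed by a never-ending stream of length-prefixed messages. The handshake starts with the byte 19 followed by the string 'BitTorrent protocol'. 19 is the length of 'BitTorrent protocol'.
All later integers in the protocol are encoded as four bytes, big-endian.
After the protocol name come eight reserved bytes, which are currently all set to zero.
Next comes the 20-byte SHA1 hash of the info value from the metainfo file. The receiving side compares it with the hash of the info value it is serving; if the two differ, the other side wants a file that this side does not offer, so the connection is dropped.
Next comes the 20-byte peer id.
That completes the handshake.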
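The handshake layout described above adds up to 1 + 19 + 8 + 20 + 20 = 68 bytes. The following sketch assembles it; the function name and the use of std::array for the two 20-byte fields are choices made for this example only.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: build the 68-byte handshake described above.
std::vector<std::uint8_t> build_handshake(
        const std::array<std::uint8_t, 20>& info_hash,
        const std::array<std::uint8_t, 20>& peer_id) {
    static const std::string protocol = "BitTorrent protocol";
    std::vector<std::uint8_t> out;
    out.reserve(68);
    // Length prefix: 19, the length of the protocol name.
    out.push_back(static_cast<std::uint8_t>(protocol.size()));
    out.insert(out.end(), protocol.begin(), protocol.end());
    // Eight reserved bytes, currently all zero.
    out.insert(out.end(), std::size_t(8), std::uint8_t(0));
    // SHA1 hash of the info value, sent raw (not escaped).
    out.insert(out.end(), info_hash.begin(), info_hash.end());
    // The sender's 20-byte peer id.
    out.insert(out.end(), peer_id.begin(), peer_id.end());
    return out;
}
```

The receiving side can validate a handshake by checking the first 20 bytes against the same constant and comparing bytes 28 to 47 with the hash it is serving.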
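Next comes the stream of length-prefixed messages. Messages of length zero are keepalives and are ignored. Keepalives are generally sent once every two minutes.
All other messages start with a single byte that gives their type. The possible values are 0 (choke), 1 (unchoke), 2 (interested), 3 (not interested), 4 (have), 5 (bitfield), 6 (request), 7 (piece), and 8 (cancel).
'choke', 'unchoke', 'interested', and 'not interested' messages carry no payload.
'bitfield' is only ever sent as the first message. Its payload is a bitmap in which a bit is set to 1 if the downloader has the corresponding piece and to 0 otherwise. Downloaders that have no pieces yet may skip this message. (From this message the receiver learns which pieces the sender already has.)
'have' messages carry a single number: the index of the piece that the downloader has just finished downloading and checked. (Through these messages, peers quickly learn which pieces each of them has.)
'request' messages contain an index, a begin offset, and a length. The length is generally a power of two. Current implementations all use 2^15 and close connections that request a length greater than 2^17. (A peer sends this message when it wants another peer to send it part of a piece.)
'cancel' messages have the same payload as 'request' messages. They are generally sent only toward the end of a download, during what is called 'endgame mode'. When a download is almost complete, the last few pieces tend to take a long time to arrive. To make sure they arrive quickly, the downloader sends requests for them to all of its peers. To keep this from becoming horribly inefficient, it sends a 'cancel' to the other peers as soon as a piece arrives. (In other words: I no longer need this piece, so do not send it even if you have it ready. If the other side sends the data anyway, the receiver has to discard the duplicate.)
'piece' messages contain an index, a begin offset, and the actual data. Note that they are implicitly correlated with 'request' messages (translator's note: a 'piece' message is normally sent only in response to a 'request'). If choke and unchoke messages are sent in quick succession, or if transfer is going very slowly, an unexpected piece may arrive. (That is, the downloader sometimes receives data that it did not want.)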
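BitTorrent Protocol Specification (BT Protocol Collection), Part 2
This translation was produced independently by 孤波.
For the original, see http://bitconjurer.org/BitTorrent/protocol.html
Author: Bram Cohen
孤波 reserves the right to interpret and revise this translation.
For non-commercial use, please credit the translator.
The BitTorrent Protocol in Detail
BitTorrent (BT for short) is a file distribution protocol. It identifies content by URL and integrates seamlessly with the web. Its advantage over plain HTTP is that downloaders of the same file keep uploading data to each other while they download, so the file source can support a large number of simultaneous downloaders with only a very limited increase in load.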
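A BT file distribution needs the following entities:
· An ordinary web server
· A static metainfo file
· A BT tracker
· An 'original' downloader
· The end users' web browsers
· The end users' downloaders
Ideally, a file has many downloaders.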
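To set up a BT server:
1. Start running a tracker (skip this step if one is already running);
2. Start running an ordinary web server such as Apache (skip this step if one is already running);
3. On the web server, associate .torrent files with the MIME type application/x-bittorrent (skip this step if already done);
4. Create a metainfo (.torrent) file from the complete file to be published and the URL of the tracker;
5. Put the metainfo file on the web server;
6. Publish a link to the metainfo (.torrent) file on a web page;
7. Have the original downloader serve the complete file.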
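To download with BT:
1. Install a BT client (skip this step if already installed);
2. Browse the web;
3. Click a link to a .torrent file;
4. Choose where to save the download locally and, in clients that support selective download, which files to download;
5. Wait for the download to complete;
6. Exit the downloader (it keeps uploading until then).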
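The connectivity is as follows:
· The web site serves static files as usual and launches the BT program on the client;
· The tracker receives information from all downloaders and gives each of them a random list of peers. This is done over HTTP or HTTPS;
· Downloaders check in with the tracker at intervals to report their progress, and upload to and download from the peers they are directly connected to. These connections follow the BitTorrent peer protocol and run over TCP.
· The original downloader only uploads and never downloads, because it has the whole file; it is needed to get every part of the file into the network. For popular downloads, the original downloader can often stop uploading after a fairly short time, and other downloaders that have finished the whole file continue to serve it.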
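The metainfo file and tracker responses are both sent in a simple, efficient, and extensible format called bencoding. Bencoded messages are nested dictionaries and lists (as in Python) that contain strings and integers. Extensibility comes from ignoring unexpected dictionary keys, so new features can be added later as additional keys.
The bencoding rules are as follows:
· Strings are encoded as the string length in base ten, then a colon, then the string itself.
For example, 4:spam corresponds to 'spam'.
· Integers are encoded as an 'i', then the number in base ten, then an 'e'. For example, i3e corresponds to 3 and i-3e corresponds to -3. Integers have no size limit. i-0e is invalid. Any encoding with a leading zero is invalid, except i0e, which corresponds to 0.
· Lists are encoded as an 'l', then their elements (each already bencoded), then an 'e'. For example, l4:spam4:eggse corresponds to ['spam', 'eggs'].
· Dictionaries are encoded as a 'd', then a list of alternating keys and their corresponding values, then an 'e'.
For example, d3:cow3:moo4:spam4:eggse corresponds to {'cow': 'moo', 'spam': 'eggs'}
and d4:spaml1:a1:bee corresponds to {'spam': ['a', 'b']}
Keys must be strings and must appear in sorted order (sorted as raw strings, not alphanumerically).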
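A metainfo file is a bencoded dictionary with the following keys:
announce
The URL of the tracker.
info
This key maps to a dictionary with the keys described below:
The name key maps to a string that is the suggested name for the downloaded file or directory. It is purely advisory.
The piece length key maps to the number of bytes in each piece the file is split into. For transfer, the file is split into pieces of equal size, except for the last one, which is usually smaller. The piece length is generally a power of two, most commonly 256 K (2 to the 18th power).
The pieces key maps to a string whose length is a multiple of 20. It is subdivided into 20-byte strings, each of which is the SHA1 hash of the piece at the corresponding index.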
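The relationship between the total length, piece length, and pieces can be sketched as follows. The helper names are hypothetical; the only facts taken from the text are that all pieces except possibly the last have the same length and that pieces holds one 20-byte SHA1 hash per piece.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: number of pieces for a download of total_length
// bytes. The last piece may be shorter, so the division rounds up.
std::uint64_t piece_count(std::uint64_t total_length, std::uint64_t piece_length) {
    if (piece_length == 0) {
        throw std::invalid_argument("piece length must be positive");
    }
    return (total_length + piece_length - 1) / piece_length;
}

// Hypothetical helper: split the pieces string into 20-byte hashes,
// so that element i is the SHA1 hash of the piece at index i.
std::vector<std::string> split_piece_hashes(const std::string& pieces) {
    if (pieces.size() % 20 != 0) {
        throw std::invalid_argument("pieces length must be a multiple of 20");
    }
    std::vector<std::string> hashes;
    for (std::size_t i = 0; i < pieces.size(); i += 20) {
        hashes.push_back(pieces.substr(i, 20));
    }
    return hashes;
}
```

A consistent metainfo file should satisfy split_piece_hashes(pieces).size() == piece_count(total_length, piece_length), which makes a cheap sanity check before starting a download.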
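There is also either a length key or a files key: exactly one of the two must be present. If length is present, the metainfo file describes a single-file download; otherwise it describes a multi-file download with a directory structure.
In the single-file case, length maps to the length of the file in bytes.
The multi-file case is treated as downloading one large file formed by concatenating the individual files in the order in which they appear in the files list. The files key maps to that list, which is a list of dictionaries, each containing the following keys:
length
The length of the file in bytes.
path
A list of strings that are subdirectory names, the last of which is the file name.
(A zero-length path list is an error.)
In the single-file case, the name key is the file name; in the multi-file case, it is the directory name.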
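Because the multi-file case is treated as one concatenated file, a byte offset within that virtual file has to be mapped back to a concrete file before data can be written to disk. The sketch below shows that mapping; the struct and function names are hypothetical, and the input is assumed to be the length values from the files list, in list order.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: which file an offset falls in, and where.
struct FileSlice {
    std::size_t file_index;        // position in the files list
    std::uint64_t offset_in_file;  // byte offset inside that file
};

// Hypothetical helper: map an offset in the concatenated "single file"
// onto one of the real files. Returns false if the offset is past the end.
bool locate_offset(const std::vector<std::uint64_t>& file_lengths,
                   std::uint64_t global_offset, FileSlice& out) {
    std::uint64_t start = 0;  // offset at which the current file begins
    for (std::size_t i = 0; i < file_lengths.size(); ++i) {
        if (global_offset < start + file_lengths[i]) {
            out.file_index = i;
            out.offset_in_file = global_offset - start;
            return true;
        }
        start += file_lengths[i];
    }
    return false;
}
```

The byte at piece index p and begin offset b sits at global offset p * piece_length + b, so the same function also tells a client which files a piece spans.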
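Tracker queries are two-way. The tracker receives information through HTTP GET parameters and returns a bencoded message. Although the current tracker implementation has its own web server, the tracker could run very well as, for example, an Apache module.
Tracker GET requests have the following keys:
info_hash
The 20-byte SHA1 hash of the bencoded form of the info value from the metainfo file. Note that this is a substring of the metainfo file. This value will almost certainly have to be escaped.
peer_id
A 20-byte string that each downloader generates at random as its ID at the start of a new download. This value will also almost certainly have to be escaped.
ip
An optional parameter giving the IP address (or DNS host name) of this peer. It is generally used for the original downloader when it runs on the same machine as the tracker.
port
The port the peer is listening on. The official client tries port 6881 first; if that port is taken it tries the next port, and so on up to port 6889.
uploaded
The total amount uploaded so far, encoded in base ten ASCII.
downloaded
The total amount downloaded so far, encoded in base ten ASCII.
left
The number of bytes still to be downloaded, encoded in base ten ASCII. This cannot be computed from the file length and the amount downloaded, because the download may be a resume, and because some downloaded data may have failed its integrity check and had to be downloaded again.
event
An optional key whose value is started, completed, or stopped (or empty, which is the same as not being present). If it is not present, the request is one of the announcements made at regular intervals. started is sent when a download first begins, and completed is sent when the download finishes. completed is not sent if the file was already complete when the download started. stopped is sent when the downloader ceases downloading.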
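The tracker response is also a bencoded dictionary. If the response has a key failure reason, it maps to a human-readable string that explains why the query failed, and no other keys are required. Otherwise it must have two keys: interval, which maps to the number of seconds the downloader should wait between regular requests, and peers, which maps to a list of dictionaries with the keys peer id, ip, and port, giving each peer's self-selected ID, its IP address or DNS host name as a string, and its port number. Note that downloaders may send requests outside the regular schedule if an event happens or if they want more peers.
If you want to extend the metainfo file or tracker queries, please coordinate with Bram Cohen to make sure that all extensions are compatible.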
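The BitTorrent peer protocol operates over TCP. It performs well without any socket options being set.
Peer connections are symmetrical. Messages sent in both directions look the same, and data can flow in either direction.
The peer protocol refers to pieces of the file by index as described in the metainfo file, starting at zero. When a peer finishes downloading a piece and verifies that its hash matches, it announces to all of its peers that it has that piece.
Each end of a connection holds two bits of state: choked or not, and interested or not. Choking is a notification that no data will be sent until unchoking happens. The reasoning and techniques behind choking are explained later.
Data transfer takes place whenever one side is interested and the other side is not choking. Interest state must be kept up to date at all times: whenever a downloader has nothing it would currently request from a peer if unchoked, it must express lack of interest, even while it is being choked. Implementing this properly takes care, but it lets downloaders know which peers will start downloading immediately once they are unchoked.
Connections start out choked and not interested.
When data is being transferred, downloaders should keep several piece requests queued up at once to get good TCP performance (this is called 'pipelining'). On the other side, requests that cannot be written to the TCP buffer immediately should be queued in memory rather than kept in an application-level network buffer, so that they can all be thrown out when a choke happens.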
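The peer wire protocol consists of a handshake followed by a never-ending stream of length-prefixed messages. The handshake starts with the character nineteen (decimal) followed by the string 'BitTorrent protocol'. The leading character is a length prefix, included in the hope that other new protocols will do the same and so be easy to tell apart.
All later integers sent in the protocol are encoded as four bytes, big-endian.
After the fixed header come eight reserved bytes, which are all zero in current implementations. If you want to extend the protocol using these bytes, please coordinate with Bram Cohen to make sure that all extensions are compatible.
Next comes the 20-byte SHA1 hash of the bencoded form of the info value from the metainfo file. (This is the same value that is announced as info_hash to the tracker, except that here it is raw rather than escaped.) If the two sides do not send the same value, they sever the connection. The one possible exception is a downloader that wants to serve multiple downloads over a single port: it may wait for the incoming connection to send a download hash first, and respond with the same one if it is in its list.
After the download hash comes the 20-byte peer id, which is reported in tracker requests and contained in peer lists in tracker responses. If the receiving side's peer id does not match the one the initiating side expects, the initiating side severs the connection.
That completes the handshake. Next comes a stream of length-prefixed messages. Messages of length zero are keepalives and are ignored. Keepalives are generally sent once every two minutes, but note that timeouts can be much shorter when data is expected.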
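All non-keepalive messages start with a single byte that gives their type. The possible values are:
· 0 - choke
· 1 - unchoke
· 2 - interested
· 3 - not interested
· 4 - have
· 5 - bitfield
· 6 - request
· 7 - piece
· 8 - cancel
'choke', 'unchoke', 'interested', and 'not interested' messages have no payload.
'bitfield' is only ever sent as the first message. Its payload is a bitfield in which each index the downloader has is set to one and the rest are set to zero. Downloaders that have nothing yet may skip the 'bitfield' message. The first byte of the bitfield corresponds to indices 0 - 7 from high bit to low bit, the next one to 8 - 15, and so on. Spare bits at the end are set to zero.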
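The bit ordering of the bitfield payload is easy to get backwards, so here is a minimal sketch of packing and querying it. The function names are hypothetical; the layout (index 0 in the high bit of the first byte, spare bits zero) follows the description above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: pack a have/have-not list into a bitfield payload.
// Piece i lives in byte i / 8, counted from the high bit downward.
std::vector<std::uint8_t> make_bitfield(const std::vector<bool>& have) {
    std::vector<std::uint8_t> bits((have.size() + 7) / 8, 0);  // spare bits stay 0
    for (std::size_t i = 0; i < have.size(); ++i) {
        if (have[i]) {
            bits[i / 8] |= static_cast<std::uint8_t>(0x80u >> (i % 8));
        }
    }
    return bits;
}

// Hypothetical helper: test whether a received bitfield marks a piece.
// Indices beyond the payload are treated as "not present".
bool has_piece(const std::vector<std::uint8_t>& bits, std::size_t index) {
    return index / 8 < bits.size() &&
           (bits[index / 8] & (0x80u >> (index % 8))) != 0;
}
```

For ten pieces of which the downloader has indices 0, 7, and 9, the payload is the two bytes 0x81 0x40.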
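'have' messages carry a single number: the index of the piece that the downloader has just completed and checked the hash of.
'request' messages contain an index, a begin, and a length. The last two are byte offsets. The length is generally a power of two unless it is truncated by the end of the file. Current implementations all use 2 to the 15th power, and close connections that request a length greater than 2 to the 17th power.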
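Putting the framing rules together, a 'request' message on the wire is a four-byte big-endian length prefix, the type byte 6, and three four-byte big-endian integers. The sketch below builds one; the function names are hypothetical, and the length prefix is assumed to count the type byte plus the payload, giving 1 + 12 = 13.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: append a 32-bit value as four bytes, big-endian.
static void append_be32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    out.push_back(static_cast<std::uint8_t>(v >> 24));
    out.push_back(static_cast<std::uint8_t>(v >> 16));
    out.push_back(static_cast<std::uint8_t>(v >> 8));
    out.push_back(static_cast<std::uint8_t>(v));
}

// Hypothetical helper: build a complete 'request' message (17 bytes).
std::vector<std::uint8_t> build_request(std::uint32_t index,
                                        std::uint32_t begin,
                                        std::uint32_t length) {
    std::vector<std::uint8_t> msg;
    append_be32(msg, 13);   // length prefix: type byte + three integers
    msg.push_back(6);       // message type 6 = request
    append_be32(msg, index);
    append_be32(msg, begin);
    append_be32(msg, length);
    return msg;
}
```

A 'cancel' message has the same payload, so under the same assumptions it differs only in the type byte (8 instead of 6).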
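'cancel' messages have the same payload as 'request' messages. They are generally sent only toward the end of a download, during what is called 'endgame mode'. When a download is almost complete, the last few pieces tend to be downloaded all from a single slow connection, which takes a very long time. To make sure the last few pieces arrive quickly, once requests for all the pieces a downloader does not yet have are pending, it sends requests for everything to everyone it is downloading from. To keep this from becoming horribly inefficient, it sends a cancel to everyone else every time a piece arrives.
'piece' messages contain an index, a begin, and a piece. Note that they are correlated with 'request' messages implicitly. An unexpected piece may arrive if choke and unchoke messages are sent in quick succession, if transfer is going very slowly, or both.
Downloaders generally download pieces in random order, which does a reasonably good job of keeping them from having a strict subset or superset of the pieces of any of their peers.
Choking is done for several reasons. TCP congestion control behaves very poorly when sending over many connections at once. In addition, choking lets each peer use a tit-for-tat-like algorithm to make sure it gets a consistent download rate.
The choking algorithm described below is the currently deployed one. It is important that any new algorithm work well both in a network made up entirely of peers that use it and in a network made up mostly of peers that use this one.
A good choking algorithm should meet several criteria. It should cap the number of simultaneous uploads for good TCP performance. It should avoid choking and unchoking quickly, known as 'fibrillation'. It should reciprocate to peers that let it download. Finally, it should occasionally try out unused connections to find out whether they might be better than the ones currently in use, known as optimistic unchoking.
The currently deployed choking algorithm avoids fibrillation by changing who is choked only once every ten seconds. It uploads to and reciprocates with the four peers that are interested and from which it has the best download rates. Peers that have a better upload rate but are not interested are unchoked, and if they become interested, the worst uploader gets choked. If a downloader has the complete file, it uses its upload rate rather than its download rate to decide whom to unchoke.
For optimistic unchoking, at any one time there is a single peer that is unchoked regardless of its upload rate (if it is interested, it counts as one of the four allowed downloaders). Which peer is optimistically unchoked rotates every 30 seconds. To give them a decent chance of getting a complete piece to upload, new connections are three times as likely to start as the current optimistic unchoke as anywhere else in the rotation.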
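The core of the ten-second choking round can be sketched as "pick the four interested peers with the best download rates". The code below is a deliberately simplified illustration: the struct and function names are hypothetical, and it leaves out the optimistic unchoke, the handling of fast peers that are not interested, and the switch to upload rate once the file is complete.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-peer statistics kept by a client.
struct PeerStats {
    std::string id;
    double download_rate;  // rate at which we receive data from this peer
    bool interested;       // whether the peer wants data from us
};

// Hypothetical helper, meant to run once per ten-second round:
// returns the ids of the peers to unchoke, best download rate first.
std::vector<std::string> choose_unchoked(std::vector<PeerStats> peers,
                                         std::size_t slots = 4) {
    // Only interested peers compete for the upload slots in this sketch.
    peers.erase(std::remove_if(peers.begin(), peers.end(),
                               [](const PeerStats& p) { return !p.interested; }),
                peers.end());
    // Reciprocate: prefer the peers that give us the most data.
    std::stable_sort(peers.begin(), peers.end(),
                     [](const PeerStats& a, const PeerStats& b) {
                         return a.download_rate > b.download_rate;
                     });
    if (peers.size() > slots) peers.resize(slots);
    std::vector<std::string> ids;
    for (const PeerStats& p : peers) ids.push_back(p.id);
    return ids;
}
```

Running the selection on a timer rather than on every rate change is what keeps the choke state from fibrillating.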
BitTorrent Protocol Specification (BT Protocol Collection), Part 3
Bittorrent udp-tracker protocol extension
Contents
introduction
connecting
    Client sends packet:
    Server replies with packet:
announcing
    Client sends packet:
    Server replies with packet:
scraping
    Client sends packet:
    Server replies with packet:
errors
    server replies packet:
actions
extensions
    authentication
credits
introduction
A tracker with the protocol "udp://" in its URI is supposed to be contacted using this protocol.
This protocol is supported by xbt-tracker.
For additional information and descriptions of the terminology used in this document, see the protocol specification.
All values are sent in network byte order (big endian). The sizes are specified with ANSI-C standard types.
If no response to a request is received within 15 seconds, resend the request. If no reply has been received after 60 seconds, stop retrying.
connecting
Client sends packet:
size      name            description
int64_t   connection_id   Must be initialized to 0x41727101980 in network byte order. This will identify the protocol.
int32_t   action          0 for a connection request.
int32_t   transaction_id  Randomized by client.
Server replies with packet:
size      name            description
int32_t   action          Describes the type of packet; in this case it should be 0, for connect. If 3 (for error), see errors.
int32_t   transaction_id  Must match the transaction_id sent from the client.
int64_t   connection_id   A connection id. It identifies you when further information is exchanged with the tracker. It can be reused for multiple requests, but if it is cached for too long it will no longer be valid.
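The connect request is a fixed 16-byte packet, so it makes a compact example of writing fields in network byte order without depending on the host's endianness. This is a minimal sketch with hypothetical function names; only the field layout and the magic value come from the table above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: write the low `width` bytes of value, big-endian.
void put_be(std::uint8_t* dst, std::uint64_t value, std::size_t width) {
    for (std::size_t i = 0; i < width; ++i) {
        dst[i] = static_cast<std::uint8_t>(value >> (8 * (width - 1 - i)));
    }
}

// Hypothetical helper: build the 16-byte connect request.
std::array<std::uint8_t, 16> build_connect_request(std::uint32_t transaction_id) {
    std::array<std::uint8_t, 16> pkt{};
    put_be(pkt.data(), 0x41727101980ULL, 8);     // int64_t connection_id (protocol magic)
    put_be(pkt.data() + 8, 0, 4);                // int32_t action = 0 (connect)
    put_be(pkt.data() + 12, transaction_id, 4);  // int32_t transaction_id
    return pkt;
}
```

Shifting the value instead of casting pointers produces the same bytes on little-endian and big-endian hosts and avoids unaligned writes.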
announcing
Client sends packet:
size        name            description
int64_t     connection_id   The connection id acquired from establishing the connection.
int32_t     action          The action, in this case 1 for announce. See actions.
int32_t     transaction_id  Randomized by client.
int8_t[20]  info_hash       The info-hash of the torrent you want to announce yourself in.
int8_t[20]  peer_id         Your peer id.
int64_t     downloaded      The number of bytes you have downloaded in this session.
int64_t     left            The number of bytes you have left to download until you are finished.
int64_t     uploaded        The number of bytes you have uploaded in this session.
int32_t     event           The event, one of:
                            none = 0
                            completed = 1
                            started = 2
                            stopped = 3
uint32_t    ip              Your IP address. Set to 0 if you want the tracker to use the sender of this UDP packet.
uint32_t    key             A unique key that is randomized by the client.
int32_t     num_want        The maximum number of peers you want in the reply. Use -1 for default.
uint16_t    port            The port you are listening on.
uint16_t    extensions      See extensions.
Server replies with packet:
size      name            description
int32_t   action          The action this is a reply to. In this case it should be 1, for announce. If 3 (for error), see errors. See actions.
int32_t   transaction_id  Must match the transaction_id sent in the announce request.
int32_t   interval        The number of seconds you should wait before reannouncing yourself.
int32_t   leechers        The number of peers in the swarm that have not finished downloading.
int32_t   seeders         The number of peers in the swarm that have finished downloading and are seeding.
The rest of the server reply is a variable number of the following structure:
size      name            description
int32_t   ip              The IP of a peer in the swarm.
uint16_t  port            The peer's listen port.
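The announce reply is a 20-byte header followed by as many 6-byte peer entries as fit in the datagram. The following sketch parses it; the struct and function names are hypothetical, and all fields are read as unsigned big-endian values for simplicity.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for this sketch.
struct Peer {
    std::uint32_t ip;    // IPv4 address as a host-order integer
    std::uint16_t port;  // listen port in host order
};

struct AnnounceReply {
    std::uint32_t action, transaction_id, interval, leechers, seeders;
    std::vector<Peer> peers;
};

static std::uint32_t get_be32(const std::uint8_t* p) {
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}

static std::uint16_t get_be16(const std::uint8_t* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Hypothetical helper: returns false if the packet is too short or is
// not an announce reply (action != 1). The caller is still responsible
// for comparing transaction_id with the value it sent.
bool parse_announce_reply(const std::vector<std::uint8_t>& data, AnnounceReply& out) {
    if (data.size() < 20) return false;
    out.action = get_be32(&data[0]);
    out.transaction_id = get_be32(&data[4]);
    if (out.action != 1) return false;
    out.interval = get_be32(&data[8]);
    out.leechers = get_be32(&data[12]);
    out.seeders = get_be32(&data[16]);
    out.peers.clear();
    // Each peer entry is 4 bytes of IP followed by 2 bytes of port.
    for (std::size_t pos = 20; pos + 6 <= data.size(); pos += 6) {
        Peer peer;
        peer.ip = get_be32(&data[pos]);
        peer.port = get_be16(&data[pos + 4]);
        out.peers.push_back(peer);
    }
    return true;
}
```

The number of peers is implied by the datagram length, so the parser never needs a separate count field.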
scraping
Client sends packet:
size        name             description
int64_t     connection_id    The connection id retrieved from establishing the connection.
int32_t     action           The action, in this case 2 for scrape. See actions.
int32_t     transaction_id   Randomized by client.
int16_t     num_info_hashes  The number of info-hashes that will follow.
uint16_t    extensions       See extensions.
The following structure is repeated num_info_hashes times:
size        name             description
int8_t[20]  info_hash        The info hash that is to be scraped.
Server replies with packet:
size      name            description
int32_t   action          The action; in this case it should be 2, for scrape. If 3 (for error), see errors.
int32_t   transaction_id  Must match the sent transaction id.
The rest of the packet contains the following structure once for each info-hash you asked about in the scrape request.
size      name            description
int32_t   complete        The current number of connected seeds.
int32_t   downloaded      The number of times this torrent has been downloaded.
int32_t   incomplete      The current number of connected leechers.
errors
In case of a tracker error, the server replies with packet:
size      name            description
int32_t   action          The action, in this case 3, for error. See actions.
int32_t   transaction_id  Must match the transaction_id sent from the client.
int8_t[]  error_string    The rest of the packet is a string describing the error.
actions
The action field has the following encoding:
connect = 0
announce = 1
scrape = 2
error = 3 (only in server replies)
extensions
The extensions field is a bitmask. The following bits are assigned:
1 = authentication.
authentication
The packet will have an authentication part appended to it. It has the following format:
size        name             description
int8_t      username_length  The number of characters in the username.
int8_t[]    username         The username, with the number of characters specified in the previous field.
uint8_t[8]  passwd_hash      sha1(packet + sha1(password)). The packet in this case means the entire packet except these 8 bytes that are the password hash. These are the first 8 bytes (most significant) of the 20-byte hash calculated.
credits
Protocol designed by Olaf van der Spek
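BitTorrent Protocol Specification (BT Protocol Collection), Part 4
A BT client normally sends a request to the tracker every few minutes. For large BT sites, the resulting load on the tracker is easy to imagine. The obvious first step toward reducing that load is to switch to UDP, which has lower network overhead, and so the BitTorrent udp-tracker protocol came about.
The protocol is simple.
Below is a wrapper class that implements it: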
// UDPTrackerClient.h: interface for the CUDPTrackerClient class.
//
//////////////////////////////////////////////////////////////////////
#if !defined(AFX_UDPTRACKERCLIENT_H__69B6ACC8_8193_4680_81D8_925B1550E92C__INCLUDED_)
#define AFX_UDPTRACKERCLIENT_H__69B6ACC8_8193_4680_81D8_925B1550E92C__INCLUDED_
#if _MSC_VER > 1000
#pragma once
#endif // _MSC_VER > 1000
#include <WINSOCK2.H>
#pragma comment(lib, "ws2_32.lib")
#ifndef _DISABLEWARNING4786_4355
#define _DISABLEWARNING4786_4355
#pragma warning( disable : 4786 )
#pragma warning( disable : 4355 )
#endif
#ifndef _ENABLEUSESTL
#define _ENABLEUSESTL
#include <list>
#include <set>
#include <vector>
#include <queue>
#include <string>
#include <map>
using namespace std;
#endif
class CPeerHostInfo
{
public:
DWORD IP;   // peer IP address
WORD Port;  // peer port
};
class CUDPTrackerClient
{
public:
CUDPTrackerClient();
virtual ~CUDPTrackerClient();
void CancelSocketOperate();
BOOL Connect(const char * szServer,WORD wPort = 0);
DWORD Announcing(BYTE* pInfoHash,BYTE * pPeerID,
__int64 idownloaded,__int64 ileft,__int64 iuploaded,
int ievent,
DWORD dwIP,WORD wPort);
BOOL Disconnect();
public:
SOCKET m_socket;
DWORD m_dwIP;
WORD m_wPort;
__int64 m_iConnection_id;
DWORD m_dwConnectTick;
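string m_strError; // holds the error message if a request fails
DWORD m_dwDonePeers; // number of seeds
DWORD m_dwNumPeers; // number of current downloaders
DWORD m_dwInterval; // announce interval
list<CPeerHostInfo> m_listPeers; // peers returned by the last announce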
};
#endif // !defined(AFX_UDPTRACKERCLIENT_H__69B6ACC8_8193_4680_81D8_925B1550E92C__INCLUDED_)
// UDPTrackerClient.cpp: implementation of the CUDPTrackerClient class.
//
//////////////////////////////////////////////////////////////////////
#include "stdafx.h"
#include "UDPTrackerClient.h"
#include "DataStream.h"
#ifdef _DEBUG
#undef THIS_FILE
static char THIS_FILE[]=__FILE__;
#define new DEBUG_NEW
#endif
//////////////////////////////////////////////////////////////////////
// Construction/Destruction
//////////////////////////////////////////////////////////////////////
#define RECVBUFSIZE 2048
CUDPTrackerClient::CUDPTrackerClient()
{
m_socket = INVALID_SOCKET;
m_iConnection_id = 0;
m_dwConnectTick = 0;
m_dwIP = 0;
m_wPort = 0;
m_dwDonePeers = 0; // number of seeds
m_dwNumPeers = 0; // number of current downloaders
m_dwInterval = 0; // announce interval
}
CUDPTrackerClient::~CUDPTrackerClient()
{
Disconnect();
}
void CUDPTrackerClient::CancelSocketOperate()
{
if(m_socket != INVALID_SOCKET)
{
LINGER lingerStruct;
// If we're supposed to abort the connection, set the linger value
// on the socket to 0.
lingerStruct.l_onoff = 1;
lingerStruct.l_linger = 0;
setsockopt(m_socket, SOL_SOCKET, SO_LINGER,
(char *)&lingerStruct, sizeof(lingerStruct) );
}
}
BOOL CUDPTrackerClient::Disconnect()
{
m_iConnection_id = 0;
m_dwDonePeers = 0; // number of seeds
m_dwNumPeers = 0; // number of current downloaders
m_dwInterval = 0; // announce interval
if ( m_socket != INVALID_SOCKET )
{
m_dwIP = 0;
m_wPort = 0;
// Now close the socket handle. This will do an abortive or
// graceful close, as requested.
shutdown(m_socket,SD_BOTH);
closesocket(m_socket);
m_socket = INVALID_SOCKET;
return TRUE;
}
return FALSE;
}
// szServer is the host to connect to, in one of the following forms:
// easeso.com:1000
// easeso.com
// If wPort is non-zero, szServer should not contain a port.
BOOL CUDPTrackerClient::Connect(const char * szServer,WORD wPort)
{
m_strError = "";
BOOL bRes = FALSE;
if ( m_socket == INVALID_SOCKET )
{
// create a UDP socket
m_socket =socket(AF_INET,SOCK_DGRAM,0);
if(m_socket == INVALID_SOCKET)
return FALSE;
// set the receive timeout, in milliseconds
int TimeOut=10000;
setsockopt (m_socket, SOL_SOCKET,SO_RCVTIMEO,(CHAR *) &TimeOut,sizeof (TimeOut));
}
if(m_dwIP == 0)
{
CString strServer = szServer;
CString strHost;
if(wPort == 0)
{
int iNext = strServer.Find(':');
if(iNext>0)
{
strHost = strServer.Mid(0,iNext);
CString strPort = strServer.Mid(iNext+1);
m_wPort = (WORD)atoi(strPort);
}
else
strHost = strServer;
}
else
{
strHost = strServer;
m_wPort = wPort;
}
if(m_wPort == 0)
m_wPort = 80;
//Check if address is an IP or a Domain Name
int a = strHost[0];
if (a > 47 && a < 58)
m_dwIP = inet_addr(strHost);
else
{
struct hostent *pHost;
pHost = gethostbyname(strHost);
if(pHost != NULL)
m_dwIP = *((ULONG*)pHost->h_addr);
else
m_dwIP = 0;
}
}
if((GetTickCount()-m_dwConnectTick)>30000)
{
m_dwConnectTick = 0;
m_iConnection_id = 0;
}
if(m_socket != INVALID_SOCKET && m_dwIP && m_wPort && m_iConnection_id ==0)
{
DWORD dwTransaction_id = GetTickCount();
SOCKADDR_IN from;
int fromlength=sizeof(SOCKADDR);
char buf[RECVBUFSIZE];
from.sin_family=AF_INET;
from.sin_addr.s_addr=m_dwIP;
from.sin_port=htons(m_wPort);
CDataStream sendstream(buf,2047);
sendstream.clear();
__int64 iCID = 0x41727101980;
sendstream.writeint64(CNetworkByteOrder::convert(iCID));
sendstream.writedword(CNetworkByteOrder::convert((int)0));
sendstream.writedword(dwTransaction_id);
int iRes = 0;
int iTimes = 6;
while(iTimes>0&&m_dwIP)
{
sendto(m_socket,sendstream.getbuffer(),sendstream.size(),0,(struct sockaddr FAR *)&from,sizeof(from));
iRes = recvfrom(m_socket,buf,RECVBUFSIZE-1,0,(struct sockaddr FAR *)&from,(int FAR *)&fromlength);
if(iRes >=0)
break;
iTimes--;
}
if(iRes>=16)
{
CDataStream recvstream(buf,RECVBUFSIZE-1);
DWORD dwAction = (DWORD)CNetworkByteOrder::convert((int)recvstream.readdword());
DWORD dwTIDResp= recvstream.readdword();
if(dwTIDResp == dwTransaction_id)
{
if(dwAction == 0)
{
m_iConnection_id = recvstream.readint64();
// BitComet replies with 0x16 bytes; the last 6 bytes are the local IP and UDP port as seen by the server
}
else if(dwAction == 3) // received an error packet
{
buf[iRes]=0;
m_strError = recvstream.readstring();
}
}
}
}
if(m_iConnection_id)
bRes = TRUE;
return bRes;
}
// Send an announce request.
// pInfoHash: pointer to a 20-byte buffer
// pPeerID: pointer to a 20-byte buffer
// ievent values:
// none = 0
// completed = 1
// started = 2
// stopped = 3
DWORD CUDPTrackerClient::Announcing(BYTE* pInfoHash,BYTE * pPeerID,
__int64 idownloaded,__int64 ileft,__int64 iuploaded,
int ievent,
DWORD dwIP,WORD wPort)
{
m_listPeers.clear();
m_dwNumPeers = 0;
m_dwDonePeers = 0;
m_strError = "";
DWORD dwReturnCode = 0;
if(m_iConnection_id && m_socket != INVALID_SOCKET && m_dwIP && m_wPort) // logical AND; a bitwise & of IP and port could be 0 for valid values
{
DWORD dwTransaction_id = GetTickCount();
//srand(dwTransaction_id);
//DWORD dwKey = rand();
DWORD dwKey = 0x3753;
SOCKADDR_IN from;
int fromlength=sizeof(SOCKADDR);
char buf[RECVBUFSIZE];
from.sin_family=AF_INET;
from.sin_addr.s_addr=m_dwIP;
from.sin_port=htons(m_wPort);
CDataStream sendstream(buf,RECVBUFSIZE-1);
sendstream.clear();
sendstream.writeint64(m_iConnection_id);
sendstream.writedword(CNetworkByteOrder::convert((int)1));
sendstream.writedword(dwTransaction_id);
sendstream.writedata(pInfoHash,20);
sendstream.writedata(pPeerID,20);
sendstream.writeint64(CNetworkByteOrder::convert(idownloaded));
sendstream.writeint64(CNetworkByteOrder::convert(ileft));
sendstream.writeint64(CNetworkByteOrder::convert(iuploaded));
sendstream.writedword(CNetworkByteOrder::convert(ievent));
sendstream.writedword(dwIP);
sendstream.writedword(CNetworkByteOrder::convert((int)dwKey));
sendstream.writedword(CNetworkByteOrder::convert((int)200));
sendstream.writeword(CNetworkByteOrder::convert(wPort)); // uint16_t port
sendstream.writeword(0); // uint16_t extensions (none)
int iRes = 0;
int iTimes = 2;
while(iTimes>0&&m_dwIP)
{
sendto(m_socket,sendstream.getbuffer(),sendstream.size(),0,(struct sockaddr FAR *)&from,sizeof(from));
iRes = recvfrom(m_socket,buf,RECVBUFSIZE-1,0,(struct sockaddr FAR *)&from,(int FAR *)&fromlength);
if(iRes >=0)
break;
iTimes--;
}
if(iRes>=20)
{
CDataStream recvstream(buf,RECVBUFSIZE-1);
DWORD dwAction = (DWORD)CNetworkByteOrder::convert((int)recvstream.readdword());
DWORD dwTIDResp= recvstream.readdword();
if(dwTIDResp == dwTransaction_id)
{
if(dwAction == 1)
{
m_dwInterval = (DWORD)CNetworkByteOrder::convert((int)recvstream.readdword());
m_dwNumPeers = (DWORD)CNetworkByteOrder::convert((int)recvstream.readdword());
m_dwDonePeers = (DWORD)CNetworkByteOrder::convert((int)recvstream.readdword());
CPeerHostInfo hi;
for(int iCurPos = 20;iCurPos+6<=iRes;iCurPos+=6)
{
hi.IP= recvstream.readdword();
hi.Port = (WORD)CNetworkByteOrder::convert((unsigned short)recvstream.readword());
m_listPeers.push_back(hi);
}
if(m_dwNumPeers>m_listPeers.size())
{
iRes = 0;
iTimes = 6;
while(iTimes>0&&m_dwIP)
{
iRes = recvfrom(m_socket,buf,RECVBUFSIZE-1,0,(struct sockaddr FAR *)&from,(int FAR *)&fromlength);
if(iRes >=0)
break;
iTimes--;
}
if(iRes>=6)
{
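recvstream.reset(); // the second datagram was read into the start of buf
for(int iCurPos = 0;iCurPos+6<=iRes;iCurPos+=6)
{
hi.IP= recvstream.readdword();
hi.Port = (WORD)CNetworkByteOrder::convert((unsigned short)recvstream.readword()); // swap 2 bytes, not 4
m_listPeers.push_back(hi);
}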
}
}
}
m_dwNumPeers = m_listPeers.size();
dwReturnCode = 200;
}
else if(dwAction == 3) // received an error packet
{
buf[iRes]=0;
m_strError = recvstream.readstring();
dwReturnCode = 400;
}
}
}
}
// require a fresh connect before every announce
m_iConnection_id = 0;
return dwReturnCode;
}
// DataStream.h: interface for the CDataStream class.
//
//////////////////////////////////////////////////////////////////////
#if !defined(AFX_DATASTREAM_H__D90A2534_EA73_4BEA_8B7E_87E59A3D1D26__INCLUDED_)
#define AFX_DATASTREAM_H__D90A2534_EA73_4BEA_8B7E_87E59A3D1D26__INCLUDED_
#if _MSC_VER > 1000
#pragma once
#endif // _MSC_VER > 1000
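#include <stdarg.h> // va_list, va_start, va_end
#include <stdio.h>  // vsprintf
#include <string.h> // memcpy
// Helper for reading from and writing to a byte buffer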
class CDataStream
{
public :
CDataStream(char * szBuf,int isize)
{
m_isize = isize;
buffer = szBuf;
current = buffer;
}
~CDataStream()
{
}
void clear()
{
current = buffer;
current[0]=0;
}
// This function does not grow the buffer; a single call should print no more than one third of the buffer size, otherwise it may fail
bool printf(const char * format,...)
{
if(current)
{
if(current - buffer > (m_isize*2)/3)
return false;
va_list argPtr ;
va_start( argPtr, format ) ;
int count = vsprintf( current, format, argPtr ) ;
va_end( argPtr );
current += count ;
return true;
}
return false;
}
// Copy a string into the stream
bool strcpy(const char * szStr)
{
if(current&&szStr)
{
int ilen = lstrlen(szStr);
if((m_isize-(current - buffer)) < (ilen +2))
return false;
memcpy(current,szStr,ilen+1);
current += ilen;
return true;
}
return false;
}
char * getcurrentpos()
{
return current;
}
void move(int ilen) // advance the current pointer by ilen bytes
{
current += ilen;
}
void reset()
{
current = buffer;
}
BYTE readbyte()
{
current ++;
return *(current-1);
}
void writebyte(BYTE btValue)
{
*current = btValue;
current ++;
}
WORD readword()
{
current +=2;
return *((WORD*)(current-2));
}
void writeword(WORD wValue)
{
*((WORD*)current) = wValue;
current +=2;
}
DWORD readdword()
{
current +=4;
return *((DWORD*)(current-4));
}
void writedword(DWORD dwValue)
{
*((DWORD*)current) = dwValue;
current +=4;
}
__int64 readint64()
{
current +=8;
return *((__int64*)(current-8));
}
void writeint64(__int64 iValue)
{
*((__int64*)current) = iValue;
current +=8;
}
BYTE * readdata(DWORD dwLen)
{
current +=dwLen;
return (BYTE*)(current-dwLen);
}
void writedata(BYTE * pData,DWORD dwLen)
{
memcpy(current,pData,dwLen);
current +=dwLen;
}
char * readstring()
{
char * szRes = current;
int ilen = lstrlen(current);
current +=(ilen+1);
return szRes;
}
int size()
{
return (int)(current-buffer);
}
const char * getbuffer(){return buffer;}
private :
char* buffer;
char* current;
int m_isize;
};
class CNetworkByteOrder
{
public:
static unsigned short int convert(unsigned short int iValue)
{
unsigned short int iData;
((BYTE*)&iData)[0] = ((BYTE*)&iValue)[1];
((BYTE*)&iData)[1] = ((BYTE*)&iValue)[0];
return iData;
}
static int convert(int iValue)
{
int iData;
((BYTE*)&iData)[0] = ((BYTE*)&iValue)[3];
((BYTE*)&iData)[1] = ((BYTE*)&iValue)[2];
((BYTE*)&iData)[2] = ((BYTE*)&iValue)[1];
((BYTE*)&iData)[3] = ((BYTE*)&iValue)[0];
return iData;
}
static __int64 convert(__int64 iValue)
{
__int64 iData;
((BYTE*)&iData)[0] = ((BYTE*)&iValue)[7];
((BYTE*)&iData)[1] = ((BYTE*)&iValue)[6];
((BYTE*)&iData)[2] = ((BYTE*)&iValue)[5];
((BYTE*)&iData)[3] = ((BYTE*)&iValue)[4];
((BYTE*)&iData)[4] = ((BYTE*)&iValue)[3];
((BYTE*)&iData)[5] = ((BYTE*)&iValue)[2];
((BYTE*)&iData)[6] = ((BYTE*)&iValue)[1];
((BYTE*)&iData)[7] = ((BYTE*)&iValue)[0];
return iData;
}
};
#endif // !defined(AFX_DATASTREAM_H__D90A2534_EA73_4BEA_8B7E_87E59A3D1D26__INCLUDED_)
BitTorrent Protocol Specification (BT Protocol Collection), Part 5
BitTorrent is a protocol for distributing files. It identifies content by URL and is designed to integrate seamlessly with the web. Its advantage over plain HTTP is that when multiple downloads of the same file happen concurrently, the downloaders upload to each other, making it possible for the file source to support very large numbers of downloaders with only a modest increase in its load.
A BitTorrent file distribution consists of these entities:
An ordinary web server
A static 'metainfo' file
A BitTorrent tracker
An 'original' downloader
The end user web browsers
The end user downloaders
There are ideally many end users for a single file.
To start serving, a host goes through the following steps:
Start running a tracker (or, more likely, have one running already).
Start running an ordinary web server, such as apache, or have one already.
Associate the extension .torrent with mimetype application/x-bittorrent on their web server (or have done so already).
Generate a metainfo (.torrent) file using the complete file to be served and the URL of the tracker.
Put the metainfo file on the web server.
Link to the metainfo (.torrent) file from some other web page.
Start a downloader which already has the complete file (the 'origin').
To start downloading, a user does the following:
Install BitTorrent (or have done so already).
Surf the web.
Click on a link to a .torrent file.
Select where to save the file locally, or select a partial download to resume.
Wait for download to complete.
Tell downloader to exit (it keeps uploading until this happens).
The connectivity is as follows:
The web site is serving up static files as normal, but kicking off the BitTorrent helper app on the clients.
The tracker is receiving information from all downloaders and giving them random lists of peers. This is done over HTTP or HTTPS.
Downloaders are periodically checking in with the tracker to keep it informed of their progress, and are uploading to and downloading from each other via direct connections. These connections use the BitTorrent peer protocol, which operates over TCP.
The origin is uploading but not downloading at all, since it has the entire file. The origin is necessary to get the entire file into the network. Often for popular downloads the origin can be taken down after a while since several downloads may have completed and been left running indefinitely.
Metainfo file and tracker responses are both sent in a simple, efficient, and extensible format called bencoding (pronounced 'bee encoding'). Bencoded messages are nested dictionaries and lists (as in Python), which can contain strings and integers. Extensibility is supported by ignoring unexpected dictionary keys, so additional optional ones can be added later.
Bencoding is done as follows:
Strings are length-prefixed base ten followed by a colon and the string. For example 4:spam corresponds to 'spam'.
Integers are represented by an 'i' followed by the number in base 10 followed by an 'e'. For example i3e corresponds to 3 and i-3e corresponds to -3. Integers have no size limitation. i-0e is invalid. All encodings with a leading zero, such as i03e, are invalid, other than i0e, which of course corresponds to 0.
Lists are encoded as an 'l' followed by their elements (also bencoded) followed by an 'e'. For example l4:spam4:eggse corresponds to ['spam', 'eggs'].
Dictionaries are encoded as a 'd' followed by a list of alternating keys and their corresponding values followed by an 'e'. For example, d3:cow3:moo4:spam4:eggse corresponds to {'cow': 'moo', 'spam': 'eggs'} and d4:spaml1:a1:bee corresponds to {'spam': ['a', 'b']} . Keys must be strings and appear in sorted order (sorted as raw strings, not alphanumerics).
Metainfo files are bencoded dictionaries with the following keys:
announce
The URL of the tracker.
info
This maps to a dictionary, with keys described below.
The name key maps to a string which is the suggested name to save the file (or directory) as. It is purely advisory.
piece length maps to the number of bytes in each piece the file is split into. For the purposes of transfer, files are split into fixed-size pieces which are all the same length except for possibly the last one which may be truncated. Piece length is almost always a power of two, most commonly 2^18 = 256 K (BitTorrent prior to version 3.2 uses 2^20 = 1 M as default).
pieces maps to a string whose length is a multiple of 20. It is to be subdivided into strings of length 20, each of which is the SHA1 hash of the piece at the corresponding index.
There is also a key length or a key files, but not both or neither. If length is present then the download represents a single file, otherwise it represents a set of files which go in a directory structure.
In the single file case, length maps to the length of the file in bytes.
For the purposes of the other keys, the multi-file case is treated as only having a single file by concatenating the files in the order they appear in the files list. The files list is the value files maps to, and is a list of dictionaries containing the following keys:
length
The length of the file, in bytes.
path
A list of strings corresponding to subdirectory names, the last of which is the actual file name (a zero length list is an error case).
In the single file case, the name key is the name of a file; in the multiple file case, it's the name of a directory.
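As an illustration of how the info keys fit together, the sketch below splits pieces into 20-byte SHA1 digests, checks a downloaded piece against its digest, and maps the files list onto the single concatenated byte stream. It assumes the info dictionary has already been decoded into Python objects with text keys; the function names are hypothetical.

```python
import hashlib

HASH_LEN = 20  # length of one SHA1 digest in bytes


def split_piece_hashes(pieces):
    """Split the 'pieces' string into one 20-byte digest per piece."""
    if len(pieces) % HASH_LEN != 0:
        raise ValueError("pieces length must be a multiple of 20")
    return [pieces[i:i + HASH_LEN] for i in range(0, len(pieces), HASH_LEN)]


def piece_is_valid(index, data, pieces):
    """Check downloaded piece data against the digest at the same index."""
    return hashlib.sha1(data).digest() == split_piece_hashes(pieces)[index]


def total_length(info):
    """Total payload size; exactly one of 'length' and 'files' must exist."""
    if ("length" in info) == ("files" in info):
        raise ValueError("need exactly one of 'length' and 'files'")
    if "length" in info:
        return info["length"]
    return sum(entry["length"] for entry in info["files"])


def file_spans(info):
    """Return (path, start, end) byte ranges of each file in the
    concatenated stream, in the order the files list gives them."""
    spans, offset = [], 0
    for entry in info["files"]:
        if not entry["path"]:
            raise ValueError("a zero length path list is an error")
        spans.append((entry["path"], offset, offset + entry["length"]))
        offset += entry["length"]
    return spans
```

Because pieces are cut from the concatenated stream, a single piece may span the boundary between two files.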
Tracker queries are two way. The tracker receives information via HTTP GET parameters and returns a bencoded message. Note that although the current tracker implementation has its own web server, the tracker could run very nicely as, for example, an apache module.
Tracker GET requests have the following keys:
info_hash
The 20 byte sha1 hash of the bencoded form of the info value from the metainfo file. Note that this is a substring of the metainfo file. This value will almost certainly have to be escaped.
peer_id
A string of length 20 which this downloader uses as its id. Each downloader generates its own id at random at the start of a new download. This value will also almost certainly have to be escaped.
ip
An optional parameter giving the IP (or dns name) which this peer is at. Generally used for the origin if it's on the same machine as the tracker.
port
The port number this peer is listening on. Common behavior is for a downloader to try to listen on port 6881 and if that port is taken try 6882, then 6883, etc. and give up after 6889.
uploaded
The total amount uploaded so far, encoded in base ten ascii.
downloaded
The total amount downloaded so far, encoded in base ten ascii.
left
The number of bytes this peer still has to download, encoded in base ten ascii. Note that this can't be computed from downloaded and the file length since it might be a resume, and there's a chance that some of the downloaded data failed an integrity check and had to be re-downloaded.
event
This is an optional key which maps to started, completed, or stopped (or empty, which is the same as not being present). If not present, this is one of the announcements done at regular intervals. An announcement using started is sent when a download first begins, and one using completed is sent when the download is complete. No completed is sent if the file was complete when started. Downloaders send an announcement using 'stopped' when they cease downloading.
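A tracker request built from these keys might look like the following sketch. The function name and the example tracker URL are hypothetical; the info_hash and peer_id are assumed to be supplied as 20 raw bytes each, and they are percent-escaped with the standard library's urllib.parse.quote.

```python
from urllib.parse import quote

VALID_EVENTS = ("started", "completed", "stopped")


def build_announce_url(announce, info_hash, peer_id, port,
                       uploaded, downloaded, left, event=None, ip=None):
    """Build the GET URL for one tracker announcement."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes each")
    params = [
        # Raw binary values almost always need escaping.
        ("info_hash", quote(info_hash, safe="")),
        ("peer_id", quote(peer_id, safe="")),
        ("port", str(port)),
        # Counters are sent as base ten ASCII.
        ("uploaded", str(uploaded)),
        ("downloaded", str(downloaded)),
        ("left", str(left)),
    ]
    if ip:
        params.append(("ip", quote(ip, safe="")))
    if event:  # omitted for the regular periodic announcements
        if event not in VALID_EVENTS:
            raise ValueError("unknown event %r" % event)
        params.append(("event", event))
    query = "&".join("%s=%s" % (key, value) for key, value in params)
    separator = "&" if "?" in announce else "?"
    return announce + separator + query
```

A first announcement would pass event="started", for example build_announce_url("http://tracker.example/announce", info_hash, peer_id, 6881, 0, 0, total_size, event="started").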
Tracker responses are bencoded dictionaries. If a tracker response has a key failure reason, then that maps to a human readable string which explains why the query failed, and no other keys are required. Otherwise, it must have two keys: interval, which maps to the number of seconds the downloader should wait between regular rerequests, and peers. peers maps to a list of dictionaries corresponding to peers, each of which contains the keys peer id, ip, and port, which map to the peer's self-selected ID, IP address or dns name as a string, and port number, respectively. Note that downloaders may rerequest on nonscheduled times if an event happens or they need more peers.
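Handling a tracker response could be sketched as below. The sketch assumes the response has already been bdecoded into Python objects with text keys; the Peer tuple, TrackerError, and parse_tracker_response are names introduced here for illustration.

```python
from typing import NamedTuple


class Peer(NamedTuple):
    peer_id: bytes  # the peer's self-selected ID
    ip: str         # IP address or dns name
    port: int


class TrackerError(Exception):
    """Raised when the tracker reports a 'failure reason'."""


def parse_tracker_response(response):
    """Return (interval_seconds, peers) from a decoded tracker response."""
    if "failure reason" in response:
        # No other keys are required when the query failed.
        raise TrackerError(response["failure reason"])
    interval = response["interval"]
    peers = [Peer(entry["peer id"], entry["ip"], entry["port"])
             for entry in response["peers"]]
    return interval, peers
```

The returned interval is the wait between regular rerequests; a downloader may still rerequest earlier when an event happens or it needs more peers.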
If you want to make any extensions to metainfo files or tracker queries, please coordinate with Bram Cohen to make sure that all extensions are done compatibly.
BitTorrent's peer protocol operates over TCP. It performs efficiently without setting any socket options.
Peer connections are symmetrical. Messages sent in both directions look the same, and data can flow in either direction.
The peer protocol refers to pieces of the file by index as described in the metainfo file, starting at zero. When a peer finishes downloading a piece and checks that the hash matches, it announces that it has that piece to all of its peers.
Connections contain two bits of state on either end: choked or not, and interested or not. Choking is a notification that no data will be sent until unchoking happens. The reasoning and common techniques behind choking are explained later in this document.
Data transfer takes place whenever one side is interested and the other side is not choking. Interest state must be kept up to date at all times: whenever a downloader doesn't have something they would currently ask a peer for if unchoked, they must express lack of interest, despite being choked. Implementing this properly is tricky, but makes it possible for downloaders to know which peers will start downloading immediately if unchoked.
Connections start out choked and not interested.
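The per-connection state can be modeled as four flags, two for each end. The sketch below is one possible representation; the field and method names are assumptions made for illustration, not part of the protocol.

```python
from dataclasses import dataclass


@dataclass
class ConnectionState:
    # Connections start out choked and not interested in both directions.
    am_choking: bool = True        # we are choking the peer
    am_interested: bool = False    # we want something the peer has
    peer_choking: bool = True      # the peer is choking us
    peer_interested: bool = False  # the peer wants something we have

    def can_download(self):
        """We may request data: we are interested and the peer is not choking."""
        return self.am_interested and not self.peer_choking

    def can_upload(self):
        """The peer may request data: it is interested and we are not choking."""
        return self.peer_interested and not self.am_choking
```

A fresh ConnectionState() allows no transfer in either direction until an 'interested' and an 'unchoke' message have both been exchanged.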
When data is being transferred, downloaders should keep several piece requests queued up at once in order to get good TCP performance (this is called 'pipelining'.) On the other side, requests which can't be written out to the TCP buffer immediately should be queued up in memory rather than kept in an application-level network buffer, so they can all be thrown out when a choke happens.
The peer wire protocol consists of a handshake followed by a never-ending stream of length-prefixed messages. The handshake starts with character nineteen (decimal) followed by the string 'BitTorrent protocol'. The leading character is a length prefix, put there in the hope that other new protocols may do the same and thus be trivially distinguishable from each other.
All later integers sent in the protocol are encoded as four bytes big-endian.
After the fixed headers come eight reserved bytes, which are all zero in all current implementations. If you wish to extend the protocol using these bytes, please coordinate with Bram Cohen to make sure all extensions are done compatibly.
Next comes the 20 byte sha1 hash of the bencoded form of the info value from the metainfo file. (This is the same value which is announced as info_hash to the tracker, only here it's raw instead of quoted.) If both sides don't send the same value, they sever the connection. The one possible exception is if a downloader wants to do multiple downloads over a single port, they may wait for incoming connections to give a download hash first, and respond with the same one if it's in their list.
After the download hash comes the 20-byte peer id which is reported in tracker requests and contained in peer lists in tracker responses. If the receiving side's peer id doesn't match the one the initiating side expects, it severs the connection.
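Putting the handshake fields together gives a fixed 68-byte message: 1 length byte, the 19-byte protocol string, 8 reserved bytes, the 20-byte download hash, and the 20-byte peer id. A minimal sketch, with function names chosen here for illustration:

```python
PROTOCOL = b"BitTorrent protocol"     # 19 bytes
HANDSHAKE_LEN = 1 + len(PROTOCOL) + 8 + 20 + 20  # 68 bytes in total


def build_handshake(info_hash, peer_id, reserved=bytes(8)):
    """Assemble the handshake sent at the start of a peer connection."""
    if len(info_hash) != 20 or len(peer_id) != 20 or len(reserved) != 8:
        raise ValueError("info_hash and peer_id are 20 bytes, reserved is 8")
    return bytes([len(PROTOCOL)]) + PROTOCOL + reserved + info_hash + peer_id


def parse_handshake(data):
    """Split a received handshake into (reserved, info_hash, peer_id)."""
    if len(data) != HANDSHAKE_LEN:
        raise ValueError("handshake must be %d bytes" % HANDSHAKE_LEN)
    if data[0] != len(PROTOCOL) or data[1:20] != PROTOCOL:
        raise ValueError("not a BitTorrent handshake")
    return data[20:28], data[28:48], data[48:68]
```

The caller is expected to compare the returned info_hash and peer_id with the values it expects and sever the connection on a mismatch.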
That's it for handshaking. Next comes an alternating stream of length prefixes and messages. Messages of length zero are keepalives, and ignored. Keepalives are generally sent once every two minutes, but note that timeouts can be done much more quickly when data is expected.
All non-keepalive messages start with a single byte which gives their type. The possible values are:
0 - choke
1 - unchoke
2 - interested
3 - not interested
4 - have
5 - bitfield
6 - request
7 - piece
8 - cancel
'choke', 'unchoke', 'interested', and 'not interested' have no payload.
'bitfield' is only ever sent as the first message. Its payload is a bitfield with each index that the downloader has set to one and the rest set to zero. Downloaders which don't have anything yet may skip the 'bitfield' message. The first byte of the bitfield corresponds to indices 0 - 7 from high bit to low bit, respectively. The next one 8-15, etc. Spare bits at the end are set to zero.
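The bit layout can be illustrated with a short sketch. The helper names are hypothetical; num_pieces is assumed to be known from the metainfo file.

```python
def encode_bitfield(have, num_pieces):
    """Pack a set of piece indices into a bitfield payload."""
    field = bytearray((num_pieces + 7) // 8)
    for index in have:
        if not 0 <= index < num_pieces:
            raise ValueError("piece index %d out of range" % index)
        # Index 0 is the high bit of the first byte.
        field[index // 8] |= 0x80 >> (index % 8)
    return bytes(field)


def decode_bitfield(field, num_pieces):
    """Unpack a bitfield payload into the set of piece indices it marks."""
    if len(field) != (num_pieces + 7) // 8:
        raise ValueError("bitfield has the wrong length")
    have = set()
    for index in range(len(field) * 8):
        if field[index // 8] & (0x80 >> (index % 8)):
            if index >= num_pieces:
                raise ValueError("spare bits at the end must be zero")
            have.add(index)
    return have
```

With ten pieces, holding pieces 0 and 9 encodes to the two bytes 0x80 0x40.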
The 'have' message's payload is a single number, the index which that downloader just completed and checked the hash of.
'request' messages contain an index, begin, and length. The last two are byte offsets. Length is generally a power of two unless it gets truncated by the end of the file. All current implementations use 2^15, and close connections which request an amount greater than 2^17.
'cancel' messages have the same payload as request messages. They are generally only sent towards the end of a download, during what's called 'endgame mode'. When a download is almost complete, there's a tendency for the last few pieces to all be downloaded off a single hosed modem line, taking a very long time. To make sure the last few pieces come in quickly, once requests for all pieces a given downloader doesn't have yet are currently pending, it sends requests for everything to everyone it's downloading from. To keep this from becoming horribly inefficient, it sends cancels to everyone else every time a piece arrives.
'piece' messages contain an index, begin, and piece. Note that they are correlated with request messages implicitly. It's possible for an unexpected piece to arrive if choke and unchoke messages are sent in quick succession and/or transfer is going very slowly.
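The message framing described above can be sketched with the standard struct module: every message is a four-byte big-endian length, a one-byte type, and a payload, and a zero length denotes a keepalive. The helper names below are chosen for illustration.

```python
import struct

(CHOKE, UNCHOKE, INTERESTED, NOT_INTERESTED,
 HAVE, BITFIELD, REQUEST, PIECE, CANCEL) = range(9)

KEEPALIVE = struct.pack(">I", 0)  # a message of length zero


def frame(msg_type, payload=b""):
    """Prefix a type byte and payload with their combined length."""
    return struct.pack(">IB", len(payload) + 1, msg_type) + payload


def have(index):
    return frame(HAVE, struct.pack(">I", index))


def request(index, begin, length):
    return frame(REQUEST, struct.pack(">III", index, begin, length))


def cancel(index, begin, length):
    # Same payload as 'request'.
    return frame(CANCEL, struct.pack(">III", index, begin, length))


def piece(index, begin, block):
    return frame(PIECE, struct.pack(">II", index, begin) + block)


def split_messages(buffer):
    """Extract complete messages from received bytes.

    Returns (messages, remainder): messages is a list of (type, payload)
    pairs with keepalives dropped, remainder is the unconsumed tail that
    should be kept until more data arrives.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= 4:
        (length,) = struct.unpack_from(">I", buffer, offset)
        if len(buffer) - offset - 4 < length:
            break  # message not fully received yet
        body = buffer[offset + 4:offset + 4 + length]
        offset += 4 + length
        if length == 0:
            continue  # keepalive, ignored
        messages.append((body[0], body[1:]))
    return messages, buffer[offset:]
```

A receiver would append each chunk read from the socket to the remainder from the previous call and run split_messages again.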
Downloaders generally download pieces in random order, which does a reasonably good job of keeping them from having a strict subset or superset of the pieces of any of their peers.
Choking is done for several reasons. TCP congestion control behaves very poorly when sending over many connections at once. Also, choking lets each peer use a tit-for-tat-ish algorithm to ensure that they get a consistent download rate.
The choking algorithm described below is the currently deployed one. It is very important that all new algorithms work well both in a network consisting entirely of themselves and in a network consisting mostly of this one.
There are several criteria a good choking algorithm should meet. It should cap the number of simultaneous uploads for good TCP performance. It should avoid choking and unchoking quickly, known as 'fibrillation'. It should reciprocate to peers who let it download. Finally, it should try out unused connections once in a while to find out if they might be better than the currently used ones, known as optimistic unchoking.
The currently deployed choking algorithm avoids fibrillation by only changing who's choked once every ten seconds. It does reciprocation and number of uploads capping by unchoking the four peers which it has the best download rates from and are interested. Peers which have a better upload rate but aren't interested get unchoked and if they become interested the worst uploader gets choked. If a downloader has a complete file, it uses its upload rate rather than its download rate to decide who to unchoke.
For optimistic unchoking, at any one time there is a single peer which is unchoked regardless of its upload rate (if interested, it counts as one of the four allowed downloaders). Which peer is optimistically unchoked rotates every 30 seconds. To give them a decent chance of getting a complete piece to upload, new connections are three times as likely to start as the current optimistic unchoke as anywhere else in the rotation.
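One round of the selection described above can be approximated by the sketch below. It is a simplification under stated assumptions, not the deployed implementation: timing (the ten second and 30 second intervals) and the weighted rotation of the optimistic unchoke are left to the caller, and the PeerStats tuple and choose_unchoked function are names introduced here. The rate field is the download rate from that peer, or the upload rate to it when the local side already has the complete file.

```python
from typing import NamedTuple


class PeerStats(NamedTuple):
    name: str
    rate: float       # bytes per second, see the note above
    interested: bool  # whether the peer is interested in our data


MAX_DOWNLOADERS = 4  # interested peers allowed to download at once


def choose_unchoked(peers, optimistic=None):
    """Return the set of peer names to leave unchoked for this round."""
    by_name = {p.name: p for p in peers}
    unchoked = set()
    slots = MAX_DOWNLOADERS
    if optimistic in by_name:
        # The optimistic unchoke is kept regardless of its rate and, if
        # interested, counts as one of the four allowed downloaders.
        unchoked.add(optimistic)
        if by_name[optimistic].interested:
            slots -= 1
    for p in sorted(peers, key=lambda p: p.rate, reverse=True):
        if slots == 0:
            break
        if p.name in unchoked:
            continue
        # Faster peers that are not interested are unchoked too, but do
        # not use up a slot until they become interested.
        unchoked.add(p.name)
        if p.interested:
            slots -= 1
    return unchoked
```

Calling this once every ten seconds, and changing the optimistic argument every 30 seconds, would approximate the schedule described in the text.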
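BitTorrent Protocol Specification (BT Protocol Collection), Part 6
How BT Uses Incentives to Achieve Robustness
Bram Cohen
May 22, 2003
Translation: 小馬哥
Date: 2004-6-1
Revision: 揚帆
Date: 2005-5-9
Abstract
The BitTorrent file distribution system uses a tit-for-tat method to achieve Pareto efficiency, and it is more robust than currently known cooperative techniques. This paper explains what BitTorrent is used for and how economic methods are used to reach that goal.
1. What is BitTorrent used for?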
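When a file is downloaded over HTTP, the entire upload cost falls on the hosting machine. With BitTorrent, when several people download the same file at the same time, they also serve pieces of the file to one another. 這樣,就把上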