
A First-Hand Account of Syncing an Ethereum geth Node

Seeing how many people in the technical discussion groups run into problems while syncing Ethereum node data, I rented a server for a day and ran a sync test myself. This post shares the problems I encountered along the way and how to work around them.

Server Configuration

The setup is fairly simple: a 2-core, 4 GB Linux server purchased on Alibaba Cloud, running CentOS 7.4, with a 500 GB high-speed cloud disk attached.

If you can afford it, upgrade the server, for example to 4 cores/8 GB or 8 cores/16 GB. If the configuration is too low, you will hit some of the problems described later.

Starting the Node

Start the node normally with the parameters given on the official site. I set the cache parameter to 512; you can raise it according to your server's capacity, which helps the node sync faster.
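As a minimal sketch of what the launch command looks like (the data directory path is my own assumption, so adjust it to your environment; --cache is the database cache in MB):

# Launch geth with a 512 MB database cache; /data/ethereum is an example path.
geth --datadir /data/ethereum --cache 512
# On a bigger machine, raise the cache, e.g. --cache 1024 on a 4-core/8 GB box.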

Data Synchronization

This is the step where problems most often occur, so let me go through them in detail.

Exception 1

goroutine 10855 [IO wait]:
internal/poll.runtime_pollWait(0x7f4a6599ebb0, 0x72, 0x0)
    /home/travis/.gimme/versions/go1.9.2.linux.amd64/src/runtime/netpoll.go:173 +0x57
internal/poll.(*pollDesc).wait(0xc43863a198, 0x72, 0xffffffffffffff00, 0x184e740, 0x18475a0)
    /home/travis/.gimme/versions/go1.9.2.linux.amd64/src/internal/poll/fd_poll_runtime.go:85 +0xae
internal/poll.(*pollDesc).waitRead(0xc43863a198, 0xc462457a00, 0x20, 0x20)
    /home/travis/.gimme/versions/go1.9.2.linux.amd64/src/internal/poll/fd_poll_runtime.go:90 +0x3d
internal/poll.(*FD).Read(0xc43863a180, 0xc462457a40, 0x20, 0x20, 0x0, 0x0, 0x0)
    /home/travis/.gimme/versions/go1.9.2.linux.amd64/src/internal/poll/fd_unix.go:126 +0x18a
net.(*netFD).Read(0xc43863a180, 0xc462457a40, 0x20, 0x20, 0x0, 0xc42158dcc0, 0x302b35d6a3a0)
    /home/travis/.gimme/versions/go1.9.2.linux.amd64/src/net/fd_unix.go:202 +0x52
net.(*conn).Read(0xc421aac000, 0xc462457a40, 0x20, 0x20, 0x0, 0x0, 0x0)
    /home/travis/.gimme/versions/go1.9.2.linux.amd64/src/net/net.go:176 +0x6d
io.ReadAtLeast(0x7f4a603b02f8, 0xc421aac000, 0xc462457a40, 0x20, 0x20, 0x20, 0xf1e900, 0x464600, 0x7f4a603b02f8)
    /home/travis/.gimme/versions/go1.9.2.linux.amd64/src/io/io.go:309 +0x86
io.ReadFull(0x7f4a603b02f8, 0xc421aac000, 0xc462457a40, 0x20, 0x20, 0x20, 0x0, 0x6fc23a9b4)
    /home/travis/.gimme/versions/go1.9.2.linux.amd64/src/io/io.go:327 +0x58
github.com/ethereum/go-ethereum/p2p.(*rlpxFrameRW).ReadMsg(0xc43432b650, 0xbe9568cbea77ec48, 0x5186ea4942, 0x19d4c80, 0x0, 0x0, 0x19d4c80, 0x28, 0x11, 0x0)
    /home/travis/gopath/src/github.com/ethereum/go-ethereum/p2p/rlpx.go:650 +0x100
github.com/ethereum/go-ethereum/p2p.(*rlpx).ReadMsg(0xc440545da0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
    /home/travis/gopath/src/github.com/ethereum/go-ethereum/p2p/rlpx.go:95 +0x148
github.com/ethereum/go-ethereum/p2p.(*Peer).readLoop(0xc4315cc660, 0xc4315cd0e0)
    /home/travis/gopath/src/github.com/ethereum/go-ethereum/p2p/peer.go:251 +0xad
created by github.com/ethereum/go-ethereum/p2p.(*Peer).run
    /home/travis/gopath/src/github.com/ethereum/go-ethereum/p2p/peer.go:189 +0xf2

goroutine 14632 [select]:
net.(*netFD).connect.func2(0x18583c0, 0xc42f87c8a0, 0xc488fcaf00, 0xc4caba3da0, 0xc4caba3d40)
    /home/travis/.gimme/versions/go1.9.2.linux.amd64/src/net/fd_unix.go:129 +0xf2
created by net.(*netFD).connect
    /home/travis/.gimme/versions/go1.9.2.linux.amd64/src/net/fd_unix.go:128 +0x2a3

goroutine 7089 [select]:
github.com/ethereum/go-ethereum/p2p.(*Peer).run(0xc427e1af60, 0xd80820, 0xc44e84bd80, 0x0)
    /home/travis/gopath/src/github.com/ethereum/go-ethereum/p2p/peer.go:199 +0x2fe
github.com/ethereum/go-ethereum/p2p.(*Server).runPeer(0xc4201e2fc0, 0xc427e1af60)
    /home/travis/gopath/src/github.com/ethereum/go-ethereum/p2p/server.go:790 +0x122
created by github.com/ethereum/go-ethereum/p2p.(*Server).run
    /home/travis/gopath/src/github.com/ethereum/go-ethereum/p2p/server.go:570 +0x139c

goroutine 14620 [select]:
net.(*netFD).connect.func2(0x18583c0, 0xc47df23560, 0xc4289ccc80, 0xc4551937a0, 0xc455193740)
    /home/travis/.gimme/versions/go1.9.2.linux.amd64/src/net/fd_unix.go:129 +0xf2
created by net.(*netFD).connect
    /home/travis/.gimme/versions/go1.9.2.linux.amd64/src/net/fd_unix.go:128 +0x2a3

When sync aborts with this exception, the cause is usually insufficient memory or I/O, which crashes the process. In most cases, simply restarting the program is enough.

Exception 2

WARN [02-03|12:54:57] Synchronisation failed, dropping peer    peer=3616e2d0bcacf32f err="retrieved hash chain is invalid"
WARN [02-03|12:56:02] Ancestor below allowance                 peer=64e4dd3f53e5c01e number=4843643 hash=000000000000 allowance=4843643
WARN [02-03|12:56:02] Synchronisation failed, dropping peer    peer=64e4dd3f53e5c01e err="retrieved ancestor is invalid"

Along with warnings like the following:

WARN [02-03|12:58:55] Synchronisation failed, dropping peer    peer=dbf24adb86cfa3e6 err="no peers available or all tried for download"
WARN [02-03|12:59:23] Synchronisation failed, retrying         err="receipt download canceled (requested)"
WARN [02-03|13:00:17] Synchronisation failed, retrying         err="peer is unknown or unhealthy"
WARN [02-03|13:03:06] Synchronisation failed, retrying         err="block download canceled (requested)"
WARN [02-03|13:03:07] Synchronisation failed, retrying         err="peer is unknown or unhealthy"

If the log stays stuck here, geth has not connected to any valid peers. Running the following command in the geth console shows a peer count of 0:

> net.peerCount
0

For this warning, just wait. If there is no progress for a long time, restart the node so it can search for new peers again. You can also add peers manually; the Spark Plan (星火計劃) community provides the node list below, which you can try adding (see the console sketch after the list):

[
    "enode://6427b7e7446bb05f22fe7ce9ea175ec05858953d75a5a6e4f99a6aec0779a8bd6[email protected]121.201.14.181:30303",
    "enode://91922b12115c067005c574844c6bbdb114eb262f90b6355cec89e13b483c3e466[email protected]120.27.164.92:13333",
    "enode://3dde41a994b3b99f938f75ddf6d48318c78ddd869c70b48d00b922190bb434fc5[email protected]207.226.141.212:30303",
    "enode://7ab8fa90b204f2146c00939b8474549c544caa3598a0894fa639a5cdbd992cbc6[email protected]121.201.24.236:30303",
    "enode://db81152a8296089b04a21ad9bf347df3ff0450ffc8215d9f50c400ccf8d189631[email protected]139.198.1.244:30303",
    "enode://68dd1360f0a4ac362b41124692e31652ffe26f6f06a284ca11f3b514b3968594a[email protected]113.106.85.172:30303",
    "enode://58f6b6908286cefe43c166cfc4fed033c750caa1bc3f6e1e1e1507752c0b91248[email protected]45.113.71.186:30303",
    "enode://87190a01c02cafb97e7f49672b4c3be2937cf79c3969e0b8e7b35cac28cebfbda[email protected]119.29.207.90:30303",
    "enode://d1fdd05a62fd9544eeb455e4f4d4bd8bb574138d82d8f909f3041d0792e3401f8[email protected]120.26.129.121:30303",
    "enode://a1e9cf99eca94590ae776c8dd5c6c043a8c1f0375e9e391c9fb55133385bf453a[email protected]182.254.209.254:30303",
    "enode://562796b19d43d79dfb6160abd2d7bb78a2f2efd9501a0a767c00677e0fb3a4407[email protected]121.40.199.54:30303",
    "enode://fa2c17dcc83a6e2825668210abf7480452de4b13d8bdea8f301c3b603701918bc[email protected]120.26.124.58:30303",
    "enode://0b331b27e2976d797aed1d1464ac483a7f262860334cb5737a01a0188da08d792[email protected]47.89.49.61:30303",
    "enode://0639f20fdb5af1fecd2f2bc0ddb648885483a5945686530e6b046678635d3435d[email protected]118.192.161.147:30303",
    "enode://fd2a5d30e4f3917ee640876cc57d72a8bf5ecf049e9106c95e60cf306dd7a5dd6[email protected]121.201.29.82:30303",
    "enode://0d1b9eed7afe2d5878d5d8a4c2066b600a3bcac2e5730586421af224e93a58cd0[email protected]209.9.106.245:30303",
    "enode://ca087a651571d04953187753af969f7deb1582af2a06a3048b90adb3f87d4c419[email protected]182.150.37.23:30303",
    "enode://9b53b9d41d964f71db60d2198cfa9013fc7808d707c5e0a32da1e22d3cacd6adb[email protected]182.150.37.24:30303"
]
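Note that the public keys in the list above were mangled by the page's email obfuscation ("[email protected]"), so each entry needs its complete 128-hex-character key before use. To add a peer manually, attach to the running node over IPC and call admin.addPeer; a minimal sketch, assuming the default IPC path:

# Add one peer by hand; replace the placeholder with a full enode URL.
geth attach ipc:$HOME/.ethereum/geth.ipc --exec \
    'admin.addPeer("enode://<128-hex-char-public-key>@121.201.14.181:30303")'

# Then confirm that the peer count has gone up:
geth attach ipc:$HOME/.ethereum/geth.ipc --exec 'net.peerCount'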

Exception 3

geth shuts down out of the blue, with nothing unusual in the log. I covered this in an earlier article: when server memory runs low, the Linux OOM killer is triggered and the process gets killed. Short of adding more memory there is no good fix; you can only monitor the process frequently and restart it whenever it dies.
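To confirm that the OOM killer was the culprit, and to paper over it with an automatic restart, something like the following works; the watchdog's launch command is an assumption, so reuse whatever command you normally start the node with:

# Check the kernel log for an OOM kill of geth:
dmesg | grep -i 'killed process'

# Crude watchdog, e.g. run from cron every minute: restart geth if it died.
pgrep -x geth > /dev/null || \
    nohup geth --datadir /data/ethereum --cache 512 >> /data/geth.log 2>&1 &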

A compromise is to configure swap, but swap slows synchronization down dramatically.
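If you want to try swap anyway, the standard CentOS procedure is below; the 4 GB size is just an example:

# Create and enable a 4 GB swap file (run as root).
fallocate -l 4G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
# Keep it across reboots:
echo '/swapfile none swap sw 0 0' >> /etc/fstab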

Exception 4

The node prints logs like the following non-stop:

INFO [02-03|13:07:24] Imported new state entries               count=1142 elapsed=5.888ms   processed=84671 pending=1907  retry=0   duplicate=0 unexpected=170

If these lines keep printing for a long time while the synced block height stays unchanged and no other activity appears in the log, the node is still pulling down state entries (the state trie data that fast sync downloads alongside the blocks). If it has already dragged on for several hours, you may as well cut your losses early: the cause is infrastructure such as network bandwidth or disk speed, and finishing can take anywhere from days to weeks.

Even after a server restart the node re-enters this same phase, so don't waste time and energy on it. This is the problem most people run into, especially when starting the node on Windows: some get stuck at 99% and never finish, and in essence the node is still performing the operation above.
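To tell whether the state download is at least creeping forward, you can poll eth.syncing from outside; during fast sync it reports knownStates and pulledStates, and pulledStates should keep climbing between calls (the IPC path below is the default and an assumption):

# Snapshot of sync progress; compare pulledStates across two calls
# a few minutes apart -- if it grows, the node is still making progress.
geth attach ipc:$HOME/.ethereum/geth.ipc --exec 'eth.syncing'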

First-Hand Experience

I had the server deployed and the sync running by 6 PM yesterday. Early on, with few transactions in the early blocks, sync was very fast. When I got up this morning, I found the node had hung a little after 2 AM and made no further progress. I restarted it after seven, and along the way the process died repeatedly: crashes, program exceptions, and OOM kills.

When the node gets to within roughly 200 blocks of the latest height, it spends a long stretch loading structures (importing the remaining state). This is a fairly drawn-out phase; be patient, and preferably do not restart during it.
