
HDFS balancer exception handling

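During a bulk data import into HBase, the server load was high, so HDFS did not rebalance the data in time and the data on one DataNode grew sharply. I therefore ran the balancer manually.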

After adding DataNode nodes to HDFS, I wanted to even out data storage across the cluster, so I ran:

 hdfs balancer -threshold 10 
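
The `-threshold 10` argument is a percentage. As a rough sketch of how it is interpreted, a DataNode counts as over- or under-utilized when its disk utilization differs from the cluster average by more than 10 percentage points. The helper `classify_node` below is hypothetical and only illustrates that comparison; it is not part of Hadoop, and the utilization numbers are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a DataNode relative to the cluster average.
# Usage: classify_node AVG_UTIL_PERCENT NODE_UTIL_PERCENT THRESHOLD_PERCENT
classify_node() {
  awk -v avg="$1" -v node="$2" -v thr="$3" 'BEGIN {
    if (node > avg + thr)      print "over-utilized"     # too full: source of block moves
    else if (node < avg - thr) print "under-utilized"    # too empty: target of block moves
    else                       print "within threshold"  # balancer leaves it alone
  }'
}

# Illustrative values only: cluster average 50%, threshold 10.
classify_node 50 72 10   # a node that filled up during the bulk import
classify_node 50 5 10    # a freshly added, nearly empty node
classify_node 50 45 10   # a node close to the average
```

A smaller threshold gives a more even cluster but means more blocks have to be moved, so the run takes longer.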

Suddenly, some nodes started reporting errors:


18/09/21 17:51:37 WARN balancer.Dispatcher: Failed to move blk_1073837252_96442 with size=268435456 from 10.248.161.6:9866:DISK to 10.248.161.10:9866:DISK through 10.248.161.6:9866
java.net.NoRouteToHostException: No route to host
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:356)
        at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$3000(Dispatcher.java:233)
        at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:1148)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

It turned out that the firewall had not been turned off on the newly added nodes.

On CentOS 7, run:

service firewalld status
service firewalld stop
systemctl disable firewalld.service
service firewalld status
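
Before rerunning the balancer, it can help to confirm that the DataNode transfer port (9866 in the log above) is reachable from the other nodes. The following is a minimal sketch, assuming `bash` and the coreutils `timeout` command are available; the function name `check_port` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a TCP port on a host accepts connections.
# Uses bash's built-in /dev/tcp redirection, so no extra tools are needed.
check_port() {
  local host="$1" port="$2"
  # Give up after 3 seconds so a silently dropped connection does not hang.
  if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "reachable"
  else
    echo "unreachable"
  fi
}

# When hosts are passed as arguments, probe port 9866 on each of them.
if [ "$#" -gt 0 ]; then
  for host in "$@"; do
    echo "${host}:9866 $(check_port "${host}" 9866)"
  done
fi
```

Saved under an arbitrary name such as `check_datanodes.sh`, it could be invoked as `bash check_datanodes.sh 10.248.161.6 10.248.161.10`, using the addresses from the failed move above.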

After that, the logs showed that the balancer was working normally again.

The run took nearly ten hours to complete:

17/12/08 19:16:28 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 50 (default=50)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.blocksize = 268435456 (default=134217728)
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.32:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.31:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.13:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.7:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.12:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.9:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.40:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.35:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.10:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.6:9866
17/12/08 19:16:28 INFO balancer.Balancer: 0 over-utilized: []
17/12/08 19:16:28 INFO balancer.Balancer: 0 underutilized: []
The cluster is balanced. Exiting...
17/12/08 19:16:28              311              3.05 TB                 0 B                0 B
17/12/08 19:16:28       Balancing took 9.853577777777778 hours
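
As a rough back-of-the-envelope check on those last two lines, the average transfer rate can be estimated from the 3.05 TB and the 9.85 hours reported. The sketch below assumes that the 3.05 TB value is the total amount of data moved and that the balancer reports sizes in binary units (1 TB = 1024 * 1024 MiB). The function name `avg_throughput_mib` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: average throughput in MiB/s.
# Usage: avg_throughput_mib TB_MOVED HOURS
avg_throughput_mib() {
  awk -v tb="$1" -v hours="$2" 'BEGIN {
    mib  = tb * 1024 * 1024   # assumed binary units: 1 TB = 1024 * 1024 MiB
    secs = hours * 3600       # hours to seconds
    printf "%.1f\n", mib / secs
  }'
}

# Values taken from the final balancer log lines above.
avg_throughput_mib 3.05 9.853577777777778   # prints 90.2
```

Under these assumptions the cluster as a whole moved data at roughly 90 MiB/s on average, which gives a baseline for estimating how long a future rebalance of similar size might take.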