Troubleshooting an HDFS balancer failure
阿新 • Published: 2018-11-21
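During a bulk import into HBase the server load was high, so HDFS did not rebalance data in time and the data on one DataNode ballooned. I had to run the balancer manually.
After adding new DataNodes to HDFS, I wanted to even out the stored data, so I ran: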
hdfs balancer -threshold 10
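Some nodes suddenly started reporting errors (the localized exception message in the log means "No route to host"):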
18/09/21 17:51:37 WARN balancer.Dispatcher: Failed to move blk_1073837252_96442 with size=268435456 from 10.248.161.6:9866:DISK to 10.248.161.10:9866:DISK through 10.248.161.6:9866
java.net.NoRouteToHostException: 沒有到主機的路由
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:356)
    at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$3000(Dispatcher.java:233)
    at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:1148)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
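When many of these warnings scroll by, it can help to see whether they cluster around a few destination nodes. The following is only a rough sketch: it assumes the balancer output was saved to a file, and the function name `failed_targets` and the file name `balancer.log` are made up for illustration.

```shell
#!/usr/bin/env bash
# Count "Failed to move" warnings per destination DataNode.
# Hypothetical helper; expects a saved balancer log as its first argument.
failed_targets() {
  grep 'Failed to move' "$1" \
    | sed -E 's/.* to ([0-9.]+:[0-9]+):[A-Z_]+ through .*/\1/' \
    | sort | uniq -c | awk '{print $2, $1}'
}

# Usage (file name is an assumption):
#   failed_targets balancer.log
```

Each output line is a destination address followed by the number of failed moves towards it. If the failures all point at the newly added nodes, the problem is more likely on those hosts than in HDFS itself.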
It turned out that the firewall had never been turned off on the newly added nodes.
On CentOS 7, run:
service firewalld status
service firewalld stop
systemctl disable firewalld.service
service firewalld status
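To confirm that the DataNode transfer port (9866 in the log above) is actually reachable after the firewall change, a quick TCP probe from the machine running the balancer can help. This is a minimal sketch, not part of the original fix: the helper name `port_open` is hypothetical, and it assumes `bash` and the coreutils `timeout` command are available.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host ($1) and port ($2) succeeds within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash is invoked explicitly.
port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Usage (address taken from the log above):
#   port_open 10.248.161.10 9866 && echo "reachable" || echo "NOT reachable"
```

A failed probe here does not distinguish a firewall block from a stopped DataNode process, so it is worth checking both.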
After that I checked the log again and everything was back to normal.
The run took almost ten hours to finish:
17/12/08 19:16:28 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 50 (default=50)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)
17/12/08 19:16:28 INFO balancer.Balancer: dfs.blocksize = 268435456 (default=134217728)
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.32:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.31:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.13:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.7:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.12:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.9:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.40:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.35:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.10:9866
17/12/08 19:16:28 INFO net.NetworkTopology: Adding a new node: /default/10.248.161.6:9866
17/12/08 19:16:28 INFO balancer.Balancer: 0 over-utilized: []
17/12/08 19:16:28 INFO balancer.Balancer: 0 underutilized: []
The cluster is balanced. Exiting...
17/12/08 19:16:28 311 3.05 TB 0 B 0 B
17/12/08 19:16:28 Balancing took 9.853577777777778 hours
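To put the ten hours in perspective, the average cluster-wide transfer rate can be estimated from the last two log lines. This is a back-of-the-envelope sketch under two assumptions: that "3.05 TB" is the total amount of data already moved, and that Hadoop prints it in binary units (1 TB = 1024^4 bytes). The function name `avg_mib_per_sec` is invented for this example.

```shell
#!/usr/bin/env bash
# Average throughput in MiB/s, given data moved (in TiB) and elapsed hours.
# Hypothetical helper; assumes binary units for the balancer's "TB" figure.
avg_mib_per_sec() {
  awk -v tib="$1" -v hours="$2" \
    'BEGIN { printf "%.1f\n", tib * 1024 * 1024 / (hours * 3600) }'
}

# Usage with the figures from the log above:
#   avg_mib_per_sec 3.05 9.853577777777778
```

With these figures the result is roughly 90 MiB/s summed over the whole cluster, which gives a baseline for judging whether a later balancer run is faster or slower.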