A Hadoop Failure Case
阿新 • Published: 2019-02-05
2014-07-21 10:12:31,098 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.UnregisteredDatanodeException: Data node node-xxx-40:50010 is attempting to report storage ID DS-1137532894-192.168.2.40-50010-1400206530880. Node 192.168.2.40:50010 is expected to serve this storage.
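At first this looked like a problem caused by an IP change. After asking around, it turned out that a colleague had manually copied files between servers, and that copy was probably the cause.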
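One way to check the copy theory is to read the storage ID that the datanode has on disk and compare it with the one in the error message. The sketch below assumes the Hadoop 1.x layout, in which each storage directory keeps a current/VERSION properties file with a storageID entry. The function name read_storage_id and the path in the usage comment are hypothetical; the real path comes from dfs.data.dir on the affected node.

```shell
#!/bin/sh
# Hypothetical helper: print the storageID recorded in a datanode VERSION file.
# Assumes a properties-style file containing a line "storageID=DS-...".
read_storage_id() {
    version_file=$1
    # Print only the value of the storageID line; prints nothing if it is absent.
    sed -n 's/^storageID=//p' "$version_file"
}

# Usage (the path is an assumption, adjust it to your dfs.data.dir):
# read_storage_id /data/dfs/dn/current/VERSION
```

If two nodes print the same storage ID, the data directory was most likely copied from one to the other.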
The only option was the classic fix: delete the affected directory and restart the datanode.
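The deletion step can be sketched with a small guard so that the wrong directory is not removed by accident. This is a minimal sketch, not the exact commands used in this incident: clear_datanode_dir is a hypothetical helper, the path in the usage comment is an assumption, and the check for current/VERSION assumes the Hadoop 1.x storage layout. Removing the directory discards the block replicas stored on that node, so it only makes sense when the blocks are replicated elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: remove one datanode storage directory so that the
# node registers again with a fresh storage ID after the restart.
clear_datanode_dir() {
    dir=$1
    # Refuse obviously dangerous arguments.
    case "$dir" in
        ""|"/") echo "refusing to remove '$dir'" >&2; return 1 ;;
    esac
    # Only proceed if this looks like datanode storage (assumed layout).
    if [ ! -f "$dir/current/VERSION" ]; then
        echo "no current/VERSION under $dir, leaving it alone" >&2
        return 1
    fi
    rm -rf -- "$dir"
}

# Usage, with the datanode stopped (the path is an assumption):
# clear_datanode_dir /data/dfs/dn
```

How the datanode is then restarted depends on how Hadoop was installed, so that command is left out here.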
With that done, I restarted the datanode, and it failed again:
2014-07-21 13:46:43,108 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.ipc.RemoteException: java.io.IOException: verifyNodeRegistration: unknown datanode node-114-40:50010
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.verifyNodeRegistration(FSNamesystem.java:4743)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:2538)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.register(NameNode.java:1013)
    at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Unknown Source)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)
    at org.apache.hadoop.ipc.Client.call(Client.java:1107)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
    at com.sun.proxy.$Proxy5.register(Unknown Source)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.register(DataNode.java:740)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.runDatanodeDaemon(DataNode.java:1549)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1609)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1734)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1751)
The fix was on the namenode: remove the node from /etc/hadoop/exclude.
Then run: sudo -u hdfs hadoop dfsadmin -refreshNodes
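The two namenode steps can be sketched as a small script. It assumes the exclude file lists one host per line and that the entry matches the name given exactly; remove_from_exclude is a hypothetical helper, and the host name in the usage comment is taken from the log above and may not match how the node is actually written in your exclude file.

```shell
#!/bin/sh
# Hypothetical helper: drop one host entry from an HDFS exclude file.
remove_from_exclude() {
    exclude_file=$1
    host=$2
    tmp=$(mktemp) || return 1
    # -F fixed string, -x whole-line match, -v keep the lines that do not match.
    grep -Fxv -- "$host" "$exclude_file" > "$tmp"
    status=$?
    # grep exits 1 when no lines are left to print; only >1 is a real error.
    if [ "$status" -gt 1 ]; then
        rm -f "$tmp"
        return "$status"
    fi
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$exclude_file"
    rm -f "$tmp"
}

# Usage on the namenode:
# remove_from_exclude /etc/hadoop/exclude node-114-40
# sudo -u hdfs hadoop dfsadmin -refreshNodes
```

Matching whole lines avoids removing a different host whose name merely contains the same string.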