
Hadoop & HBase bad-sector checking and repair: notes from the Donghu site

Today I ran into a problem: the HBase client failed when writing to HBase, with the errors shown below.

The HBase backend reported: ERROR: Region { meta => ***, hdfs => hdfs://***, deployed =>  } not deployed on any region server.

2016-01-20 15:52:31,079  AsyncProcess$AsyncRequestFutureImpl.resubmit:1144   INFO  #14508512, table=tr_image, attempt=26/35 failed=1ops, last exception: org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region tr_image,A21ML90210111\x00\x00\x01Q,1451765854574.21820d2ed2a501a99300f2c74367d954. is not online on host110,16020,1453007077717
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2740)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:859)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:1795)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31313)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2031)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
        at java.lang.Thread.run(Thread.java:745)
 on host110,16020,1453007077717, tracking started null, retrying after=20022ms, replay=1ops

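Searching online suggested that the meta information (the HBase metadata) might be wrong. So I checked the state of HBase with the command "hbase hbck". The key part of the output is as follows: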
598d0b620b41, negotiated timeout = 40000
2016-01-20 15:54:02,964 INFO  [main] zookeeper.ZooKeeper: Session: 0x152598d0b620b41 closed
2016-01-20 15:54:02,965 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2016-01-20 15:54:02,965 INFO  [main] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x152598d0b620b40
2016-01-20 15:54:02,967 INFO  [main] zookeeper.ZooKeeper: Session: 0x152598d0b620b40 closed
2016-01-20 15:54:02,967 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
ERROR: Region { meta => tr_image,A21ML90210111\x00\x00\x01Q,1451765854574.21820d2ed2a501a99300f2c74367d954., hdfs => hdfs://cluster1/hbase/data/default/tr_image/21820d2ed2a501a99300f2c74367d954, deployed =>  } not deployed on any region server.
ERROR: Region { meta => tr_image,AQ9E560210571\x00\x00\x01Q4,1452975417206.2ec3471d3f10eed3087842233b5ec5a1., hdfs => hdfs://cluster1/hbase/data/default/tr_image/2ec3471d3f10eed3087842233b5ec5a1, deployed =>  } not deployed on any region server.
ERROR: Region { meta => tr_image,AX1G770210431\x00\x00\x01Q$\x9C,1451991127089.91a2e0eac438482edb75685a9f5d3efa., hdfs => hdfs://cluster1/hbase/data/default/tr_image/91a2e0eac438482edb75685a9f5d3efa, deployed =>  } not deployed on any region server.
2016-01-20 15:54:03,158 INFO  [main] util.HBaseFsck: Handling overlap merges in parallel. set hbasefsck.overlap.merge.parallel to false to run serially.
2016-01-20 15:54:03,159 INFO  [main] util.HBaseFsck: Handling overlap merges in parallel. set hbasefsck.overlap.merge.parallel to false to run serially.
2016-01-20 15:54:03,159 INFO  [main] util.HBaseFsck: Handling overlap merges in parallel. set hbasefsck.overlap.merge.parallel to false to run serially.
2016-01-20 15:54:03,159 INFO  [main] util.HBaseFsck: Handling overlap merges in parallel. set hbasefsck.overlap.merge.parallel to false to run serially.
ERROR: There is a hole in the region chain between A21ML90210111\x00\x00\x01Q and A21RL60210181\x00\x00\x01QG.  You need to create a new .regioninfo and region dir in hdfs to plug the hole.
ERROR: There is a hole in the region chain between AQ9E560210571\x00\x00\x01Q4 and AQ9T950210401\x00\x00\x01P\xB6.  You need to create a new .regioninfo and region dir in hdfs to plug the hole.
ERROR: There is a hole in the region chain between AX1G770210431\x00\x00\x01Q$\x9C and AX1H320210571\x00\x00\x01P\xB4.  You need to create a new .regioninfo and region dir in hdfs to plug the hole.
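Judging from the log, the regions recorded in meta can no longer be found on any region server: they exist in meta and on HDFS, but no region server is serving them.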
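To make that reading of the log easier to check, here is a minimal Python sketch that pulls the affected regions and the reported holes out of hbck text output. It assumes the two ERROR line formats look exactly like the ones quoted above; `summarize_hbck` and `SAMPLE` are hypothetical names for illustration, not part of HBase. In the output above, every hole begins at the start key of one of the undeployed regions, which suggests the holes are a consequence of those regions being offline rather than a separate problem.

```python
import re

# Pattern for the "not deployed" errors, assuming the format quoted above.
NOT_DEPLOYED = re.compile(
    r"ERROR: Region \{ meta => (?P<meta>.*?), hdfs => (?P<hdfs>\S+), "
    r"deployed =>\s*\} not deployed on any region server\."
)
# Pattern for the "hole in the region chain" errors.
HOLE = re.compile(
    r"ERROR: There is a hole in the region chain between "
    r"(?P<start>.+?) and (?P<end>.+?)\.\s+You need"
)


def summarize_hbck(output):
    """Collect undeployed regions and region-chain holes from hbck output."""
    undeployed, holes = [], []
    for line in output.splitlines():
        m = NOT_DEPLOYED.search(line)
        if m:
            # Region name layout: <table>,<start key>,<timestamp>.<encoded name>.
            meta = m.group("meta")
            table, rest = meta.split(",", 1)
            undeployed.append({
                "table": table,
                "start_key": rest.rsplit(",", 1)[0],
                "encoded_name": meta.rstrip(".").rsplit(".", 1)[-1],
                "hdfs": m.group("hdfs"),
            })
            continue
        m = HOLE.search(line)
        if m:
            holes.append((m.group("start"), m.group("end")))
    return {"undeployed": undeployed, "holes": holes}


# Two lines copied from the hbck output above (raw strings keep "\x00" literal).
SAMPLE = "\n".join([
    r"ERROR: Region { meta => tr_image,A21ML90210111\x00\x00\x01Q,1451765854574.21820d2ed2a501a99300f2c74367d954., hdfs => hdfs://cluster1/hbase/data/default/tr_image/21820d2ed2a501a99300f2c74367d954, deployed =>  } not deployed on any region server.",
    r"ERROR: There is a hole in the region chain between A21ML90210111\x00\x00\x01Q and A21RL60210181\x00\x00\x01QG.  You need to create a new .regioninfo and region dir in hdfs to plug the hole.",
])

if __name__ == "__main__":
    print(summarize_hbck(SAMPLE))
```

Comparing the `start_key` of each undeployed region with the first element of each hole is a quick way to see whether the two kinds of error point at the same regions.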

I searched on Baidu and found an article at this link:

So I gave it a try and ran a command to repair meta: "hbase hbck -fixMeta -fixAssignments"

In the command's output I saw this line:

util.HBaseFsck: Sleeping 10000ms before re-checking after fix...

Is it going to work???

I checked the HBase client again, and the error was gone.
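Besides watching the client, the repair could also be confirmed by running "hbase hbck" once more and reading its summary. The sketch below is a hedged illustration in Python: it assumes hbck ends its report with a line of the form "Status: OK" or "Status: INCONSISTENT", which does not appear in the excerpt above, so check it against your own version's output. `run_hbck` and `hbck_is_consistent` are hypothetical helper names.

```python
import re
import subprocess


def hbck_is_consistent(output):
    """Return True only if the output contains a 'Status: OK' summary line.

    Assumption: hbck prints 'Status: OK' or 'Status: INCONSISTENT' at the end.
    """
    match = re.search(r"^Status: (\w+)\s*$", output, re.MULTILINE)
    return bool(match) and match.group(1) == "OK"


def run_hbck(table=None):
    """Run a read-only 'hbase hbck' (no -fix options) and return its output."""
    cmd = ["hbase", "hbck"]
    if table:
        cmd.append(table)  # restrict the check to one table, e.g. "tr_image"
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return result.stdout
```

On a node with the `hbase` command on the PATH, `hbck_is_consistent(run_hbck("tr_image"))` would then report whether the table still has inconsistencies after the fix.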