Fixing the HBase "region is not online" problem
阿新 • Published: 2018-11-22
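I haven't touched HBase for more than a year, and it takes me back to the days two years ago when I fought alongside Yingshen... Today we ran into the following problem in production: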
hbase(main):002:0> get 'mynamespace:user_basic_info','BAC3510A922CF026500874EA3975E123'
COLUMN CELL
ERROR: org.apache.hadoop.hbase.NotServingRegionException: Region mynamespace:user_basic_info,BA5968E36ADB91CE1EA37D44267F5865,1489326561674.0250284baa6119d676821e86cfaa29f4. is not online on *******,60020,1491385979553
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2922)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1053)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2006)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33644)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
at java.lang.Thread.run(Thread.java:745)
All the other regions were fine, and restarting the regionserver still produced the same error.
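The last dot-delimited field of the region name in the error (here 0250284baa6119d676821e86cfaa29f4) is the encoded region name, which is also the name of the region's directory on HDFS. A minimal sketch for pulling it out of the error text; the function name encoded_region_name is made up for illustration, and it assumes the "table,startkey,timestamp.encodedname." layout seen in the message above:

```shell
#!/bin/sh
# Hypothetical helper (not part of HBase): reads error text on stdin and
# prints the 32-character encoded region name of the first region it finds.
encoded_region_name() {
  # A region name ends with ",<timestamp>.<32 hex chars>." ; keep the hex part.
  sed -n 's/.*,[0-9][0-9]*\.\([0-9a-f]\{32\}\)\..*/\1/p' | head -n 1
}

# Example: paste the ERROR line from the shell into the function.
# echo 'Region ns:tbl,STARTKEY,1489326561674.0250284baa6119d676821e86cfaa29f4. is not online' \
#   | encoded_region_name
```

Lines that do not contain a region name produce no output, so an empty result means nothing was recognised.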
First, check whether this table has any consistency problems:
hbase hbck -details table
It did indeed report 2 inconsistencies:
2 inconsistencies detected.
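When this check runs from a script rather than by eye, the count can be read from the summary line. A small sketch, assuming the summary always has the "N inconsistencies detected." form shown above; hbck_inconsistency_count is a hypothetical name, not an HBase command:

```shell
#!/bin/sh
# Hypothetical helper: reads hbck output on stdin and prints the number
# from the "N inconsistencies detected." summary line (the last one seen).
hbck_inconsistency_count() {
  sed -n 's/^[[:space:]]*\([0-9][0-9]*\) inconsistencies detected\..*/\1/p' | tail -n 1
}

# Example, using the same check as above (replace "table" with the real name):
# hbase hbck -details table 2>&1 | hbck_inconsistency_count
```

If the summary line is missing, for instance because hbck itself failed, the function prints nothing, and a caller should treat that as "unknown" rather than as zero.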
Since the table is inconsistent, let's try to repair it:
hbase hbck -repair table
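This operation requires administrator privileges, so use it with caution! After the repair finished, the result was as follows: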
Summary:
Table hbase:meta is okay.
Number of regions: 1
Deployed on: ctum2f0602005.idc.wanda-group.net,60020,1482504754412
Table idctag:user_basic_info is okay.
Number of regions: 124
0 inconsistencies detected.
Status: OK
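The line that matters in this summary is the final "Status: OK". A sketch of a check that a script could run after a repair, assuming the report contains a "Status:" line as above; hbck_is_clean is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: reads an hbck report on stdin and succeeds (exit 0)
# only if the report contains a "Status: OK" line.
hbck_is_clean() {
  grep -q '^[[:space:]]*Status: OK[[:space:]]*$'
}

# Example (replace "table" with the real name):
# if hbase hbck -details table 2>&1 | hbck_is_clean; then
#   echo "table is consistent"
# else
#   echo "table still has problems, or hbck did not finish"
# fi
```

The check fails for any other status and also when no status line is present, so a crashed hbck run is not mistaken for a clean one.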
Test whether it is fixed:
Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release
17/04/06 11:10:15 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.2.0-cdh5.7.1, rUnknown, Wed Jun 1 16:27:04 PDT 2016
hbase(main):001:0> get 'mynamespace:user_basic_info','BAC3510A922CF026500874EA3975E123'
COLUMN CELL
index:chineseName_encrypt timestamp=1489324693470, value=950887757EDFFFDE26E9961E8998591A
index:city timestamp=1489324693470, value=\xE9\x95\x87\xE6\xB1\x9F
index:ffan_enrollment_time timestamp=1489324693470, value=2016-10-29 08:54:13
index:from_*** timestamp=1489324693470, value=0
index:from_child timestamp=1489324693470, value=0
index:from_** timestamp=1489324693470, value=0
index:from_** timestamp=1489324693470, value=1
index:from_*** timestamp=1489324693470, value=0
index:from_*** timestamp=1489324693470, value=0
index:from_*** timestamp=1489324693470, value=0
index:from_theme timestamp=1489324693470, value=0
index:from_travel timestamp=1489324693470, value=0
index:mobile_encrypt timestamp=1489324693470, value=BAC3510A922CF026500874EA3975E123
index:*** timestamp=1489324693470, value=\xE6\xB1\x9F\xE8\x8B\x8F
index:*** timestamp=1489324693470, value=\xE6\xB1\x9F\xE8\x8B\x8F
15 row(s) in 0.3020 seconds
The test passed.
If hbase hbck reports corrupted files along the way, you can use the following hdfs command to check the files that belong to the region:
hdfs fsck /hbase/data/mynamespace/tablename/0250284baa6119d676821e86cfaa29f4/index/f142db2e1d844d48858ee2d919299ca0 -locations -blocks -files
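The path in this command follows the pattern root/data/namespace/table/encoded-region-name/column-family/file. A minimal sketch that builds the region directory from its parts so the whole region can be checked at once; region_hdfs_path and the HBASE_ROOT variable are hypothetical names, and /hbase is assumed as the root only because the command above uses it (hbase.rootdir may differ on other clusters):

```shell
#!/bin/sh
# Hypothetical helper: prints the HDFS directory of a region.
# Arguments: namespace, table name, encoded region name.
# HBASE_ROOT is an illustrative override; it defaults to /hbase.
region_hdfs_path() {
  printf '%s/data/%s/%s/%s\n' "${HBASE_ROOT:-/hbase}" "$1" "$2" "$3"
}

# Example: check every file under the region instead of a single store file.
# hdfs fsck "$(region_hdfs_path mynamespace tablename 0250284baa6119d676821e86cfaa29f4)" \
#   -files -blocks -locations
```

Pointing hdfs fsck at the directory walks all column families and store files beneath it, which helps when it is not yet known which file is damaged.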
I will analyze why this happened in a follow-up post.