RegionServer fails to start after migrating an HBase snapshot to a new cluster, reporting "Failed open of region"
Error log excerpt:
2018-03-12 17:05:29,608 ERROR [RS_OPEN_REGION-our_ambari_clustergn-a05044c6-core-1-003:16020-15] handler.OpenRegionHandler: Failed open of region=market:KYLIN_YEDCQ82BF3,16F87C792E626990D57DDABF161A3B4E,1519847061679.736e53b17b220aed9aa9233ddffd952a., starting to roll back the global memstore size.
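【Background】Both the old and the new cluster run HBase 1.1.2; the old cluster uses Hadoop 2.7.1 and the new cluster uses Hadoop 2.7.3.
All the other tables were copied over with hadoop distcp; after the copy, repairing the metadata and assigning the table data to the appropriate RegionServers was enough to make them work.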
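For reference, the distcp route used for the other tables might look roughly like the sketch below. It only prints the commands rather than running them. The NameNode addresses, the table name, the `default` namespace and the helper name `plan_table_copy` are placeholders I made up for illustration; the `hbck` repair flags are the ones available in HBase 1.x. Note that where distcp puts the files depends on whether the target directory already exists, so check the result before running `hbck`.

```shell
#!/usr/bin/env bash
# Sketch only: prints the commands instead of executing them.
set -euo pipefail

HBASE_ROOT="/apps/hbase/data"   # HBase root dir as seen in the listings in this post

# plan_table_copy OLD_NN NEW_NN TABLE
# OLD_NN / NEW_NN are hypothetical NameNode URIs, TABLE a table in the default namespace.
plan_table_copy() {
  local old_nn="$1" new_nn="$2" table="$3"
  # 1. copy the table directory from the old cluster into the new cluster
  echo "hadoop distcp ${old_nn}${HBASE_ROOT}/data/default/${table} ${new_nn}${HBASE_ROOT}/data/default/"
  # 2. rebuild the hbase:meta entries and assign the regions (HBase 1.x hbck)
  echo "hbase hbck -fixMeta -fixAssignments ${table}"
}

# Example with placeholder values
plan_table_copy "hdfs://old-nn:8020" "hdfs://new-nn:8020" "SOME_TABLE"
```

Review the two printed lines first; once they look right for your environment, run them by hand as a user that is allowed to write under the HBase root directory.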
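Solution:
【1】Delete the ZooKeeper nodes for this table on the new cluster; 【2】remove the data related to this table from HDFS on the new cluster; 【3】restart all RegionServers on the new cluster.
【1】Delete the ZooKeeper nodes for this table on the new cluster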
[zk: localhost:2181(CONNECTED) 2] ls /hbase/table
[ksai:usertb, hbase:meta, ksai:wps_pc_active_user_domain_info, KYLIN_YEDCQ82BF3, hbase:namespace, ksai:weekly-installed-android-apps, ksai:wps_android_active_user_domain_info, ksai:test_zz]
[zk: localhost:2181(CONNECTED) 4] get /hbase/table/KYLIN_YEDCQ82BF3
�master:16000ڐ����APBUF
cZxid = 0x3000b1907
ctime = Fri Mar 16 11:31:41 CST 2018
mZxid = 0x3000b7432
mtime = Fri Mar 16 15:17:38 CST 2018
pZxid = 0x3000b1907
cversion = 0
dataVersion = 12
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 31
numChildren = 0
[zk: localhost:2181(CONNECTED) 5] rmr /hbase/table/KYLIN_YEDCQ82BF3
[zk: localhost:2181(CONNECTED) 6] ls /hbase/table/KYLIN_YEDCQ82BF3
Node does not exist: /hbase/table/KYLIN_YEDCQ82BF3
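The same ZooKeeper cleanup can be scripted instead of typed into the interactive prompt. The sketch below only prints the `zkCli.sh` invocations (a dry run). It assumes `zkCli.sh` is on the PATH and that the cluster runs ZooKeeper 3.4, where `rmr` is available (newer releases use `deleteall` instead). The helper name `zk_delete_plan` and the `ZK_SERVER` / `ZK_PARENT` variables are my own; the defaults match the transcript above, but the znode parent depends on `zookeeper.znode.parent` in your cluster.

```shell
#!/usr/bin/env bash
# Sketch only: prints the zkCli.sh commands for removing one table znode.
set -euo pipefail

ZK_SERVER="${ZK_SERVER:-localhost:2181}"   # address used in the transcript
ZK_PARENT="${ZK_PARENT:-/hbase}"           # znode parent used in the transcript

zk_delete_plan() {
  local table="$1"
  local znode="${ZK_PARENT}/table/${table}"
  echo "zkCli.sh -server ${ZK_SERVER} get ${znode}"            # inspect before deleting
  echo "zkCli.sh -server ${ZK_SERVER} rmr ${znode}"            # recursive delete (ZooKeeper 3.4)
  echo "zkCli.sh -server ${ZK_SERVER} ls ${ZK_PARENT}/table"   # confirm it is gone
}

# Dry run for the table from this post
zk_delete_plan "KYLIN_YEDCQ82BF3"
```

Printing first and executing later is deliberate: deleting a znode is not reversible, so it is worth reading the exact path before running anything.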
【2】Remove the data related to this table from HDFS on the new cluster
[hdfs@our_ambari_clustergn-a05044c6-master-1-001 root]$ hdfs dfs -ls -R /apps/hbase/ | grep --color KYLIN_YEDCQ82BF3
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:09 /apps/hbase/data/.hbase-snapshot/KYLIN_YEDCQ82BF3_snapshot_20180302
-rw-r--r-- 3 hbase hdfs 65 2018-03-12 15:09 /apps/hbase/data/.hbase-snapshot/KYLIN_YEDCQ82BF3_snapshot_20180302/.snapshotinfo
-rw-r--r-- 3 hbase hdfs 863 2018-03-12 15:09 /apps/hbase/data/.hbase-snapshot/KYLIN_YEDCQ82BF3_snapshot_20180302/data.manifest
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:10 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1/.links-9cdedadf1d1e4d36ac92b8f2b7a79432
-rw-r--r-- 3 hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1/.links-9cdedadf1d1e4d36ac92b8f2b7a79432/ea63ee219b00bf26b8ecdefcf244738f.KYLIN_YEDCQ82BF3
-rw-r--r-- 3 hbase hdfs 20061969 2018-03-12 15:10 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1/9cdedadf1d1e4d36ac92b8f2b7a79432
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/.tabledesc
-rw-r--r-- 3 hbase hdfs 767 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/.tabledesc/.tableinfo.0000000005
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/.tmp
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f
-rw-r--r-- 3 hbase hdfs 51 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/.regioninfo
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1
-rw-r--r-- 3 hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1/KYLIN_YEDCQ82BF3=ea63ee219b00bf26b8ecdefcf244738f-9cdedadf1d1e4d36ac92b8f2b7a79432
# This snapshot directory contained only this one snapshot, so to save effort I simply deleted its parent directory
[hdfs@our_ambari_clustergn-a05044c6-master-1-001 root]$ hdfs dfs -rm -R /apps/hbase/data/.hbase-snapshot
18/03/16 17:23:12 INFO fs.TrashPolicyDefault: Moved: 'hdfs://tony_hdfs_ha/apps/hbase/data/.hbase-snapshot' to trash at: hdfs://tony_hdfs_ha/user/hdfs/.Trash/Current/apps/hbase/data/.hbase-snapshot
# Then list the remaining HDFS directories that contain this table name and delete them
[hdfs@our_ambari_clustergn-a05044c6-master-1-001 root]$ hdfs dfs -ls -R /apps/hbase/ | grep --color KYLIN_YEDCQ82BF3
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:10 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1/.links-9cdedadf1d1e4d36ac92b8f2b7a79432
-rw-r--r-- 3 hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1/.links-9cdedadf1d1e4d36ac92b8f2b7a79432/ea63ee219b00bf26b8ecdefcf244738f.KYLIN_YEDCQ82BF3
-rw-r--r-- 3 hbase hdfs 20061969 2018-03-12 15:10 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1/9cdedadf1d1e4d36ac92b8f2b7a79432
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/.tabledesc
-rw-r--r-- 3 hbase hdfs 767 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/.tabledesc/.tableinfo.0000000005
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/.tmp
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f
-rw-r--r-- 3 hbase hdfs 51 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/.regioninfo
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1
-rw-r--r-- 3 hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1/KYLIN_YEDCQ82BF3=ea63ee219b00bf26b8ecdefcf244738f-9cdedadf1d1e4d36ac92b8f2b7a79432
# Continue deleting the HDFS directories that contain this table name
[hdfs@our_ambari_clustergn-a05044c6-master-1-001 root]$ hdfs dfs -rm -R /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3
18/03/16 17:23:36 INFO fs.TrashPolicyDefault: Moved: 'hdfs://tony_hdfs_ha/apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3' to trash at: hdfs://tony_hdfs_ha/user/hdfs/.Trash/Current/apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3
# List the remaining HDFS directories that contain this table name again and delete them
[hdfs@our_ambari_clustergn-a05044c6-master-1-001 root]$ hdfs dfs -ls -R /apps/hbase/ | grep --color KYLIN_YEDCQ82BF3
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/.tabledesc
-rw-r--r-- 3 hbase hdfs 767 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/.tabledesc/.tableinfo.0000000005
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/.tmp
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f
-rw-r--r-- 3 hbase hdfs 51 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/.regioninfo
drwxr-xr-x - hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1
-rw-r--r-- 3 hbase hdfs 0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1/KYLIN_YEDCQ82BF3=ea63ee219b00bf26b8ecdefcf244738f-9cdedadf1d1e4d36ac92b8f2b7a79432
# Continue deleting the HDFS directories that contain this table name
[hdfs@our_ambari_clustergn-a05044c6-master-1-001 root]$ hdfs dfs -rm -R /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3
18/03/16 17:23:58 INFO fs.TrashPolicyDefault: Moved: 'hdfs://tony_hdfs_ha/apps/hbase/data/data/default/KYLIN_YEDCQ82BF3' to trash at: hdfs://tony_hdfs_ha/user/hdfs/.Trash/Current/apps/hbase/data/data/default/KYLIN_YEDCQ82BF3
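Rather than reading the long `hdfs dfs -ls -R` listing by eye, the table's top-level directories can be picked out with a small filter. The sketch below reads the listing on stdin and prints only the directories whose last path component is exactly the table name. The function name `table_dirs` is my own, and it assumes the paths contain no spaces. It does not match the snapshot directory (`KYLIN_YEDCQ82BF3_snapshot_20180302`), which has to be handled separately.

```shell
#!/usr/bin/env bash
# Sketch only: a text filter, it does not delete anything by itself.
set -euo pipefail

# table_dirs TABLE
# stdin: output of `hdfs dfs -ls -R <dir>`
# stdout: directories whose last path component equals TABLE
table_dirs() {
  local table="$1"
  awk -v t="$table" '
    /^d/ {                          # keep directory entries only
      path = $NF                    # the path is the last field (assumes no spaces)
      n = split(path, parts, "/")
      if (parts[n] == t) print path
    }'
}

# Usage (prints one rm command per matching directory, for review):
#   hdfs dfs -ls -R /apps/hbase/ | table_dirs KYLIN_YEDCQ82BF3 | sed 's|^|hdfs dfs -rm -R |'
```

For the listing shown earlier this should give the `archive/data/default/KYLIN_YEDCQ82BF3` and `data/data/default/KYLIN_YEDCQ82BF3` directories, which are the two I removed by hand. As the transcript shows, `hdfs dfs -rm -R` moved them to the trash here, so they could still be recovered from `.Trash` if the wrong path was removed.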
【3】Restart all RegionServers on the new cluster
# After restarting the HBase RegionServers from Ambari, the table is no longer in ZooKeeper
[zk: localhost:2181(CONNECTED) 11] ls /hbase/table
[ksai:usertb, hbase:meta, ksai:wps_pc_active_user_domain_info, hbase:namespace, ksai:weekly-installed-android-apps, ksai:wps_android_active_user_domain_info, ksai:test_zz]
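# However, the table could still be seen in hbase shell, so I restarted the whole HBase cluster; when I checked in hbase shell again, it was gone. All the RegionServers were also able to start successfully.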
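The "is the table still listed in ZooKeeper?" check can also be automated. The sketch below parses the bracketed list that `ls /hbase/table` prints and reports whether a given table name is in it. The function name `znode_list_has` is my own, and the usage line assumes `zkCli.sh` is on the PATH. It only covers ZooKeeper; as noted above, hbase shell may still show the table until the whole cluster has been restarted.

```shell
#!/usr/bin/env bash
# Sketch only: checks the text output of a zkCli "ls", it does not talk to ZooKeeper itself.
set -euo pipefail

# znode_list_has TABLE
# stdin: output of `ls /hbase/table`, e.g. "[ksai:usertb, hbase:meta, ...]"
# exit status 0 if TABLE is one of the listed children, non-zero otherwise
znode_list_has() {
  local table="$1"
  # drop brackets and spaces, put one child per line, then match the whole line
  tr -d '[] ' | tr ',' '\n' | grep -x -- "$table" >/dev/null
}

# Usage:
#   if zkCli.sh -server localhost:2181 ls /hbase/table | znode_list_has KYLIN_YEDCQ82BF3; then
#     echo "table znode still present"
#   else
#     echo "table znode is gone"
#   fi
```

Matching the whole line (`grep -x`) keeps a name such as `KYLIN_YEDCQ82BF3` from being confused with a longer name that merely contains it.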
【Follow-up】If you have any thoughts after reading this, leave me a comment so we can discuss what happened here.