HBase: Changing the TTL Attribute to Free Up Space
40. Time To Live (TTL)
ColumnFamilies can set a TTL length in seconds, and HBase will automatically delete rows once the expiration time is reached. This applies to all versions of a row - even the current one. The TTL time encoded in the HBase for the row is specified in UTC.
Store files which contain only expired rows are deleted on minor compaction. Setting hbase.store.delete.expired.storefile to false disables this feature. Setting the minimum number of versions to a value other than 0 also disables it. See HColumnDescriptor for more information.
Recent versions of HBase also support setting time to live on a per cell basis. See HBASE-10560 for more information. Cell TTLs are submitted as an attribute on mutation requests (Appends, Increments, Puts, etc.) using Mutation#setTTL. If the TTL attribute is set, it will be applied to all cells updated on the server by the operation. There are two notable differences between cell TTL handling and ColumnFamily TTLs:
Cell TTLs are expressed in units of milliseconds instead of seconds.
A cell TTL cannot extend the effective lifetime of a cell beyond a ColumnFamily level TTL setting.
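Differences between the cell TTL and the column family TTL:
- The column family TTL is expressed in seconds; the cell TTL is expressed in milliseconds.
- If a cell-level TTL is set, it overrides the column family TTL for that cell, but it cannot extend the cell's lifetime beyond the column family TTL.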
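The two rules above (different units, and the family TTL acting as an upper bound) can be sketched as a small shell calculation. This is only an illustration of the arithmetic, not HBase code: the function name effective_ttl_ms and its argument order are made up for this example.

```shell
# Hypothetical helper (not part of HBase): effective lifetime of a cell in milliseconds.
#   $1 = column family TTL in seconds, or FOREVER
#   $2 = cell TTL in milliseconds, or empty if no cell TTL was set
effective_ttl_ms() {
    cf_ttl_s=$1
    cell_ttl_ms=$2
    if [ "${cf_ttl_s}" = "FOREVER" ]; then
        # No family limit: only the cell TTL (if any) applies.
        echo "${cell_ttl_ms:-FOREVER}"
        return
    fi
    # Convert the family TTL to the unit used by cell TTLs.
    cf_ttl_ms=$(( cf_ttl_s * 1000 ))
    if [ -z "${cell_ttl_ms}" ] || [ "${cell_ttl_ms}" -gt "${cf_ttl_ms}" ]; then
        # No cell TTL, or a cell TTL longer than the family TTL: the family TTL wins.
        echo "${cf_ttl_ms}"
    else
        echo "${cell_ttl_ms}"
    fi
}

effective_ttl_ms 600 60000       # cell TTL shorter than the family TTL
effective_ttl_ms 600 900000      # cell TTL longer: capped by the family TTL
effective_ttl_ms FOREVER 60000   # no family TTL: the cell TTL applies
```

With a family TTL of 600 seconds, a 60000 ms cell TTL shortens the lifetime to one minute, while a 900000 ms cell TTL is capped at 600000 ms.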
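The content above comes from the official Apache HBase website and is provided for reference. The rest of this post tries it out in practice.
Create the table: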
create 'dc:event',{NAME => 'f1'},{NAME => 'cf'},{NAME => 'f2'}
View the table structure:
desc "dc:event"
'dc:event',
{NAME => 'cf', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
{NAME => 'f1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
{NAME => 'f2', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
Put some data:
put 'dc:event','866925023233621','f1:eventid','866925023233621'
put 'dc:event','866925023233622','f1:eventid','866925023233621'
put 'dc:event','866925023233623','f1:eventid','866925023233621'
put 'dc:event','866925023233624','f1:eventid','866925023233621'
put 'dc:event','866925023233625','f1:eventid','866925023233621'
put 'dc:event','866925023233626','f1:eventid','866925023233621'
put 'dc:event','866925023233627','f1:eventid','866925023233621'
put 'dc:event','866925023233628','f1:eventid','866925023233621'
put 'dc:event','866925023233629','f1:eventid','866925023233621'
put 'dc:event','866925023233630','f1:eventid','866925023233621'
put 'dc:event','8669250232336-21','cf:eventid','866925023233621'
put 'dc:event','8669250232336-22','cf:eventid','866925023233621'
put 'dc:event','8669250232336-23','cf:eventid','866925023233621'
put 'dc:event','8669250232336-24','cf:eventid','866925023233621'
put 'dc:event','8669250232336-25','cf:eventid','866925023233621'
put 'dc:event','8669250232336-26','cf:eventid','866925023233621'
put 'dc:event','8669250232336-27','cf:eventid','866925023233621'
put 'dc:event','8669250232336-28','cf:eventid','866925023233621'
put 'dc:event','8669250232336-29','cf:eventid','866925023233621'
put 'dc:event','8669250232336-30','cf:eventid','866925023233621'
put 'dc:event','866925023233-6-21','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-22','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-23','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-24','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-25','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-26','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-27','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-28','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-29','f2:eventid','866925023233621'
put 'dc:event','866925023233-6-30','f2:eventid','866925023233621'
scan 'dc:event'
hbase(main):048:0> scan 'dc:event'
ROW COLUMN+CELL
866925023233-6-21 column=f2:eventid, timestamp=1536805384815, value=866925023233621
866925023233-6-22 column=f2:eventid, timestamp=1536805384873, value=866925023233621
866925023233-6-23 column=f2:eventid, timestamp=1536805384881, value=866925023233621
866925023233-6-24 column=f2:eventid, timestamp=1536805384890, value=866925023233621
866925023233-6-25 column=f2:eventid, timestamp=1536805384898, value=866925023233621
866925023233-6-26 column=f2:eventid, timestamp=1536805384907, value=866925023233621
866925023233-6-27 column=f2:eventid, timestamp=1536805384922, value=866925023233621
866925023233-6-28 column=f2:eventid, timestamp=1536805384936, value=866925023233621
866925023233-6-29 column=f2:eventid, timestamp=1536805384946, value=866925023233621
866925023233-6-30 column=f2:eventid, timestamp=1536805384958, value=866925023233621
8669250232336-21 column=cf:eventid, timestamp=1536805310816, value=866925023233621
8669250232336-22 column=cf:eventid, timestamp=1536805310850, value=866925023233621
8669250232336-23 column=cf:eventid, timestamp=1536805310861, value=866925023233621
8669250232336-24 column=cf:eventid, timestamp=1536805310870, value=866925023233621
8669250232336-25 column=cf:eventid, timestamp=1536805310881, value=866925023233621
8669250232336-26 column=cf:eventid, timestamp=1536805310890, value=866925023233621
8669250232336-27 column=cf:eventid, timestamp=1536805310911, value=866925023233621
8669250232336-28 column=cf:eventid, timestamp=1536805310918, value=866925023233621
8669250232336-29 column=cf:eventid, timestamp=1536805310930, value=866925023233621
8669250232336-30 column=cf:eventid, timestamp=1536805310937, value=866925023233621
866925023233621 column=f1:eventid, timestamp=1536805258985, value=866925023233621
866925023233622 column=f1:eventid, timestamp=1536805259053, value=866925023233621
866925023233623 column=f1:eventid, timestamp=1536805259060, value=866925023233621
866925023233624 column=f1:eventid, timestamp=1536805259070, value=866925023233621
866925023233625 column=f1:eventid, timestamp=1536805259078, value=866925023233621
866925023233626 column=f1:eventid, timestamp=1536805259084, value=866925023233621
866925023233627 column=f1:eventid, timestamp=1536805259112, value=866925023233621
866925023233628 column=f1:eventid, timestamp=1536805259119, value=866925023233621
866925023233629 column=f1:eventid, timestamp=1536805259127, value=866925023233621
866925023233630 column=f1:eventid, timestamp=1536805259143, value=866925023233621
30 row(s) in 0.0920 seconds
Next, set the TTL value:
1. disable 'dc:event'
2. alter 'dc:event', NAME => 'cf', TTL => 600
   alter 'dc:event', NAME => 'f1', TTL => 600
3. enable 'dc:event'
4. scan 'dc:event'
ROW COLUMN+CELL
866925023233-6-21 column=f2:eventid, timestamp=1536805384815, value=866925023233621
866925023233-6-22 column=f2:eventid, timestamp=1536805384873, value=866925023233621
866925023233-6-23 column=f2:eventid, timestamp=1536805384881, value=866925023233621
866925023233-6-24 column=f2:eventid, timestamp=1536805384890, value=866925023233621
866925023233-6-25 column=f2:eventid, timestamp=1536805384898, value=866925023233621
866925023233-6-26 column=f2:eventid, timestamp=1536805384907, value=866925023233621
866925023233-6-27 column=f2:eventid, timestamp=1536805384922, value=866925023233621
866925023233-6-28 column=f2:eventid, timestamp=1536805384936, value=866925023233621
866925023233-6-29 column=f2:eventid, timestamp=1536805384946, value=866925023233621
866925023233-6-30 column=f2:eventid, timestamp=1536805384958, value=866925023233621
10 row(s) in 0.0740 seconds
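Of the table's three column families (cf, f1, f2), a TTL was set only on cf and f1. Once the TTL has elapsed, the data in cf and f1 is cleared automatically, while the data in f2 is still there because no TTL was set on it. Expired cells stop appearing in scans right away; the disk space itself is reclaimed when the store files are compacted.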
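Whether a given row should already have disappeared can be estimated from the timestamp printed by scan (milliseconds) and the family TTL (seconds). The following is a rough sketch of that arithmetic only; the helper names expiry_ms and is_expired are invented for this example, and the exact boundary handling inside HBase may differ.

```shell
# Hypothetical helpers for estimating cell expiry; not part of HBase.

# expiry_ms <cell timestamp in ms> <family TTL in seconds>
# Prints the time (ms since the epoch) after which the cell counts as expired.
expiry_ms() {
    echo $(( $1 + $2 * 1000 ))
}

# is_expired <cell timestamp in ms> <family TTL in seconds> <current time in ms>
# Succeeds (exit status 0) if the cell is older than the TTL at the given time.
is_expired() {
    [ "$3" -gt "$(expiry_ms "$1" "$2")" ]
}

# First cf row from the scan above, with TTL => 600:
expiry_ms 1536805310816 600

# Check against the current time (seconds from date, converted to ms):
now_ms=$(( $(date +%s) * 1000 ))
if is_expired 1536805310816 600 "${now_ms}"; then
    echo "expired"
else
    echo "still alive"
fi
```

For the first cf row this gives 1536805910816, that is, ten minutes after the put.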
Comparison of the table's TTL before and after the change:
Shell script for changing the HBase TTL:
#!/bin/bash -l
# Decide in advance whether this step may need to be rolled back.
# If so, record the current TTL of the corresponding production tables first:
# a rollback restores the TTL setting, but data that has already expired cannot be restored.
WB_DIR=$(cd "$(dirname "$0")" && pwd)
HBASE_NAMESPACE='hochoy'
origin_tables="tabTest1 tabTest2 tabTest3"
alter_ttl="alter_hbase.script"

# Convert a number of years into a TTL in seconds (1 year = 365 days).
get_ttl_value(){
    local years=${1}
    local ttl
    ttl=$(echo "scale = 0; 60 * 60 * 24 * 365 * ${years}" | bc)
    # Strip the fractional part that bc prints for non-integer input.
    echo "${ttl%.*}"
}

# Generate the HBase shell script that changes the TTL of column family 'f' in every table.
gen_alt_script(){
    local ttl=${1}
    local script="${WB_DIR}/${alter_ttl}"
    : > "${script}"
    for table in ${origin_tables}
    do
        {
            echo "desc '${HBASE_NAMESPACE}:${table}'"
            echo "disable '${HBASE_NAMESPACE}:${table}'"
            echo "alter '${HBASE_NAMESPACE}:${table}', {NAME=>'f',TTL=>${ttl}}"
            echo "enable '${HBASE_NAMESPACE}:${table}'"
            echo "desc '${HBASE_NAMESPACE}:${table}'"
        } >> "${script}"
    done
    echo "exit" >> "${script}"
}

if [ $# -lt 1 ]; then
    echo "Usage: $0 <years|FOREVER>"
    echo "Input value of TTL please!"
    exit 1
fi

if [ "${1}" = "FOREVER" ]; then
    gen_alt_script FOREVER
else
    ttl=$(get_ttl_value "${1}")
    gen_alt_script "${ttl}"
fi

# Show the generated script, then run it in the HBase shell.
cat "${WB_DIR}/${alter_ttl}"
hbase shell "${WB_DIR}/${alter_ttl}"
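The script above passes its first argument straight into bc, so a typo would produce an empty or wrong TTL in the generated alter statements. One possible safeguard is to validate the argument first. The sketch below is an optional addition, not part of the original script; the function name is_valid_ttl_arg is hypothetical.

```shell
# Hypothetical helper: accept FOREVER or a positive number of years (e.g. 1, 3, 0.5).
is_valid_ttl_arg() {
    case "$1" in
        FOREVER)
            return 0 ;;
        ''|*[!0-9.]*|*.*.*|.*|*.)
            # Empty, non-numeric characters, more than one dot,
            # or a leading/trailing dot.
            return 1 ;;
    esac
    case "$1" in
        *[1-9]*) return 0 ;;   # at least one non-zero digit, so not 0 or 0.0
        *)       return 1 ;;
    esac
}

# Example use before calling get_ttl_value in the script:
for arg in FOREVER 1 0.5 0 abc; do
    if is_valid_ttl_arg "${arg}"; then
        echo "${arg}: ok"
    else
        echo "${arg}: rejected"
    fi
done
```

In the script, such a check would sit right after the usage test, exiting with a non-zero status when the argument is rejected.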