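Big Data Notes (13): Common NoSQL Databases, HBase (A)
I. HBase table structure and architecture
1. HBase table structure
All of the data is stored in a single wide table. HBase gives up table space (sparse, denormalized storage) in exchange for good performance.
Columns in HBase are grouped into column families. Each column family contains a number of columns.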
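To make the column-family layout concrete, here is a minimal Java sketch that models one table as nested sorted maps: row key, then column family, then column qualifier, then value. It is only an illustration of the logical layout, not HBase's storage code. The class name WideTableSketch and its methods are hypothetical, and the timestamp version that real HBase attaches to every cell is left out.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of HBase's logical layout (not the HBase API).
public class WideTableSketch {
    // rowKey -> column family -> column qualifier -> value
    private final Map<String, Map<String, Map<String, String>>> rows = new TreeMap<>();

    // Split "family:qualifier" into its two parts.
    private static String[] split(String column) {
        String[] parts = column.split(":", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected family:qualifier, got " + column);
        }
        return parts;
    }

    // Store one cell, addressed as row key, "family:qualifier", value.
    public void put(String rowKey, String column, String value) {
        String[] parts = split(column);
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(parts[0], k -> new TreeMap<>())
            .put(parts[1], value);
    }

    // Return one cell, or null if the row, family, or qualifier is absent.
    public String get(String rowKey, String column) {
        String[] parts = split(column);
        Map<String, Map<String, String>> families = rows.get(rowKey);
        if (families == null) {
            return null;
        }
        Map<String, String> qualifiers = families.get(parts[0]);
        return qualifiers == null ? null : qualifiers.get(parts[1]);
    }

    public static void main(String[] args) {
        WideTableSketch students = new WideTableSketch();
        students.put("stu001", "info:name", "Tom");
        students.put("stu001", "grade:math", "85");
        students.put("stu002", "info:name", "Mary");
        System.out.println(students.get("stu001", "info:name"));
        // stu002 has no grade:math cell; a missing cell costs nothing to store.
        System.out.println(students.get("stu002", "grade:math"));
    }
}
```

Rows do not need to share the same columns, which is what lets a single wide table hold everything.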
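2. HBase architecture
Master/slave structure:
Master node: HMaster
Slave nodes: RegionServer. Each RegionServer hosts multiple Regions. A Region is a contiguous range of rows of one table; inside a Region, each column family is kept in its own Store.
HBase keeps the following data in ZooKeeper (ZK):
(*) Configuration information and HBase cluster structure information
(*) Table metadata
(*) The state needed to implement HBase HA (high availability)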
II. Setting up HBase in local mode and pseudo-distributed mode
1. Extract the archive:
tar -zxvf hbase-1.3.1-bin.tar.gz -C ~/training/
2. Set the environment variables: vi ~/.bash_profile
HBASE_HOME=/root/training/hbase-1.3.1
export HBASE_HOME
PATH=$HBASE_HOME/bin:$PATH
export PATH
Apply the changes: source ~/.bash_profile
Local mode: HDFS is not needed; the data is stored directly on the local file system.
hbase-env.sh
export JAVA_HOME=/root/training/jdk1.8.0_144
hbase-site.xml
<property>
  <name>hbase.rootdir</name>
  <value>file:///root/training/hbase-1.3.1/data</value>
</property>
Pseudo-distributed mode
hbase-env.sh: add the following line to use the ZooKeeper instance bundled with HBase
export HBASE_MANAGES_ZK=true
hbase-site.xml: delete the local-mode property and add the following configuration
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://192.168.153.11:9000/hbase</value>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <!-- ZooKeeper address -->
  <name>hbase.zookeeper.quorum</name>
  <value>192.168.153.11</value>
</property>
<property>
  <!-- Data replication factor -->
  <name>dfs.replication</name>
  <value>1</value>
</property>
regionservers
192.168.153.11
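The running instance can be checked in the web console.
III. Setting up HBase in fully distributed mode and HA
In PuTTY, synchronize the clocks of bigdata12, bigdata13 and bigdata14 by running on each host: date -s 2018-03-10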
Master node: hbase-site.xml
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://192.168.153.12:9000/hbase</value>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>192.168.153.12</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
<property>
  <!-- Works around unsynchronized clocks: maximum allowed clock skew, in milliseconds -->
  <name>hbase.master.maxclockskew</name>
  <value>180000</value>
</property>
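The hbase.master.maxclockskew value above is given in milliseconds, so 180000 lets a RegionServer's clock differ from the master's by up to 3 minutes. The Java sketch below shows that arithmetic and the kind of comparison involved. ClockSkewSketch and withinSkew are hypothetical names, and the check is a simplification for illustration, not HBase's actual implementation.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a clock-skew limit (not HBase code).
public class ClockSkewSketch {
    // Value of hbase.master.maxclockskew from the configuration above, in milliseconds.
    public static final long MAX_CLOCK_SKEW_MS = 180000L;

    // True if the two clocks differ by no more than the allowed skew.
    public static boolean withinSkew(long masterMillis, long regionServerMillis, long maxSkewMs) {
        return Math.abs(masterMillis - regionServerMillis) <= maxSkewMs;
    }

    public static void main(String[] args) {
        // 180000 ms expressed in minutes.
        System.out.println(TimeUnit.MILLISECONDS.toMinutes(MAX_CLOCK_SKEW_MS) + " minutes");

        long master = System.currentTimeMillis();
        // A RegionServer one minute ahead is inside the limit.
        System.out.println(withinSkew(master, master + 60_000L, MAX_CLOCK_SKEW_MS));
        // A RegionServer 200 seconds behind is outside the limit.
        System.out.println(withinSkew(master, master - 200_000L, MAX_CLOCK_SKEW_MS));
    }
}
```

Raising the limit only hides the drift; setting the clocks with date -s, as done earlier, addresses the cause.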
regionservers
192.168.153.13
192.168.153.14
Copy the installation to bigdata13 and bigdata14:
scp -r hbase-1.3.1/ root@bigdata13:/root/training
scp -r hbase-1.3.1/ root@bigdata14:/root/training
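IV. Data HBase stores in ZooKeeper and the HA implementation
HA implementation:
No extra configuration is needed; just start a second HMaster on its own on one of the slave nodes.
On bigdata13: hbase-daemon.sh start master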
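The reason no extra configuration is needed is that the masters coordinate through ZooKeeper: the first HMaster to register becomes the active one, and any other HMaster waits as a backup until the active registration disappears. The sketch below imitates that idea with a plain field standing in for the ZooKeeper node. It is a simplified illustration under that assumption, and MasterElectionSketch, tryBecomeActive, fail and active are hypothetical names, not HBase or ZooKeeper APIs.

```java
// Hypothetical model of active/backup master election (not HBase or ZooKeeper code).
public class MasterElectionSketch {
    // Stand-in for the ZooKeeper node naming the active master; null means none.
    private String activeMaster;

    // A master becomes active only if no other master currently holds the slot.
    public synchronized boolean tryBecomeActive(String host) {
        if (activeMaster != null) {
            return false;
        }
        activeMaster = host;
        return true;
    }

    // Simulate the active master dying: its registration disappears.
    public synchronized void fail(String host) {
        if (host.equals(activeMaster)) {
            activeMaster = null;
        }
    }

    public synchronized String active() {
        return activeMaster;
    }

    public static void main(String[] args) {
        MasterElectionSketch zk = new MasterElectionSketch();
        System.out.println(zk.tryBecomeActive("bigdata12")); // first master wins
        System.out.println(zk.tryBecomeActive("bigdata13")); // second master stays backup
        zk.fail("bigdata12");                                 // active master goes away
        System.out.println(zk.tryBecomeActive("bigdata13")); // backup takes over
        System.out.println(zk.active());
    }
}
```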
V. Working with HBase
1. Web console: port 16010
2. Command line
Start HBase: start-hbase.sh
Open the HBase shell: hbase shell
Create a table:
hbase(main):001:0> create 'students','info','grade'    # create the table
0 row(s) in 1.7020 seconds

=> Hbase::Table - students
hbase(main):002:0> desc 'students'    # show the table structure
Table students is ENABLED
students
COLUMN FAMILIES DESCRIPTION
{NAME => 'grade', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.2540 seconds

hbase(main):003:0> describe 'students'
Table students is ENABLED
students
COLUMN FAMILIES DESCRIPTION
{NAME => 'grade', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.0240 seconds
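Difference between desc and describe:
In the HBase shell there is none: desc is an alias for describe, and both print the same description.
(The distinction comes from Oracle, where DESCRIBE, abbreviated DESC, is a SQL*Plus command rather than a SQL statement.)
Structure of the students table: two column families, grade and info, both with the default attributes (for example VERSIONS => '1', so only the latest version of each cell is kept).
List the existing tables: list
Insert data: put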
put 'students','stu001','info:name','Tom'
put 'students','stu001','info:age','24'
put 'students','stu001','grade:math','85'
put 'students','stu002','info:name','Mary'
put 'students','stu002','info:age','28'
Query data:
scan: equivalent to select * from students, for example scan 'students'
get: equivalent to select * from students where rowkey=??, for example get 'students','stu001'
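HBase keeps rows sorted by row key, which is why get is a point lookup on one key while scan walks rows in key order. A minimal sketch of that difference, using a java.util.TreeMap as a stand-in for the table. ScanVsGetSketch is a hypothetical class for illustration and does not use the HBase client; the start-inclusive, stop-exclusive range follows the default behavior of an HBase scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for a table with a single info:name column (not the HBase API).
public class ScanVsGetSketch {
    // Rows kept sorted by row key; for ASCII keys String order matches byte order.
    private final NavigableMap<String, String> rows = new TreeMap<>();

    public void put(String rowKey, String name) {
        rows.put(rowKey, name);
    }

    // get: look up exactly one row by its key; null if the row does not exist.
    public String get(String rowKey) {
        return rows.get(rowKey);
    }

    // scan: rows in key order from startRow (inclusive) to stopRow (exclusive).
    public List<String> scan(String startRow, String stopRow) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : rows.subMap(startRow, true, stopRow, false).entrySet()) {
            out.add(e.getKey() + "=" + e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        ScanVsGetSketch students = new ScanVsGetSketch();
        // Insertion order does not matter; rows come back sorted by key.
        students.put("stu002", "Mary");
        students.put("stu001", "Tom");
        System.out.println(students.get("stu001"));
        System.out.println(students.scan("stu001", "stu003"));
    }
}
```

Because lookups and range reads both depend on the row key, choosing that key is the main design decision for an HBase table.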
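Clearing the data in a table
The comparison of delete and truncate below comes from relational databases such as Oracle:
1. delete is DML (can be rolled back); truncate is DDL (cannot be rolled back)
Note: DDL: data definition language, e.g. create/alter/drop/truncate/comment/grant
DML: data manipulation language, e.g. select/delete/insert/update/explain plan
TCL: transaction control language, e.g. commit/rollback
2. delete causes fragmentation; truncate does not
3. delete does not release space; truncate does
4. delete can be undone with flashback; truncate cannot
In HBase: truncate 'students' -----> in essence: disable the table, drop it, then recreate it
Log: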
Truncating ‘students‘ table (it may take a while):
- Disabling table...
- Truncating table...
0 row(s) in 4.0840 seconds
3. Java API
Edit the Windows hosts file: C:\Windows\System32\drivers\etc\hosts
Add one line: 192.168.153.11 bigdata11
TestHBase.java
package demo;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;
import org.junit.Test;

/**
 * 1. Requires one extra jar: hamcrest-core-1.3.jar
 * 2. Edit the Windows hosts file
 *    C:\Windows\System32\drivers\etc\hosts
 *    192.168.153.11 bigdata11
 * @author YOGA
 */
public class TestHBase {

    // Build a client configuration that points at ZooKeeper
    // (same value as hbase.zookeeper.quorum in hbase-site.xml).
    private static Configuration zkConf() {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "192.168.153.11");
        return conf;
    }

    @Test
    public void testCreateTable() throws Exception {
        // Get the HBase admin client (this constructor is deprecated in HBase 1.x)
        HBaseAdmin client = new HBaseAdmin(zkConf());

        // Create the table descriptor
        HTableDescriptor htd = new HTableDescriptor(TableName.valueOf("mytable"));

        // Add the column families
        htd.addFamily(new HColumnDescriptor("info"));
        htd.addFamily(new HColumnDescriptor("grade"));

        // Create the table
        client.createTable(htd);
        client.close();
    }

    @Test
    public void testPut() throws Exception {
        // Get the HTable client (this constructor is deprecated in HBase 1.x)
        HTable client = new HTable(zkConf(), "mytable");

        // Build a Put object; the constructor argument is the row key
        Put put = new Put(Bytes.toBytes("id001"));
        // put.addColumn(family,     // column family
        //               qualifier,  // column
        //               value)      // value of the column
        put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Tom"));

        client.put(put);
        // client.put(List<Put>) writes several rows in one call
        client.close();
    }

    @Test
    public void testGet() throws Exception {
        HTable client = new HTable(zkConf(), "mytable");

        // Build a Get object for one row key
        Get get = new Get(Bytes.toBytes("id001"));

        // Run the query
        Result result = client.get(get);

        // Read the value of info:name
        String name = Bytes.toString(result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name")));
        System.out.println(name);
        client.close();
    }

    @Test
    public void testScan() throws Exception {
        HTable client = new HTable(zkConf(), "mytable");

        // Define a scanner
        Scan scan = new Scan();
        // scan.setFilter(filter); optionally attach a filter

        // Query the data through the scanner
        ResultScanner rScanner = client.getScanner(scan);
        for (Result result : rScanner) {
            String name = Bytes.toString(result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name")));
            System.out.println(name);
        }
        // Release the scanner and the table client
        rScanner.close();
        client.close();
    }
}
Run the tests above; the result shown is for the last one (testScan).