Hive Simple SQL
(1) Difference between internal (managed) and external tables
By default Hive creates an internal (managed) table. A directory can be specified; if none is given, a default directory is created. Once the table is dropped, that directory and its data are deleted along with it.
When creating an external table you specify the storage directory, and dropping the table removes only the metadata, not the directory or the data under it.
# Create an external table
0: jdbc:hive2://192.168.163.102:10000> create external table t10(c1 int,c2 string) row format delimited fields terminated by ',' stored as textfile location "/dir1";
[root@Darren2 tmp]# hdfs dfs -put file1 /dir1
[root@Darren2 tmp]# hdfs dfs -ls -R /dir1
-rw-r--r-- 1 root supergroup 24 2017-11-25 20:53 /dir1/file1
0: jdbc:hive2://192.168.163.102:10000> drop table t10;
No rows affected (0.41 seconds)
[root@Darren2 tmp]# hdfs dfs -ls -R /dir1
17/11/25 20:56:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-rw-r--r-- 1 root supergroup 24 2017-11-25 20:53 /dir1/file1
# Create a default internal (managed) table
0: jdbc:hive2://192.168.163.102:10000> create table t2(c1 int,c2 string) row format delimited fields terminated by ',' stored as textfile;
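The drop semantics above can be illustrated with a small model outside Hive. This is not Hive code; the table names and the metastore/warehouse structures are hypothetical stand-ins for Hive's metadata and HDFS directories:

```python
# Illustrative model of DROP TABLE semantics: dropping always removes
# metadata, but removes data only for internal (managed) tables.
warehouse = {"/dir1": ["file1"],                 # simulated HDFS dirs
             "/warehouse/t2": ["000000_0"]}
metastore = {
    "t10": {"location": "/dir1", "external": True},
    "t2":  {"location": "/warehouse/t2", "external": False},
}

def drop_table(name):
    """Remove metadata; delete the data directory only if managed."""
    info = metastore.pop(name)
    if not info["external"]:
        warehouse.pop(info["location"])

drop_table("t10")   # external: /dir1 and file1 survive
drop_table("t2")    # managed: /warehouse/t2 is deleted with the table
```

This mirrors the transcript above: after `drop table t10`, `hdfs dfs -ls -R /dir1` still shows `file1`.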
(2) File storage formats supported by Hive
textfile, sequencefile, orc, parquet, avro
0: jdbc:hive2://192.168.163.102:10000> create table t5(c1 int,c2 string) row format delimited fields terminated by ',' stored as sequencefile ;
0: jdbc:hive2://192.168.163.102:10000> insert into t5 select * from t4;
# The contents of a file stored in sequencefile format cannot be viewed directly
[root@Darren2 tmp]# hdfs dfs -ls /user/hive/warehouse/testdb1.db/t5/
-rwxr-xr-x 1 root supergroup 146 2017-11-26 03:03 /user/hive/warehouse/testdb1.db/t5/000000_0
0: jdbc:hive2://192.168.163.102:10000> desc formatted t5;
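A SequenceFile is a binary container, which is why its contents are not human-readable the way a textfile's are. Its header begins with the magic bytes `SEQ` followed by a version byte. A minimal sketch of such a check, using a hand-made header rather than a real file pulled from HDFS:

```python
# SequenceFile headers start with the magic bytes b"SEQ" plus a version
# byte (6 in current Hadoop releases). This header is a hand-made sample.
header = b"SEQ\x06" + b"\x19org.apache.hadoop.io.Text"

def is_sequencefile(data: bytes) -> bool:
    """Return True if the byte string starts with the SequenceFile magic."""
    return data[:3] == b"SEQ"

print(is_sequencefile(header))      # True
print(is_sequencefile(b"1,aaa\n"))  # False (a plain textfile row)
```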
2. Loading data into Hive
Syntax:
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
(1) Load a local file directly into a Hive table
0: jdbc:hive2://192.168.163.102:10000> load data local inpath '/tmp/file1' into table t1;
0: jdbc:hive2://192.168.163.102:10000> select * from t1;
+--------+--------+--+
| t1.c1 | t1.c2 |
+--------+--------+--+
| 1 | aaa |
| 2 | bbb |
| 3 | ccc |
| 4 | ddd |
+--------+--------+--+
(2) Load data into the table, overwriting all existing data; in effect this replaces all files under the t1 directory
0: jdbc:hive2://192.168.163.102:10000> load data local inpath '/tmp/file3' overwrite into table t1;
No rows affected (0.597 seconds)
0: jdbc:hive2://192.168.163.102:10000> select * from t1;
+--------+---------+--+
| t1.c1 | t1.c2 |
+--------+---------+--+
| 1 | yiyi |
| 2 | erer |
| 3 | sansan |
| 4 | sisi |
+--------+---------+--+
4 rows selected (0.073 seconds)
(3) Load a file on HDFS into a Hive table
[root@Darren2 tmp]# cat /tmp/file2
5,eee
[root@Darren2 tmp]# hdfs dfs -put /tmp/file2 /user/hive/warehouse/testdb1.db/t1
0: jdbc:hive2://192.168.163.102:10000> load data inpath '/user/hive/warehouse/testdb1.db/t1/file2' into table t1;
0: jdbc:hive2://192.168.163.102:10000> select * from t1;
+--------+--------+--+
| t1.c1 | t1.c2 |
+--------+--------+--+
| 1 | aaa |
| 2 | bbb |
| 3 | ccc |
| 4 | ddd |
| 5 | eee |
+--------+--------+--+
(4) Create a table from another table and insert its data at the same time (CTAS)
0: jdbc:hive2://192.168.163.102:10000> create table t2 as select * from t1;
(5) Create the table structure from another table first, then insert the data
0: jdbc:hive2://192.168.163.102:10000> create table t3 like t1;
0: jdbc:hive2://192.168.163.102:10000> insert into t3 select * from t1;
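The LOAD DATA variants above differ only in which optional clauses appear. They can be sketched as a small statement builder; `build_load_data` is a hypothetical helper for illustration, not part of any Hive client library:

```python
def build_load_data(filepath, table, local=False, overwrite=False, partition=None):
    """Assemble a Hive LOAD DATA statement from its optional clauses."""
    parts = ["LOAD DATA"]
    if local:                      # LOCAL: read from the client filesystem
        parts.append("LOCAL")
    parts.append(f"INPATH '{filepath}'")
    if overwrite:                  # OVERWRITE: replace all files in the table dir
        parts.append("OVERWRITE")
    parts.append(f"INTO TABLE {table}")
    if partition:                  # optional static partition spec
        cols = ", ".join(f"{k}='{v}'" for k, v in partition.items())
        parts.append(f"PARTITION ({cols})")
    return " ".join(parts)

print(build_load_data("/tmp/file1", "t1", local=True))
# LOAD DATA LOCAL INPATH '/tmp/file1' INTO TABLE t1
print(build_load_data("/tmp/file3", "t1", local=True, overwrite=True))
# LOAD DATA LOCAL INPATH '/tmp/file3' OVERWRITE INTO TABLE t1
```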
3. Exporting query results to a file system
(1) Export query results to HDFS
0: jdbc:hive2://192.168.163.102:10000> select * from t1;
+--------+---------+--+
| t1.c1 | t1.c2 |
+--------+---------+--+
| 1 | yiyi |
| 2 | erer |
| 3 | sansan |
| 4 | sisi |
+--------+---------+--+
0: jdbc:hive2://192.168.163.102:10000> insert overwrite directory '/user/hive/warehouse/tmp' select * from testdb1.t1;
[root@Darren2 tmp]# hdfs dfs -ls -R /user/hive/warehouse/tmp
-rwxr-xr-x 1 root supergroup 30 2017-11-26 00:25 /user/hive/warehouse/tmp/000000_0
[root@Darren2 tmp]# hdfs dfs -get /user/hive/warehouse/tmp/000000_0 /tmp/
The field delimiter of the exported file is Ctrl+A, i.e. ASCII \001
[root@Darren2 tmp]# vim /tmp/000000_0
1^Ayiyi
2^Aerer
3^Asansan
4^Asisi
Create an external table over this file, using \001 as the delimiter
0: jdbc:hive2://192.168.163.102:10000> create external table t5(c1 int,c2 string) row format delimited fields terminated by '\001' location '/user/hive/warehouse/tmp/';
0: jdbc:hive2://192.168.163.102:10000> select * from t5;
+--------+---------+--+
| t5.c1 | t5.c2 |
+--------+---------+--+
| 1 | yiyi |
| 2 | erer |
| 3 | sansan |
| 4 | sisi |
+--------+---------+--+
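Because the export uses \001 (Ctrl+A) as its field delimiter, the file can also be parsed outside Hive. A minimal sketch, with the sample string standing in for the `/tmp/000000_0` file shown above:

```python
# Parse a Hive export whose fields are separated by \x01 (Ctrl+A).
# The sample data mirrors the contents of /tmp/000000_0 above.
sample = "1\x01yiyi\n2\x01erer\n3\x01sansan\n4\x01sisi\n"

rows = [line.split("\x01") for line in sample.splitlines()]
for c1, c2 in rows:
    print(c1, c2)   # e.g. "1 yiyi"
```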
(2) Export query results to the local file system
0: jdbc:hive2://192.168.163.102:10000> insert overwrite local directory '/tmp' select * from testdb1.t1;
[root@Darren2 tmp]# ls /tmp/000000_0
/tmp/000000_0
4. INSERT
(1) Inserting data with INSERT in effect creates a new file
0: jdbc:hive2://192.168.163.102:10000> insert into t5 values(4,'sisi');
No rows affected (17.987 seconds)
0: jdbc:hive2://192.168.163.102:10000> dfs -ls /user/hive/warehouse/testdb1.db/t5 ;
+----------------------------------------------------------------------------------------------------------------+--+
| DFS Output |
+----------------------------------------------------------------------------------------------------------------+--+
| Found 2 items |
| -rwxr-xr-x 1 root supergroup 146 2017-11-26 03:03 /user/hive/warehouse/testdb1.db/t5/000000_0 |
| -rwxr-xr-x 1 root supergroup 106 2017-11-26 04:22 /user/hive/warehouse/testdb1.db/t5/000000_0_copy_1 |
+----------------------------------------------------------------------------------------------------------------+--+