Hive Basic SQL Syntax (DDL)
- From the previous articles we learned that Hive can use a relational database to store its metadata, and that Hive offers fairly complete SQL support. This article covers Hive's basic SQL syntax.
First, let's look at Hive's data storage structure. An abstract view:
- 1. Database: Hive can hold multiple databases. The default database is default, which maps to the HDFS directory /user/hive/warehouse and can be configured with the hive.metastore.warehouse.dir parameter (set in hive-site.xml).
- 2. Table: Hive tables come in two kinds, managed (internal) and external. Each table maps to an HDFS directory: /user/hive/warehouse/[databasename.db]/table
- 3. Partition: a table can have one or more partitions, which make queries more selective and efficient. Each partition gets its own directory under the table directory:
/user/hive/warehouse/[databasename.db]/table/partition
- 4. Bucket: not covered in this article.
DDL Operations (Data Definition Language)
See the official documentation: DDL docs
HiveQL DDL statements are documented here, including:
- CREATE DATABASE/SCHEMA, TABLE, VIEW, FUNCTION, INDEX
- DROP DATABASE/SCHEMA, TABLE, VIEW, INDEX
- TRUNCATE TABLE
- ALTER DATABASE/SCHEMA, TABLE, VIEW
- MSCK REPAIR TABLE (or ALTER TABLE RECOVER PARTITIONS)
- SHOW DATABASES/SCHEMAS, TABLES, TBLPROPERTIES, VIEWS, PARTITIONS, FUNCTIONS, INDEX[ES], COLUMNS, CREATE TABLE
- DESCRIBE DATABASE/SCHEMA, table_name, view_name
I. Database DDL Operations
1. Creating a Database (Create Database)
- The syntax given in the official documentation:
CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
  [COMMENT database_comment]
  [LOCATION hdfs_path]
  [WITH DBPROPERTIES (property_name=property_value, ...)];
IF NOT EXISTS: create the database only if it does not already exist; if it does, do nothing instead of failing (recommended in production).
COMMENT: a description of the database.
LOCATION: the HDFS path for the database; defaults to /user/hive/warehouse/ if omitted.
WITH DBPROPERTIES: key-value properties for the database.
hive> CREATE DATABASE hive1;
OK
hive> CREATE DATABASE IF NOT EXISTS hive2
> COMMENT "this is ruoze database"
> WITH DBPROPERTIES ("creator"="ruoze", "date"="2018-08-08");
OK
hive> CREATE DATABASE hive3 LOCATION '/db_hive3';
OK
hive> show databases;
OK
default
hive1
hive2
hive3
# View the database directories in HDFS
[hadoop@hadoop000 ~]$ hadoop fs -ls /user/hive/warehouse
Found 2 items
drwxr-xr-x - hadoop supergroup 0 2018-06-16 15:26 /user/hive/warehouse/hive1.db
drwxr-xr-x - hadoop supergroup 0 2018-06-16 15:28 /user/hive/warehouse/hive2.db
[hadoop@hadoop000 ~]$ hadoop fs -ls /
Found 3 items
drwxr-xr-x - hadoop supergroup 0 2018-06-16 15:29 /db_hive3
drwx-wx-wx - hadoop supergroup 0 2018-06-03 15:57 /tmp
drwxr-xr-x - hadoop supergroup 0 2018-06-03 16:43 /user
# View database metadata in the RDBMS (the metastore)
mysql> select * from hive_meta.dbs\G;
*************************** 1. row ***************************
DB_ID: 1
DESC: Default Hive database
DB_LOCATION_URI: hdfs://hadoop000:9000/user/hive/warehouse
NAME: default
OWNER_NAME: public
OWNER_TYPE: ROLE
*************************** 2. row ***************************
DB_ID: 6
DESC: NULL
DB_LOCATION_URI: hdfs://hadoop000:9000/user/hive/warehouse/hive1.db
NAME: hive1
OWNER_NAME: hadoop
OWNER_TYPE: USER
*************************** 3. row ***************************
DB_ID: 7
DESC: this is ruoze database
DB_LOCATION_URI: hdfs://hadoop000:9000/user/hive/warehouse/hive2.db
NAME: hive2
OWNER_NAME: hadoop
OWNER_TYPE: USER
*************************** 4. row ***************************
DB_ID: 8
DESC: NULL
DB_LOCATION_URI: hdfs://hadoop000:9000/db_hive3
NAME: hive3
OWNER_NAME: hadoop
OWNER_TYPE: USER
4 rows in set (0.00 sec)
2. Listing Databases (Show Databases)
- The syntax given in the official documentation:
SHOW (DATABASES|SCHEMAS) [LIKE 'identifier_with_wildcards'];
hive> show databases;
OK
default
hive1
hive2
hive3
Time taken: 0.047 seconds, Fetched: 4 row(s)
hive> show databases like 'hive1';
OK
hive1
Time taken: 0.035 seconds, Fetched: 1 row(s)
hive> show databases like 'hive*';
OK
hive1
hive2
hive3
Time taken: 0.037 seconds, Fetched: 3 row(s)
3. Describing a Database (Describe Database)
- The syntax given in the official documentation:
DESCRIBE DATABASE [EXTENDED] db_name;
-- describe can be abbreviated to desc
DESCRIBE DATABASE db_name: shows the database's description and its location on HDFS;
EXTENDED: additionally shows the database's key-value properties.
hive> desc database hive1;
OK
hive1 hdfs://192.168.6.217:9000/user/hive/warehouse/hive1.db hadoop USER
Time taken: 0.039 seconds, Fetched: 1 row(s)
hive> desc database hive2;
OK
hive2 this is ruoze database hdfs://192.168.6.217:9000/user/hive/warehouse/hive2.db hadoop USER
Time taken: 0.041 seconds, Fetched: 1 row(s)
hive> desc database hive3;
OK
hive3 hdfs://192.168.6.217:9000/db_hive3 hadoop USER
Time taken: 0.046 seconds, Fetched: 1 row(s)
hive> desc database extended hive2;
OK
hive2 this is ruoze database hdfs://192.168.6.217:9000/user/hive/warehouse/hive2.db hadoop USER {date=2018-08-08, creator=ruoze}
Time taken: 0.031 seconds, Fetched: 1 row(s)
4. Dropping a Database (Drop Database)
- The syntax given in the official documentation:
DROP (DATABASE|SCHEMA) [IF EXISTS] database_name [RESTRICT|CASCADE];
RESTRICT: the default; the drop fails if the database still contains tables.
CASCADE: cascades the drop (the tables are dropped first, then the database). Avoid this in production.
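A hedged sketch of the difference, using a throwaway database name (tmp_db) purely for illustration:

```sql
CREATE DATABASE IF NOT EXISTS tmp_db;
CREATE TABLE IF NOT EXISTS tmp_db.t (id int);

-- RESTRICT (the default): this fails because tmp_db still contains a table
DROP DATABASE tmp_db;

-- CASCADE: drops tmp_db.t first, then tmp_db itself
DROP DATABASE tmp_db CASCADE;
```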
hive> drop database test;
OK
Time taken: 0.094 seconds
5. Modifying a Database (Alter Database)
- The syntax given in the official documentation:
ALTER (DATABASE|SCHEMA) database_name SET DBPROPERTIES (property_name=property_value, ...); -- (Note: SCHEMA added in Hive 0.14.0)
ALTER (DATABASE|SCHEMA) database_name SET OWNER [USER|ROLE] user_or_role;
-- (Note: Hive 0.13.0 and later; SCHEMA added in Hive 0.14.0)
ALTER (DATABASE|SCHEMA) database_name SET LOCATION hdfs_path;
-- (Note: Hive 2.2.1, 2.4.0 and later)
(The Note comments indicate the Hive version in which each statement was added.)
hive> alter database hive2 set dbproperties ("update"="jepson");
OK
Time taken: 0.094 seconds
hive> alter database hive2 set owner user hive;
OK
Time taken: 0.072 seconds
# Before the modification
hive> desc database extended hive2;
OK
hive2 this is ruoze database hdfs://192.168.6.217:9000/user/hive/warehouse/hive2.db hadoop USER {date=2018-08-08, creator=ruoze}
Time taken: 0.031 seconds, Fetched: 1 row(s)
# After the modification
hive> desc database extended hive2;
OK
hive2 this is ruoze database hdfs://192.168.6.217:9000/user/hive/warehouse/hive2.db hive USER {update=jepson, date=2018-08-08, creator=ruoze}
Time taken: 0.034 seconds, Fetched: 1 row(s)
6. Switching Databases (Use Database)
- The syntax given in the official documentation:
USE database_name;
hive> use hive1;
OK
Time taken: 0.044 seconds
hive> use default;
OK
Time taken: 0.047 seconds
II. Table DDL Operations
1. Creating a Table (Create Table)
- The syntax given in the official documentation:
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name -- (Note: TEMPORARY available in Hive 0.14.0 and later)
[(col_name data_type [COMMENT col_comment], ... [constraint_specification])]
[COMMENT table_comment]
[PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
[CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
[SKEWED BY (col_name, col_name, ...) -- (Note: Available in Hive 0.10.0 and later)
ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)
[STORED AS DIRECTORIES]]
[
[ROW FORMAT row_format]
[STORED AS file_format]
| STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)] -- (Note: Available in Hive 0.6.0 and later)
]
[LOCATION hdfs_path]
[TBLPROPERTIES (property_name=property_value, ...)] -- (Note: Available in Hive 0.6.0 and later)
[AS select_statement]; -- (Note: Available in Hive 0.5.0 and later; not supported for external tables)
1.1. TEMPORARY (Temporary Tables)
- Hive supports temporary tables as of 0.14.0. A temporary table is visible only to the current session and is dropped automatically when the session exits.
Syntax:
CREATE TEMPORARY TABLE ...
Notes:
- If a temporary table is created with the name of an existing table, references to that name within the session resolve to the temporary table; the original table becomes reachable again only after the temporary table is dropped or renamed.
- Limitations: temporary tables support neither partition columns nor indexes.
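A minimal sketch of the name-shadowing behavior (the table name t is illustrative):

```sql
CREATE TABLE t (id int);                        -- permanent table
CREATE TEMPORARY TABLE t (id int, name string); -- same name: shadows t for this session

DESC t;       -- resolves to the temporary table
DROP TABLE t; -- drops the temporary table; the permanent t is reachable again
```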
hive> use default;
OK
Time taken: 0.047 seconds
hive> CREATE TEMPORARY TABLE temporary_table (
> id int,
> name string);
OK
Time taken: 0.242 seconds
hive> show tables;
OK
temporary_table
Time taken: 0.044 seconds, Fetched: 1 row(s)
# Exit the session and reconnect
hive> use default;
OK
Time taken: 1.054 seconds
hive> show tables;
OK
Time taken: 0.559 seconds
1.2. Managed and External Tables
- Hive has two kinds of tables: managed tables (the default) and external tables (created with the EXTERNAL keyword). The key difference shows up on DROP: dropping a managed table deletes both the data (stored on HDFS) and the metadata (stored in MySQL), while dropping an external table deletes only the metadata.
hive> use default;
OK
Time taken: 1.054 seconds
hive> show tables;
OK
Time taken: 0.559 seconds
# Create a managed (internal) table and an external table
hive> create table managed_table(
> id int,
> name string
> );
OK
Time taken: 0.677 seconds
hive> create external table external_table(
> id int,
> name string
> );
OK
Time taken: 0.146 seconds
hive> show tables;
OK
external_table
managed_table
Time taken: 0.05 seconds, Fetched: 2 row(s)
# Check in HDFS
[hadoop@hadoop000 ~]$ hadoop fs -ls /user/hive/warehouse
Found 4 items
drwxr-xr-x - hadoop supergroup 0 2018-06-16 16:40 /user/hive/warehouse/external_table
drwxr-xr-x - hadoop supergroup 0 2018-06-16 15:26 /user/hive/warehouse/hive1.db
drwxr-xr-x - hadoop supergroup 0 2018-06-16 15:28 /user/hive/warehouse/hive2.db
drwxr-xr-x - hadoop supergroup 0 2018-06-16 16:39 /user/hive/warehouse/managed_table
# Check in MySQL
mysql> select * from hive_meta.tbls\G;
*************************** 1. row ***************************
TBL_ID: 11
CREATE_TIME: 1529138399
DB_ID: 1
LAST_ACCESS_TIME: 0
OWNER: hadoop
RETENTION: 0
SD_ID: 11
TBL_NAME: managed_table
TBL_TYPE: MANAGED_TABLE
VIEW_EXPANDED_TEXT: NULL
VIEW_ORIGINAL_TEXT: NULL
*************************** 2. row ***************************
TBL_ID: 12
CREATE_TIME: 1529138409
DB_ID: 1
LAST_ACCESS_TIME: 0
OWNER: hadoop
RETENTION: 0
SD_ID: 12
TBL_NAME: external_table
TBL_TYPE: EXTERNAL_TABLE
VIEW_EXPANDED_TEXT: NULL
VIEW_ORIGINAL_TEXT: NULL
2 rows in set (0.00 sec)
# Drop the managed table and the external table
hive> drop table managed_table;
OK
Time taken: 1.143 seconds
hive> drop table external_table;
OK
Time taken: 0.265 seconds
# Check again
[hadoop@hadoop000 ~]$ hadoop fs -ls /user/hive/warehouse
Found 3 items
drwxr-xr-x - hadoop supergroup 0 2018-06-16 16:40 /user/hive/warehouse/external_table
drwxr-xr-x - hadoop supergroup 0 2018-06-16 15:26 /user/hive/warehouse/hive1.db
drwxr-xr-x - hadoop supergroup 0 2018-06-16 15:28 /user/hive/warehouse/hive2.db
mysql> select * from hive_meta.tbls\G
Empty set (0.00 sec)
1.3. COMMENT, ROW FORMAT, and Other Table Options
COMMENT: a comment; it can be attached to both individual columns and the table itself.
First, the official grammar for ROW FORMAT:
: DELIMITED
[FIELDS TERMINATED BY char [ESCAPED BY char]]
[COLLECTION ITEMS TERMINATED BY char]
[MAP KEYS TERMINATED BY char]
[LINES TERMINATED BY char]
[NULL DEFINED AS char]
-- (Note: Available in Hive 0.13 and later)
| SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, property_name=property_value, ...)]
The official explanation: when creating a table you can specify a custom SerDe or use a built-in one. If you omit ROW FORMAT or use ROW FORMAT DELIMITED, a built-in (native) SerDe is applied. Together with the column definitions, the SerDe tells Hive how to map the stored data onto the table's columns.
So what do the DELIMITED options actually mean? Let's go through them:
DELIMITED: the data is delimited text (and the delimiters can be customized);
FIELDS TERMINATED BY char: the delimiter between fields;
e.g. FIELDS TERMINATED BY '\t' means fields are separated by a tab;
COLLECTION ITEMS TERMINATED BY char: the delimiter between elements of a collection (array) value;
MAP KEYS TERMINATED BY char: the delimiter between a key and its value in a map field;
LINES TERMINATED BY char: the delimiter between records (default '\n'). - In practice LINES TERMINATED BY is left at the default '\n', and only FIELDS TERMINATED BY needs to be specified.
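To make the delimiters concrete, here is a hedged sketch of a table that uses all three (the table and column names are illustrative, not from the examples above):

```sql
CREATE TABLE user_profile (
  id    int,
  tags  array<string>,      -- stored as: item1#item2
  props map<string,string>  -- stored as: key1:val1#key2:val2
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','            -- between columns
  COLLECTION ITEMS TERMINATED BY '#'  -- between array elements / map entries
  MAP KEYS TERMINATED BY ':'          -- between a map key and its value
  LINES TERMINATED BY '\n';           -- between records (the default)

-- a matching raw data line would look like:  1,a#b,k1:v1#k2:v2
```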
hive> CREATE TABLE hive_test
> (id int comment 'this is id', name string comment 'this is name')
> comment 'this is hive_test'
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\t';
OK
Time taken: 0.174 seconds
# For the tests that follow, create an emp table and load some data into it
hive> create table emp
> (empno int, ename string, job string, mgr int, hiredate string, salary double, comm double, deptno int)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\t';
OK
Time taken: 0.651 seconds
hive> LOAD DATA LOCAL INPATH '/home/hadoop/emp.txt' OVERWRITE INTO TABLE emp;
Loading data to table default.emp
Table default.emp stats: [numFiles=1, numRows=0, totalSize=886, rawDataSize=0]
OK
Time taken: 1.848 seconds
1.4.Create Table As Select (CTAS)
- Creates a new table by copying another table's structure and data (this runs a MapReduce job).
# Copy the entire table
hive> create table emp2 as select * from emp;
Query ID = hadoop_20180616171313_fbc318e8-bc70-4b63-84fa-3acd94e4ec3e
Total jobs = 3
...
OK
Time taken: 23.279 seconds
hive> select * from emp2;
OK
7369 SMITH CLERK 7902 1980-12-17 800.0 NULL 20
7499 ALLEN SALESMAN 7698 1981-2-20 1600.0 300.0 30
7521 WARD SALESMAN 7698 1981-2-22 1250.0 500.0 30
7566 JONES MANAGER 7839 1981-4-2 2975.0 NULL 20
7654 MARTIN SALESMAN 7698 1981-9-28 1250.0 1400.0 30
7698 BLAKE MANAGER 7839 1981-5-1 2850.0 NULL 30
7782 CLARK MANAGER 7839 1981-6-9 2450.0 NULL 10
7788 SCOTT ANALYST 7566 1987-4-19 3000.0 NULL 20
7839 KING PRESIDENT NULL 1981-11-17 5000.0 NULL 10
7844 TURNER SALESMAN 7698 1981-9-8 1500.0 0.0 30
7876 ADAMS CLERK 7788 1987-5-23 1100.0 NULL 20
7900 JAMES CLERK 7698 1981-12-3 950.0 NULL 30
7902 FORD ANALYST 7566 1981-12-3 3000.0 NULL 20
7934 MILLER CLERK 7782 1982-1-23 1300.0 NULL 10
Time taken: 0.138 seconds, Fetched: 14 row(s)
# Copy only some of the columns
hive> create table emp3 as select empno,ename from emp;
Query ID = hadoop_20180616171313_fbc318e8-bc70-4b63-84fa-3acd94e4ec3e
Total jobs = 3
...
OK
Time taken: 16.143 seconds
hive> select * from emp3;
OK
7369 SMITH
7499 ALLEN
7521 WARD
7566 JONES
7654 MARTIN
7698 BLAKE
7782 CLARK
7788 SCOTT
7839 KING
7844 TURNER
7876 ADAMS
7900 JAMES
7902 FORD
7934 MILLER
Time taken: 0.159 seconds, Fetched: 14 row(s)
1.5.Create Table Like
# Create Table Like copies only the table structure (no data)
hive> create table emp_like like emp;
OK
Time taken: 0.195 seconds
hive> select * from emp_like;
OK
Time taken: 0.131 seconds
2. Listing Tables (Show Tables and Show Create Table)
- The syntax given in the official documentation:
SHOW TABLES [IN database_name] ['identifier_with_wildcards'];
SHOW CREATE TABLE ([db_name.]table_name|view_name);
hive> show tables;
OK
emp
emp2
emp3
emp_like
hive_test
Time taken: 0.042 seconds, Fetched: 5 row(s)
hive> show tables 'emp*';
OK
emp
emp2
emp3
emp_like
Time taken: 0.053 seconds, Fetched: 4 row(s)
hive> show create table emp;
OK
CREATE TABLE `emp`(
`empno` int,
`ename` string,
`job` string,
`mgr` int,
`hiredate` string,
`salary` double,
`comm` double,
`deptno` int)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'hdfs://192.168.6.217:9000/user/hive/warehouse/emp'
TBLPROPERTIES (
'COLUMN_STATS_ACCURATE'='true',
'numFiles'='1',
'numRows'='0',
'rawDataSize'='0',
'totalSize'='657',
'transient_lastDdlTime'='1529140756')
Time taken: 0.245 seconds, Fetched: 24 row(s)
3. Describing a Table (Describe Table)
- The syntax given in the official documentation:
DESCRIBE [EXTENDED|FORMATTED]
table_name[.col_name ( [.field_name] | [.'$elem$'] | [.'$key$'] | [.'$value$'] )* ];
-- (Note: Hive 1.x.x and 0.x.x only. See "Hive 2.0+: New Syntax" in the official docs)
desc formatted table_name; is the most commonly used form.
hive> desc emp;
OK
empno int
ename string
job string
mgr int
hiredate string
salary double
comm double
deptno int
Time taken: 0.213 seconds, Fetched: 8 row(s)
hive> desc formatted emp;
OK
# col_name data_type comment
empno int
ename string
job string
mgr int
hiredate string
salary double
comm double
deptno int
# Detailed Table Information
Database: default
Owner: hadoop
CreateTime: Sat Jun 16 17:13:05 CST 2018
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: hdfs://192.168.6.217:9000/user/hive/warehouse/emp
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE true
numFiles 1
numRows 0
rawDataSize 0
totalSize 657
transient_lastDdlTime 1529140756
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.TextInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
field.delim \t
serialization.format \t
Time taken: 0.214 seconds, Fetched: 39 row(s)
hive> desc EXTENDED emp;
OK
empno int
ename string
job string
mgr int
hiredate string
salary double
comm double
deptno int
Detailed Table Information Table(tableName:emp, dbName:default, owner:hadoop, createTime:1529140385, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:empno, type:int, comment:null), FieldSchema(name:ename, type:string, comment:null), FieldSchema(name:job, type:string, comment:null), FieldSchema(name:mgr, type:int, comment:null), FieldSchema(name:hiredate, type:string, comment:null), FieldSchema(name:salary, type:double, comment:null), FieldSchema(name:comm, type:double, comment:null), FieldSchema(name:deptno, type:int, comment:null)], location:hdfs://192.168.6.217:9000/user/hive/warehouse/emp, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format= , field.delim=
Time taken: 0.21 seconds, Fetched: 10 row(s)
4. Modifying a Table (Alter Table)
- The syntax given in the official documentation:
ALTER TABLE table_name RENAME TO new_table_name;
ALTER TABLE table_name SET TBLPROPERTIES table_properties;
table_properties:
: (property_name = property_value, property_name = property_value, ... )
ALTER TABLE table_name SET TBLPROPERTIES ('comment' = new_comment);
...
hive> alter table hive_test rename to new_hive_test;
OK
Time taken: 0.262 seconds
hive> ALTER TABLE table_name SET TBLPROPERTIES ("creator"="ruoze", "date"="2018-06-16");
FAILED: SemanticException [Error 10001]: Table not found default.table_name
hive> ALTER TABLE new_hive_test SET TBLPROPERTIES ("creator"="ruoze", "date"="2018-06-16");
OK
Time taken: 0.246 seconds
hive> ALTER TABLE new_hive_test SET TBLPROPERTIES ('comment' = 'This is new_hive_test Table');
# Inspect the table again
hive> desc formatted new_hive_test;
OK
# col_name data_type comment
id int this is id
name string this is name
# Detailed Table Information
Database: default
Owner: hadoop
CreateTime: Sat Jun 16 17:09:19 CST 2018
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: hdfs://192.168.6.217:9000/user/hive/warehouse/new_hive_test
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE false
comment This is new_hive_test Table
creator ruoze
date 2018-06-16
last_modified_by hadoop
last_modified_time 1529143021
numFiles 0
numRows -1
rawDataSize -1
totalSize 0
transient_lastDdlTime 1529143021
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.TextInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
field.delim \t
serialization.format \t
Time taken: 0.188 seconds, Fetched: 38 row(s)
5. Truncating a Table (Truncate Table)
- The syntax given in the official documentation:
TRUNCATE TABLE table_name [PARTITION partition_spec];
partition_spec:
: (partition_column = partition_col_value, partition_column = partition_col_value, ...)
TRUNCATE TABLE is not used very often.
hive> select * from emp3;
OK
7369 SMITH
7499 ALLEN
7521 WARD
7566 JONES
7654 MARTIN
7698 BLAKE
7782 CLARK
7788 SCOTT
7839 KING
7844 TURNER
7876 ADAMS
7900 JAMES
7902 FORD
7934 MILLER
Time taken: 0.148 seconds, Fetched: 14 row(s)
hive> truncate table emp3;
OK
Time taken: 0.241 seconds
hive> select * from emp3;
OK
Time taken: 0.12 seconds
6. Dropping a Table (Drop Table)
- The syntax given in the official documentation:
DROP TABLE [IF EXISTS] table_name [PURGE]; -- (Note: PURGE available in Hive 0.14.0 and later)
1. With PURGE, the data is deleted immediately instead of being moved to the trash.
2. DROP TABLE removes both the table's metadata and its data. If the trash is configured (and PURGE is not specified), the data is actually moved to the .Trash/Current directory; the metadata is lost for good.
3. When an EXTERNAL table is dropped, its data is not removed from the filesystem.
hive> drop table emp3;
OK
Time taken: 0.866 seconds
hive> show tables;
OK
emp
emp2
emp_like
new_hive_test
Time taken: 0.036 seconds, Fetched: 4 row(s)
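A hedged sketch of the PURGE variant (the table names are throwaway examples, not tables from this article):

```sql
-- plain DROP: with the trash enabled, the data lands in .Trash/Current
-- and can still be restored from there
DROP TABLE IF EXISTS scratch_a;

-- DROP ... PURGE: the data is deleted immediately and cannot be recovered
DROP TABLE IF EXISTS scratch_b PURGE;
```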
Reference: https://blog.csdn.net/yu0_zhang0/article/details/78976021
There are many more table-level DDL operations. Partitioned tables have not been covered in detail here; a dedicated article on partitioned tables will follow.