Hive SQL ALTER TABLE: notes on modifying tables

ALTER TABLE statements generally modify only a table's metadata; they do not rewrite the table's data itself.

Assume the following non-partitioned (full) table:

create external table test.class_info(
	class string,
	student array<string>,
	user_info map<string, int>,
	position struct<province:string, city:string, district:string>
)
row format delimited
fields terminated by ','
collection items terminated by '_'
map keys terminated by ':'
lines terminated by '\n'
stored as textfile
location '/big-data/test/user_info';

And the following partitioned table:

create external table test.class_info_partition(
	class string,
	student array<string>,
	user_info map<string, int>,
	position struct<province:string, city:string, district:string>
)
partitioned by (date_key string)
row format delimited
fields terminated by ','
collection items terminated by '_'
map keys terminated by ':'
lines terminated by '\n'
stored as textfile
location '/big-data/test/user_info_partition';

I. Renaming a table

Syntax:

ALTER TABLE old_table_name RENAME TO new_table_name;

Example:

hive> use test;
OK
Time taken: 0.835 seconds
hive> show tables;
OK
class_info
Time taken: 0.112 seconds, Fetched: 1 row(s)
hive> alter table class_info rename to school_class_info;
OK
Time taken: 0.179 seconds
hive> show tables;
OK
school_class_info
Time taken: 0.028 seconds, Fetched: 1 row(s)

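II. Adding, modifying, and dropping table partitions

These commands apply only to partitioned tables. Running one against a non-partitioned table fails, as with the following statement: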

hive> ALTER TABLE school_class_info ADD IF NOT EXISTS PARTITION (class = 'Grade-two-of-junior-High-school') LOCATION '/big-data/test/user_info/Grade-two-of-junior-High-school' ;

[Figure: screenshot of the resulting error message]

1. Adding a partition

Syntax:
Normally the partition must not already exist. To avoid an error when it does, add the IF NOT EXISTS keywords.

ALTER TABLE table_name ADD [IF NOT EXISTS] PARTITION (partition_column = 'XXX') LOCATION '$PATH';

Example:

hive> ALTER TABLE class_info_partition ADD IF NOT EXISTS PARTITION (date_key = '2021-01-06') LOCATION '/big-data/test/user_info/2021-01-06' ;
OK
Time taken: 0.74 seconds

Result:

hive> show partitions test.class_info_partition;
OK
date_key=2021-01-06
Time taken: 0.146 seconds, Fetched: 1 row(s)
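
Several partitions can also be registered in a single statement by repeating the PARTITION ... LOCATION clause. The following is a minimal sketch against the same table; the dates and paths are hypothetical and do not come from the session above.

```sql
-- Hypothetical partitions, for illustration only
ALTER TABLE test.class_info_partition ADD IF NOT EXISTS
    PARTITION (date_key = '2021-01-08') LOCATION '/big-data/test/user_info/2021-01-08'
    PARTITION (date_key = '2021-01-09') LOCATION '/big-data/test/user_info/2021-01-09';

-- Confirm which partitions are now registered
SHOW PARTITIONS test.class_info_partition;
```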

2. Dropping a partition

Preparing the test data:

hive> show partitions test.class_info_partition;
OK
date_key=2021-01-06
Time taken: 0.146 seconds, Fetched: 1 row(s)
hive> ALTER TABLE class_info_partition ADD IF NOT EXISTS PARTITION (date_key = '2021-01-07') LOCATION '/big-data/test/user_info/2021-01-07' ;
OK
Time taken: 0.683 seconds
hive> show partitions test.class_info_partition;
OK
date_key=2021-01-06
date_key=2021-01-07
Time taken: 0.125 seconds, Fetched: 2 row(s)

Syntax:

ALTER TABLE partition_table_name DROP [IF EXISTS] PARTITION (partition_column = 'XXX');

Example:

ALTER TABLE test.class_info_partition DROP IF EXISTS PARTITION (date_key='2021-01-07');

Result:

hive> ALTER TABLE test.class_info_partition DROP IF EXISTS PARTITION (date_key='2021-01-07');
Dropped the partition date_key=2021-01-07
OK
Time taken: 1.106 seconds
hive> show partitions test.class_info_partition;
OK
date_key=2021-01-06
Time taken: 0.121 seconds, Fetched: 1 row(s)
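
DROP PARTITION also accepts comparison operators in the partition spec, which can help when clearing out a whole range of partitions. The following is a sketch that assumes date_key values are yyyy-MM-dd strings, so that string order matches date order; the cutoff date is hypothetical.

```sql
-- Drop every partition older than the (hypothetical) cutoff date
ALTER TABLE test.class_info_partition DROP IF EXISTS PARTITION (date_key < '2021-01-01');

-- Check what is left
SHOW PARTITIONS test.class_info_partition;
```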

3. Modifying a partition

Syntax:

ALTER TABLE table_name PARTITION (partition_column = 'XXX') SET LOCATION '$PATH';

Example:

hive> show partitions test.class_info_partition;
OK
date_key=2021-01-06
Time taken: 0.115 seconds, Fetched: 1 row(s)
hive> ALTER TABLE class_info_partition PARTITION (date_key = '2021-01-06') SET LOCATION '/big-data/test/user_info/2021-01-07' ;
OK
Time taken: 0.433 seconds

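Result:
Querying date_key = '2021-01-06' now returns no rows; after data is loaded into the new path ('/big-data/test/user_info/2021-01-07'), rows appear again. This command neither moves the data out of the old path nor deletes the old data.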
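
One way to confirm what SET LOCATION changed is to inspect the partition's metadata. The following is a small sketch that assumes the partition from the example above.

```sql
-- The Location field in the output shows the path the partition now points to
DESCRIBE FORMATTED test.class_info_partition PARTITION (date_key = '2021-01-06');

-- Reads of this partition now go to the new location
SELECT * FROM test.class_info_partition WHERE date_key = '2021-01-06' LIMIT 10;
```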

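When a partition is dropped from a managed (internal) table, both its data and its metadata are deleted, even if the partition was added with ALTER TABLE table_name ADD PARTITION. For an external table, the data in the partition is not deleted.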

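III. Adding, modifying, and dropping columns; changing a column's name, type, position, or comment

1. Adding one or more columns

Syntax:

alter table table_name add columns (new_column_name new_column_type comment '$new_column_comment');

Example:

-- Add one column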
hive> desc school_class_info;
OK
class               	string
student             	array<string>
user_info           	map<string,int>
position            	struct<province:string,city:string,district:string>
Time taken: 0.079 seconds, Fetched: 4 row(s)
hive> alter table school_class_info add columns(
    >         user_id bigint comment 'user ID'
    > );
OK
Time taken: 0.209 seconds
hive> desc school_class_info;
OK
class               	string
student             	array<string>
user_info           	map<string,int>
position            	struct<province:string,city:string,district:string>
user_id             	bigint              	user ID
Time taken: 0.056 seconds, Fetched: 5 row(s)

-- Add multiple columns
alter table school_class_info add columns(
        name string comment 'user name',
        city string comment 'city',
        sex string comment 'user gender',
        age string comment 'user age',
        phone string comment 'user phone number',
        email string comment 'user email',
        unqiue_id string comment 'national ID number'
);
hive> desc school_class_info;
OK
class               	string
student             	array<string>
user_info           	map<string,int>
position            	struct<province:string,city:string,district:string>
user_id             	bigint              	user ID
name                	string              	user name
city                	string              	city
sex                 	string              	user gender
age                 	string              	user age
phone               	string              	user phone number
email               	string              	user email
unqiue_id           	string              	national ID number
Time taken: 0.06 seconds, Fetched: 12 row(s)

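2. Dropping or replacing columns

Syntax:

alter table table_name replace columns(
	kept_column1 kept_type1 comment 'comment for kept_column1',
	kept_column2 kept_type2 comment 'comment for kept_column2',
	kept_column3 kept_type3 comment 'comment for kept_column3',
	kept_column4 kept_type4 comment 'comment for kept_column4',
	kept_column5 kept_type5 comment 'comment for kept_column5',
	kept_column6 kept_type6 comment 'comment for kept_column6',
	......
);

Example:

-- For data masking, we need to drop the unqiue_id column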
hive> alter table school_class_info replace columns(
    >     class string comment '',
    >     student array<string> comment '',
    >     user_info map<string,int> comment '',
    >     position struct<province:string,city:string,district:string> comment '',
    >     user_id bigint comment 'user ID',
    >     name string comment 'user name',
    >     city string comment 'city',
    >     sex string comment 'user gender',
    >     age string comment 'user age',
    >     phone string comment 'user phone number',
    >     email string comment 'user email'
    > );
OK
Time taken: 0.24 seconds
hive> desc school_class_info;
OK
class               	string
student             	array<string>
user_info           	map<string,int>
position            	struct<province:string,city:string,district:string>
user_id             	bigint              	user ID
name                	string              	user name
city                	string              	city
sex                 	string              	user gender
age                 	string              	user age
phone               	string              	user phone number
email               	string              	user email
Time taken: 0.057 seconds, Fetched: 11 row(s)
-- In effect, REPLACE COLUMNS redefines the table's entire column list

Result: the unqiue_id column has been removed.

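3. Changing a column's name, type, position, or comment

Syntax:

-- Change a column's name, type, and comment
alter table table_name change [column] old_column_name new_column_name new_column_type comment '$new_comment';
-- Change a column's position
alter table table_name change [column] old_column_name new_column_name new_column_type comment '$new_comment' after other_column_name;

Example:

-- Rename sex to gender, change its type to int, and set its comment to 'gender'
alter table school_class_info change column sex gender int comment 'gender';

-- Change the type of age to int and move it after the name column
alter table school_class_info change age age int comment 'user age' after name;

Testing these statements produced the following error:

hive> alter table school_class_info change column sex gender int comment 'gender';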
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions :
gender
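
The error is triggered by changing sex from string to int, which the metastore treats as an incompatible type change. The following sketch shows two alternative ways around it; run one or the other, not both. It assumes the check is governed by the hive.metastore.disallow.incompatible.col.type.changes property. With a remote metastore, that property usually has to be changed in the metastore's own configuration, so the session-level SET shown here may have no effect in your environment.

```sql
-- Alternative 1: rename the column and update its comment, keeping the existing type
alter table school_class_info change column sex gender string comment 'gender';

-- Alternative 2 (assumption: the compatibility check can be relaxed for this session)
-- set hive.metastore.disallow.incompatible.col.type.changes=false;
-- alter table school_class_info change column sex gender int comment 'gender';
```

Because only metadata changes, values already stored in the data files are not converted. With alternative 2, stored values that cannot be read as int would come back as NULL.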

To change the order of a table's columns, alter table table_name change column is still the recommended approach.
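
The following is a sketch of a position-only change. It assumes age is still a string column, as in the desc output earlier, so the name and type are repeated unchanged and only the AFTER clause moves the column.

```sql
-- Move age directly after name without changing its name or type
alter table school_class_info change column age age string comment 'user age' after name;

-- Verify the new column order
desc school_class_info;
```

This updates the column order in the metadata only. The order of fields in the underlying text files is not rewritten, so reordering is safest when the data layout already matches the new order or the table is still empty.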