
Adding partition columns in Hive

Static partitioned tables:

Single-level partitioned table:

CREATE TABLE order_created_partition (
    orderNumber STRING
  , event_time  STRING
)
PARTITIONED BY (event_month string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

Loading data, method 1: load from a local or HDFS directory

load data local inpath '/home/spark/software/data/order_created.txt' overwrite into table order_created_partition PARTITION(event_month='2014-05');
select * from order_created_partition where event_month='2014-05';
+-----------------+-----------------------------+--------------+
|   ordernumber   |         event_time          | event_month  |
+-----------------+-----------------------------+--------------+
| 10703007267488  | 2014-05-01 06:01:12.334+01  | 2014-05      |
| 10101043505096  | 2014-05-01 07:28:12.342+01  | 2014-05      |
| 10103043509747  | 2014-05-01 07:50:12.33+01   | 2014-05      |
| 10103043501575  | 2014-05-01 09:27:12.33+01   | 2014-05      |
| 10104043514061  | 2014-05-01 09:03:12.324+01  | 2014-05      |
+-----------------+-----------------------------+--------------+
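
Method 1 also covers loading from HDFS, which differs from the local form only in dropping the LOCAL keyword. A minimal sketch, assuming a hypothetical HDFS path /tmp/order_created.txt that is not part of the original walkthrough:

```sql
-- Assumption: /tmp/order_created.txt is a hypothetical file that already
-- exists on HDFS. Without LOCAL, Hive resolves the path on the cluster
-- filesystem and moves (rather than copies) the file into the partition
-- directory. Without OVERWRITE, the rows are appended to the partition.
load data inpath '/tmp/order_created.txt'
into table order_created_partition
partition (event_month='2014-05');
```

Because the file is moved, it no longer exists at the source path after the statement succeeds.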

Loading data, method 2: manually upload the file to HDFS, then register the data as a partition of the partitioned table:

1) Create the HDFS directory: under /user/hive/warehouse/order_created_partition, create the directory event_month=2014-06

hadoop fs -mkdir /user/hive/warehouse/order_created_partition/event_month=2014-06

2) Copy the data into the newly created directory:

hadoop fs -put /home/spark/software/data/order_created.txt /user/hive/warehouse/order_created_partition/event_month=2014-06

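select * from order_created_partition where event_month='2014-06';

The query returns no rows, because the new directory has not yet been registered as a partition in the metastore.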

3) Add the new partition to the metastore:

msck repair table order_created_partition;

Log output:

Partitions not in metastore: order_created_partition:event_month=2014-06
Repair: Added partition to metastore order_created_partition:event_month=2014-06

Or: alter table order_created_partition add partition(event_month='2014-06');

select * from order_created_partition where event_month='2014-06'; 
+-----------------+-----------------------------+--------------+
|   ordernumber   |         event_time          | event_month  |
+-----------------+-----------------------------+--------------+
| 10703007267488  | 2014-05-01 06:01:12.334+01  | 2014-06      |
| 10101043505096  | 2014-05-01 07:28:12.342+01  | 2014-06      |
| 10103043509747  | 2014-05-01 07:50:12.33+01   | 2014-06      |
| 10103043501575  | 2014-05-01 09:27:12.33+01   | 2014-06      |
| 10104043514061  | 2014-05-01 09:03:12.324+01  | 2014-06      |
+-----------------+-----------------------------+--------------+
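
The manual registration in step 3 can also be written so that it is safe to re-run and states the directory explicitly. A minimal sketch, assuming the default warehouse path used in steps 1 and 2:

```sql
-- IF NOT EXISTS turns the statement into a no-op when the partition is
-- already registered, so running it twice does not raise an error.
-- LOCATION points at the directory created in step 1; it is optional when
-- the directory follows the default <table dir>/<column>=<value> layout.
alter table order_created_partition
add if not exists partition (event_month='2014-06')
location '/user/hive/warehouse/order_created_partition/event_month=2014-06';
```

Compared with msck repair table, which scans the whole table directory for unregistered partitions, this form registers exactly one partition.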

Loading data, method 3: insert into / insert overwrite from a select query

CREATE TABLE order_created_4_partition (
    orderNumber STRING
  , event_time  STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data local inpath '/home/spark/software/data/order_created.txt' overwrite into table order_created_4_partition;

insert into table order_created_partition partition(event_month='2014-07') select * from order_created_4_partition;
insert overwrite table order_created_partition partition(event_month='2014-07') select * from order_created_4_partition;

Compare:

insert overwrite table order_created_partition partition(event_month='2014-07') select ordernumber,event_time from order_created_4_partition;
insert overwrite table order_created_partition partition(event_month='2014-07') select event_time,ordernumber from order_created_4_partition;

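With the second statement the column values end up in the wrong columns. Hive maps the selected columns to the target table by position, so the order of the selected columns must match the column order of the table; the names do not have to match.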
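
One way to guard against this positional mismatch is to list the columns explicitly in the target table's order and then spot-check the result. A minimal sketch using the tables defined above; the alias s and the limit of 5 are arbitrary choices for illustration:

```sql
-- Columns are matched by position: orderNumber first, event_time second.
-- The aliases only document intent; they do not influence the mapping.
insert overwrite table order_created_partition partition (event_month='2014-07')
select s.ordernumber as orderNumber
     , s.event_time  as event_time
from order_created_4_partition s;

-- Spot-check a few rows to confirm the values landed in the intended columns.
select ordernumber, event_time
from order_created_partition
where event_month='2014-07'
limit 5;
```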

List all existing partitions of the partitioned table:

show partitions order_created_partition;

List a specific partition of the partitioned table:

SHOW PARTITIONS order_created_partition PARTITION(event_month='2014-06');

View the table's column information:

desc order_created_partition;
desc extended order_created_partition;
desc formatted order_created_partition;
desc formatted order_created_partition partition(event_month='2014-05');

Two-level partitioned table:

CREATE TABLE order_created_partition2 (
    orderNumber STRING
  , event_time  STRING
)
PARTITIONED BY (event_month string, step string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
show partitions order_created_partition2;

The result is empty, since no partition has been added yet.

load data local inpath '/home/spark/software/data/order_created.txt' into table order_created_partition2 partition(event_month='2014-09',step='1');  
show partitions order_created_partition2;
+-----------------------------+
|           result            |
+-----------------------------+
| event_month=2014-09/step=1  |
+-----------------------------+
insert overwrite table order_created_partition2 partition(event_month='2014-09',step='2') select * from order_created_4_partition;
show partitions order_created_partition2;
+-----------------------------+
|           result            |
+-----------------------------+
| event_month=2014-09/step=1  |
| event_month=2014-09/step=2  |
+-----------------------------+
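
With two partition levels, queries and partition listings can address either one level or both. A minimal sketch against the two partitions created above:

```sql
-- Filtering on both partition columns restricts the scan to a single
-- directory: event_month=2014-09/step=1
select * from order_created_partition2
where event_month='2014-09' and step='1';

-- A partial partition spec lists every step registered under that month.
show partitions order_created_partition2 partition (event_month='2014-09');
```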

Dynamic partitioned table

CREATE TABLE order_created_dynamic_partition (
    orderNumber STRING
  , event_time  STRING
)
PARTITIONED BY (event_month string)
;
insert into table order_created_dynamic_partition PARTITION (event_month)
select orderNumber, event_time, substr(event_time, 1, 7) as event_month from order_created;

This fails with the error:

FAILED: SemanticException [Error 10096]: Dynamic partition strict mode requires at least one static partition column. 
To turn this off set hive.exec.dynamic.partition.mode=nonstrict

Solution:

set hive.exec.dynamic.partition.mode=nonstrict;
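
Dynamic partitioning is governed by a few related session settings besides the mode. A hedged sketch of a typical preamble follows; the numeric limits are illustrative values rather than ones taken from the original, and defaults vary by Hive version:

```sql
-- Enable dynamic partitioning (already on by default in many Hive versions).
set hive.exec.dynamic.partition=true;
-- Allow every partition column to be dynamic.
set hive.exec.dynamic.partition.mode=nonstrict;
-- Illustrative caps on how many partitions a single statement may create,
-- in total and per mapper/reducer node.
set hive.exec.max.dynamic.partitions=1000;
set hive.exec.max.dynamic.partitions.pernode=100;
```

These settings apply only to the current session unless they are placed in the Hive configuration.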

Run it again:

insert into table order_created_dynamic_partition PARTITION (event_month)
select orderNumber, event_time, substr(event_time, 1, 7) as event_month from order_created;
select * from order_created_dynamic_partition;
+-----------------+-----------------------------+--------------+
|   ordernumber   |         event_time          | event_month  |
+-----------------+-----------------------------+--------------+
| 10703007267488  | 2014-05-01 06:01:12.334+01  | 2014-05      |
| 10101043505096  | 2014-05-01 07:28:12.342+01  | 2014-05      |
| 10103043509747  | 2014-05-01 07:50:12.33+01   | 2014-05      |
| 10103043501575  | 2014-05-01 09:27:12.33+01   | 2014-05      |
| 10104043514061  | 2014-05-01 09:03:12.324+01  | 2014-05      |
+-----------------+-----------------------------+--------------+
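
The strict-mode error asks for at least one static partition column, so static and dynamic columns can also be mixed in one statement. A minimal sketch against the two-level table order_created_partition2 defined earlier; the step value '3' is a hypothetical choice for illustration:

```sql
-- event_month is static and step is dynamic. Static partition columns must
-- precede dynamic ones in the PARTITION clause, and the dynamic column takes
-- its value from the last column of the SELECT.
-- Assumption: '3' is a made-up step value, not one from the original data.
insert overwrite table order_created_partition2
partition (event_month='2014-09', step)
select orderNumber
     , event_time
     , '3' as step
from order_created_4_partition;

show partitions order_created_partition2;
```

Because one partition column is static, this form satisfies the strict-mode requirement quoted in the error message.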