Hadoop Basics (46): DML Data Operations

阿新 • Published: 2020-07-22

1 Data Import

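1.1 Loading data into a table (Load)

1. Syntax

hive> load data [local] inpath '/opt/module/datas/student.txt' [overwrite] into table student [partition (partcol1=val1,…)];

(1) load data: loads data.
(2) local: loads the data from the local filesystem into the Hive table; without it, the data is loaded from HDFS.
(3) inpath: the path of the data to load.
(4) overwrite: overwrites the data already in the table; without it, the data is appended.
(5) into table: names the table to load into.
(6) student: the specific table.
(7) partition: loads into the specified partition.

2. Hands-on examples

(0) Create a table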

hive (default)> create table student(id string, name string)
 row format delimited fields terminated by '\t';

(1) Load a local file into Hive

hive (default)> load data local inpath '/opt/module/datas/student.txt' into table 
default.student;

(2) Load an HDFS file into Hive

Upload the file to HDFS:

hive (default)> dfs -put /opt/module/datas/student.txt /user/atguigu/hive;

Load the data from HDFS:

hive (default)> load data inpath '/user/atguigu/hive/student.txt' into table 
default.student;

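(3) Load data, overwriting the data already in the table

Upload the file to HDFS again (a load from HDFS moves the file into the table's directory, so the copy uploaded in step (2) is no longer at this path):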

hive (default)> dfs -put /opt/module/datas/student.txt /user/atguigu/hive;

Load the data, overwriting the data already in the table:

hive (default)> load data inpath '/user/atguigu/hive/student.txt' overwrite into table
default.student;
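
The partition clause from the syntax above is not exercised by these examples. A minimal sketch of loading into a single partition follows; the table name student_part is hypothetical and exists only for this illustration, while the file path and delimiter are the ones used throughout this section.

```sql
-- Hypothetical partitioned table, used only for this illustration.
create table if not exists student_part(id string, name string)
partitioned by (month string)
row format delimited fields terminated by '\t';

-- Load the local file into one specific partition of that table.
load data local inpath '/opt/module/datas/student.txt'
into table student_part partition (month='201709');

-- List the partitions to confirm that month=201709 now exists.
show partitions student_part;
```

The partition value is given in the load statement itself rather than read from the file, so every row in the file lands in the same partition.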

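1.2 Inserting data into a table with a query (Insert)

1. Create a partitioned table (this reuses the name student, so drop the table from Section 1.1 first if it still exists)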

hive (default)> create table student(id int, name string) partitioned by (month 
string) row format delimited fields terminated by '\t';

2. Basic insert

hive (default)> insert into table student partition(month='201709')
values(1,'wangwu'),(2,'zhaoliu');

3. Basic-mode insert (from the result of a single-table query)

hive (default)> insert overwrite table student partition(month='201708')
 select id, name from student where month='201709';

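insert into: appends the data to the table or partition; existing data is not deleted.
insert overwrite: overwrites the data that already exists in the table or partition.
Note: insert does not support inserting only some of the columns (newer Hive releases relax this by accepting a column list after the table name).

4. Multi-table (multi-partition) insert mode (from the results of several queries)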

hive (default)> from student
 insert overwrite table student partition(month='201707')
 select id, name where month='201709'
 insert overwrite table student partition(month='201706')
 select id, name where month='201709';

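1.3 Creating a table and loading data in a query (As Select)

See Section 4.5.1, Creating Tables. This creates a table from a query result (the rows returned by the query are added to the newly created table).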

create table if not exists student3
as select id, name from student;

1.4 Specifying the data path with Location when creating a table

1. Upload the data to HDFS

hive (default)> dfs -mkdir /student;
hive (default)> dfs -put /opt/module/datas/student.txt /student;

2. Create the table and specify its location on HDFS

hive (default)> create external table if not exists student5(
 id int, name string
 )
 row format delimited fields terminated by '\t'
 location '/student';

3. Query the data

hive (default)> select * from student5;

1.5 Importing data into a specified Hive table (Import)

Note: export the data with export first, then import it.

hive (default)> import table student2 partition(month='201709') from
'/user/hive/warehouse/export/student';
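
As a rough sketch of the export-then-import order noted above, the two statements can be run back to back. It reuses the table names and the HDFS path that appear in this article; the final query is an added check, not part of the original walkthrough.

```sql
-- Step 1: export the source table (data plus metadata) to an HDFS directory.
export table default.student to '/user/hive/warehouse/export/student';

-- Step 2: import one partition from that directory into student2.
import table student2 partition(month='201709') from
'/user/hive/warehouse/export/student';

-- Added check (not in the original): confirm the rows arrived.
select * from student2 where month='201709';
```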

2 Data Export

2.1 Insert export

1. Export query results to the local filesystem

hive (default)> insert overwrite local directory 
'/opt/module/datas/export/student'
 select * from student;

2. Export query results to the local filesystem with formatting

hive (default)> insert overwrite local directory
'/opt/module/datas/export/student1'
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
 select * from student;

3. Export query results to HDFS (no local keyword)

hive (default)> insert overwrite directory '/user/atguigu/student2'
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' 
 select * from student;

2.2 Exporting to the local filesystem with Hadoop commands

hive (default)> dfs -get /user/hive/warehouse/student/month=201709/000000_0
/opt/module/datas/export/student3.txt;

2.3 Exporting with Hive shell commands

Basic syntax: hive -f/-e <statement or script> > file

[atguigu@hadoop102 hive]$ bin/hive -e 'select * from default.student;' >
/opt/module/datas/export/student4.txt;

2.4 Exporting to HDFS with Export

hive (default)> export table default.student to
'/user/hive/warehouse/export/student';

export and import are mainly used to migrate Hive tables between two Hadoop clusters.

2.5 Sqoop export

3 Clearing Data from a Table

Note: truncate can only delete the data in managed tables; it cannot delete the data in external tables.

hive (default)> truncate table student;
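
Because truncate only applies to managed tables, it can help to check a table's type before running it. The sketch below is an added illustration that reuses student and student5 from earlier sections and assumes both still exist.

```sql
-- Inspect the table metadata; the "Table Type:" row reads
-- MANAGED_TABLE for a managed table and EXTERNAL_TABLE for an external one.
desc formatted student;
desc formatted student5;

-- Truncate only the table reported as MANAGED_TABLE.
truncate table student;
```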
