Hive Basics (19) - Interview Question: How to Export a Hive Partition Table's Partition Column to MySQL with Sqoop
阿新 • Published: 2020-12-22
Problem analysis:
- A Hive partition table is, at bottom, just files under multiple HDFS directories; exporting Hive data really means exporting those HDFS files.
- The partition column of a (statically) partitioned Hive table is not stored in the data files themselves, so Sqoop cannot export it directly.
Approach: create a temporary table in Hive and copy the partition table into it, so that the partition column becomes an ordinary column; then export that tmp table with Sqoop.
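The first point can be checked with a small local sketch (the path and row below are illustrative, not taken from a real cluster): the partition value is recoverable only from the directory name, while the data file carries just the non-partition columns.

```python
# A partition table's partition value lives in the directory name,
# not inside the data file (illustrative HDFS-style path).
path = "/user/hive/warehouse/hive_db.db/dept_partition/month=2019-10-19/dept.txt"
row = "10\tACCOUNTING\t1700"  # the file holds only the non-partition columns

def partition_value(path, col):
    """Recover a partition column's value from a 'col=value' path segment."""
    for segment in path.split("/"):
        if segment.startswith(col + "="):
            return segment.split("=", 1)[1]
    return None

print(partition_value(path, "month"))  # → 2019-10-19
print(row.split("\t"))                 # deptno, dname, loc only — no month
```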
The steps are as follows:
1. Create the target table (the partition table)
hive> CREATE TABLE `dept_partition`(
        `deptno` int,
        `dname` string,
        `loc` string)
      PARTITIONED BY (`month` string)
      row format delimited fields terminated by '\t';
1.1 Check the table structure
hive> show create table dept_partition;
+----------------------------------------------------+--+
| createtab_stmt                                     |
+----------------------------------------------------+--+
| CREATE TABLE `dept_partition`(                     |
|   `deptno` int,                                    |
|   `dname` string,                                  |
|   `loc` string)                                    |
| PARTITIONED BY (                                   |
|   `month` string)                                  |
2. Load the data (a static partition table needs a partition spec on each load; the two partitions queried below come from two loads)
hive> load data inpath '/user/hive/hive_db/data/dept.txt' into table dept_partition partition(month='2019-10-19');
hive> load data inpath '/user/hive/hive_db/data/dept.txt' into table dept_partition partition(month='2019-10-20');
Contents of dept.txt (tab-separated):
10	ACCOUNTING	1700
20	RESEARCH	1800
30	SALES	1900
40	OPERATIONS	1700
3. Query dept_partition
hive> select * from dept_partition;
+------------------------+-----------------------+---------------------+-----------------------+--+
| dept_partition.deptno  | dept_partition.dname  | dept_partition.loc  | dept_partition.month  |
+------------------------+-----------------------+---------------------+-----------------------+--+
| 10                     | ACCOUNTING            | 1700                | 2019-10-19            |
| 20                     | RESEARCH              | 1800                | 2019-10-19            |
| 30                     | SALES                 | 1900                | 2019-10-19            |
| 40                     | OPERATIONS            | 1700                | 2019-10-19            |
| 10                     | ACCOUNTING            | 1700                | 2019-10-20            |
| 20                     | RESEARCH              | 1800                | 2019-10-20            |
| 30                     | SALES                 | 1900                | 2019-10-20            |
| 40                     | OPERATIONS            | 1700                | 2019-10-20            |
+------------------------+-----------------------+---------------------+-----------------------+--+
4. Create the temporary table tmp_dept_partition
hive> create table tmp_dept_partition as select * from dept_partition;
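One detail worth noting: a CTAS table created without a ROW FORMAT clause uses Hive's default field delimiter '\001' (Ctrl-A), not the '\t' of the original table, which is why the Sqoop export in step 8 passes --input-fields-terminated-by "\001". A quick local sketch of how such a line splits:

```python
# Hive's default field delimiter for a CTAS table is '\001' (Ctrl-A).
DELIM = "\u0001"
line = DELIM.join(["10", "ACCOUNTING", "1700", "2019-10-19"])
fields = line.split(DELIM)
print(fields)  # → ['10', 'ACCOUNTING', '1700', '2019-10-19']
```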
5. Query the temporary table
hive> select * from tmp_dept_partition;
+----------------------------+---------------------------+-------------------------+---------------------------+--+
| tmp_dept_partition.deptno  | tmp_dept_partition.dname  | tmp_dept_partition.loc  | tmp_dept_partition.month  |
+----------------------------+---------------------------+-------------------------+---------------------------+--+
| 10                         | ACCOUNTING                | 1700                    | 2019-10-19                |
| 20                         | RESEARCH                  | 1800                    | 2019-10-19                |
| 30                         | SALES                     | 1900                    | 2019-10-19                |
| 40                         | OPERATIONS                | 1700                    | 2019-10-19                |
| 10                         | ACCOUNTING                | 1700                    | 2019-10-20                |
| 20                         | RESEARCH                  | 1800                    | 2019-10-20                |
| 30                         | SALES                     | 1900                    | 2019-10-20                |
| 40                         | OPERATIONS                | 1700                    | 2019-10-20                |
+----------------------------+---------------------------+-------------------------+---------------------------+--+
6. Check the table structure (at this point the partition table has become a non-partitioned table)
hive> show create table tmp_dept_partition;
+----------------------------------------------------+--+
| createtab_stmt                                     |
+----------------------------------------------------+--+
| CREATE TABLE `tmp_dept_partition`(                 |
|   `deptno` int,                                    |
|   `dname` string,                                  |
|   `loc` string,                                    |
|   `month` string)                                  |
7. Create the dept_partition table in MySQL
mysql> drop table if exists dept_partition;
mysql> create table dept_partition(
         `deptno` int,
         `dname` varchar(20),
         `loc` varchar(20),
         `month` varchar(50));
8. Export to MySQL with Sqoop
bin/sqoop export \
  --connect jdbc:mysql://hadoop01:3306/partitionTb \
  --username root \
  --password 123456 \
  --table dept_partition \
  --num-mappers 1 \
  --export-dir /user/hive/warehouse/hive_db.db/tmp_dept_partition \
  --input-fields-terminated-by "\001"
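Put together, the trick the export relies on can be simulated locally (hypothetical in-memory data, no Hive or Sqoop required): fold the partition value encoded in each directory name back into its rows, producing the '\001'-delimited lines that the export reads from tmp_dept_partition.

```python
# Simulate steps 4-8: merge the directory-encoded partition value back into
# each row, yielding the '\001'-delimited lines Sqoop exports.
DELIM = "\u0001"
partitions = {  # directory name -> tab-separated file rows (illustrative)
    "month=2019-10-19": ["10\tACCOUNTING\t1700", "20\tRESEARCH\t1800"],
    "month=2019-10-20": ["10\tACCOUNTING\t1700", "20\tRESEARCH\t1800"],
}

export_lines = []
for part_dir, rows in partitions.items():
    month = part_dir.split("=", 1)[1]  # partition value from the dir name
    for row in rows:
        export_lines.append(DELIM.join(row.split("\t") + [month]))

print(export_lines[0].split(DELIM))  # → ['10', 'ACCOUNTING', '1700', '2019-10-19']
```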
9. Query MySQL to verify the export succeeded
mysql> select * from dept_partition;
+--------+------------+------+------------+
| deptno | dname      | loc  | month      |
+--------+------------+------+------------+
| 10     | ACCOUNTING | 1700 | 2019-10-19 |
| 20     | RESEARCH   | 1800 | 2019-10-19 |
| 30     | SALES      | 1900 | 2019-10-19 |
| 40     | OPERATIONS | 1700 | 2019-10-19 |
| 10     | ACCOUNTING | 1700 | 2019-10-20 |
| 20     | RESEARCH   | 1800 | 2019-10-20 |
| 30     | SALES      | 1900 | 2019-10-20 |
| 40     | OPERATIONS | 1700 | 2019-10-20 |
+--------+------------+------+------------+