
Importing MySQL Data into Hive with Sqoop on CDH 5.12.1

Environment: CDH 5.12.1, CentOS 7

1. Permissions

Set dfs.permissions to false (the option can be unchecked on the configuration page).

2. Leave safe mode so that HDFS allows reads and writes

hdfs dfsadmin -safemode leave
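
Forcing the NameNode out of safe mode is only needed when safe mode is actually on. A minimal sketch of checking first, assuming the hdfs client is on the PATH and that `hdfs dfsadmin -safemode get` reports a line containing "Safe mode is ON" or "Safe mode is OFF"; the helper name `safemode_is_on` is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) when the status text reports safe mode ON.
safemode_is_on() {
  case "$1" in
    *"Safe mode is ON"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Only talk to the cluster when the hdfs client is installed.
if command -v hdfs >/dev/null 2>&1; then
  status="$(hdfs dfsadmin -safemode get)"
  if safemode_is_on "$status"; then
    hdfs dfsadmin -safemode leave
  fi
fi
```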

3. Create the Hive table

drop table if exists default.opportunity;

create table default.opportunity(id BIGINT,consultant_account STRING,first_consultant_account STRING,group_id BIGINT,
first_group_id BIGINT,sale_department_id BIGINT,first_sale_department_id BIGINT,legion_id BIGINT, first_legion_id BIGINT,
business_id  BIGINT,student_id BIGINT,province_id BIGINT,city_id BIGINT,create_user STRING, online_group_id BIGINT,
online_center_id BIGINT, create_time TIMESTAMP,allocate_time TIMESTAMP,apply_time TIMESTAMP,  auto_apply_time TIMESTAMP
) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

4. Full import with Sqoop

sqoop import \
--connect jdbc:mysql://192.168.75.101:3306/dragnet \
--username root \
--password yang156122 \
--query 'select id,consultant_account,first_consultant_account,group_id,first_group_id,sale_department_id,first_sale_department_id,legion_id,first_legion_id,business_id,student_id,province_id,city_id,create_user,online_group_id,online_center_id,create_time,allocate_time,apply_time,auto_apply_time from opportunity where $CONDITIONS' \
--target-dir /user/sqoop2/opportunity \
--delete-target-dir \
--num-mappers 1 \
--compress \
--compression-codec org.apache.hadoop.io.compress.SnappyCodec \
--direct \
--fields-terminated-by '\t'
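
Passing the password with --password leaves it visible in the shell history and the process list. Sqoop also accepts --password-file, and it reads the whole file as the password, so the file must not end with a newline. A minimal sketch of preparing such a file; the file location and the helper name `write_password_file` are hypothetical, and the `file://` prefix assumes the file lives on the local filesystem rather than on HDFS:

```shell
#!/usr/bin/env bash
# Hypothetical location for the password file; adjust to your environment.
PW_FILE="${PW_FILE:-$HOME/.sqoop-mysql.pw}"

# Write the password without a trailing newline, readable by the owner only.
write_password_file() {
  local password="$1" file="$2"
  rm -f "$file"                                      # a previous copy may be read-only
  ( umask 077; printf '%s' "$password" > "$file" )   # printf '%s' adds no newline
  chmod 400 "$file"
}

# Usage, replacing --password in the import command above:
#   write_password_file 'your-password' "$PW_FILE"
#   sqoop import --connect jdbc:mysql://192.168.75.101:3306/dragnet \
#     --username root --password-file "file://$PW_FILE" ...
```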

5. Incremental import with Sqoop

a) Incremental by id

# Import only rows whose id is greater than 3
sqoop import --connect jdbc:mysql://192.168.75.101:3306/dragnet \
--username root \
--password yang156122 \
--table data \
--target-dir '/soft/hive/warehouse/data' \
--incremental append \
--check-column id \
--last-value 3 \
-m 1
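
The value for --last-value has to come from somewhere on each run. One possible approach is to read it back from the rows already imported. This is only a sketch, assuming the imported files are uncompressed text, comma-delimited (Sqoop's default when no --fields-terminated-by is given), and carry the id in the first column; `max_id` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read delimited rows on stdin and print the largest
# numeric value found in the first column (prints nothing for empty input).
max_id() {
  awk -F',' 'NR == 1 || $1 + 0 > max + 0 { max = $1 } END { if (NR > 0) print max }'
}

# Example against the target directory used above (needs a working hdfs client):
#   last_value="$(hdfs dfs -cat '/soft/hive/warehouse/data/part-*' | max_id)"
```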

b) Merging rows that have been modified

sqoop import --connect jdbc:mysql://192.168.0.8:3306/hxy \
--username root \
--password 123456 \
--table data \
--target-dir '/soft/hive/warehouse/data' \
--check-column last_mod \
--incremental lastmodified  \
--last-value '2019-08-30 17:05:49' \
-m 1 \
--merge-key id
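
With --merge-key id, a newly imported row replaces the older row that has the same id, and rows with new ids are added. The toy script below only illustrates that outcome on local files; it is not how Sqoop performs the merge, which runs as a MapReduce job. It assumes comma-delimited rows with the key in the first column, and `merge_by_key` is a hypothetical name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: $1 = previously imported rows, $2 = newly imported rows.
# Later rows overwrite earlier ones that share the same key (column 1);
# the result is printed sorted numerically by key.
merge_by_key() {
  awk -F',' '{ row[$1] = $0 } END { for (k in row) print row[k] }' "$1" "$2" |
    sort -t',' -k1,1n
}
```

For example, merging an old file containing ids 1 and 2 with a new file containing ids 2 and 3 keeps row 1 as it was, takes row 2 from the new file, and adds row 3.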

6. Load the data into Hive

load data inpath '/user/sqoop2/opportunity' into table default.opportunity;

7. Write the scheduled job, then restart crond

/bin/systemctl restart  crond.service
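
The crontab entry itself is not shown here. A minimal sketch of what registering one might look like; the schedule, the script path /opt/etl/import_opportunity.sh, and the log path are all hypothetical, as is the helper `cron_line`:

```shell
#!/usr/bin/env bash
# Hypothetical helper: build one crontab line from a schedule, a command,
# and a log file that receives both stdout and stderr.
cron_line() {
  printf '%s %s >> %s 2>&1\n' "$1" "$2" "$3"
}

# Example: run a (hypothetical) import script every day at 02:00 by
# appending the generated line to the current user's crontab.
#   ( crontab -l 2>/dev/null; \
#     cron_line '0 2 * * *' /opt/etl/import_opportunity.sh /var/log/import_opportunity.log \
#   ) | crontab -
```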