CDH 5.12.1: Importing MySQL Data into Hive with Sqoop
Posted by 阿新 on 2020-10-23
Environment: CDH 5.12.1, CentOS 7
1. Permissions
Set dfs.permissions to false (in Cloudera Manager this can be done by unchecking the corresponding box on the HDFS configuration page).
2. Leave safe mode so that HDFS allows reads and writes
hdfs dfsadmin -safemode leave
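If you are unsure whether the NameNode is actually in safe mode, the state can be queried first (an extra check, not part of the original steps):

```shell
# Prints "Safe mode is ON" or "Safe mode is OFF"
hdfs dfsadmin -safemode get
```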
3. Create the Hive table
drop table if exists default.opportunity;
create table default.opportunity (
  id BIGINT,
  consultant_account STRING,
  first_consultant_account STRING,
  group_id BIGINT,
  first_group_id BIGINT,
  sale_department_id BIGINT,
  first_sale_department_id BIGINT,
  legion_id BIGINT,
  first_legion_id BIGINT,
  business_id BIGINT,
  student_id BIGINT,
  province_id BIGINT,
  city_id BIGINT,
  create_user STRING,
  online_group_id BIGINT,
  online_center_id BIGINT,
  create_time TIMESTAMP,
  allocate_time TIMESTAMP,
  apply_time TIMESTAMP,
  auto_apply_time TIMESTAMP
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
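After creating the table, it can be useful to confirm the delimiter and storage location Hive recorded (a verification step, not in the original post):

```shell
# Show the table's storage details; "field.delim" should be \t
hive -e 'describe formatted default.opportunity;'
```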
4. Full import with Sqoop
sqoop import \
  --connect jdbc:mysql://192.168.75.101:3306/dragnet \
  --username root \
  --password yang156122 \
  --query 'select id, consultant_account, first_consultant_account, group_id, first_group_id, sale_department_id, first_sale_department_id, legion_id, first_legion_id, business_id, student_id, province_id, city_id, create_user, online_group_id, online_center_id, create_time, allocate_time, apply_time, auto_apply_time from opportunity where $CONDITIONS' \
  --target-dir /user/sqoop2/opportunity \
  --delete-target-dir \
  --num-mappers 1 \
  --compress \
  --compression-codec org.apache.hadoop.io.compress.SnappyCodec \
  --direct \
  --fields-terminated-by '\t'
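Once the import finishes, the output can be spot-checked on HDFS (a verification sketch; the part-file naming is an assumption based on Sqoop's defaults):

```shell
# List the files Sqoop produced under the target directory
hdfs dfs -ls /user/sqoop2/opportunity
# hdfs dfs -text decompresses Snappy-coded text files, so the rows are readable
hdfs dfs -text /user/sqoop2/opportunity/part-m-* | head -n 5
```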
5. Incremental import with Sqoop
a) Append new rows by id
sqoop import \
  --connect jdbc:mysql://192.168.75.101:3306/dragnet \
  --username root \
  --password yang156122 \
  --table data \
  --target-dir '/soft/hive/warehouse/data' \
  --incremental append \
  --check-column id \
  --last-value 3 \
  -m 1
With --last-value 3, only rows whose id is greater than 3 are imported.
b) Merge rows that have been modified
sqoop import \
  --connect jdbc:mysql://192.168.0.8:3306/hxy \
  --username root \
  --password 123456 \
  --table data \
  --target-dir '/soft/hive/warehouse/data' \
  --check-column last_mod \
  --incremental lastmodified \
  --last-value '2019-08-30 17:05:49' \
  -m 1 \
  --merge-key id
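Re-running the incremental command by hand means updating --last-value each time. An alternative worth noting (a standard Sqoop feature, not shown in the original post; the job name and password-file path here are assumptions) is a saved Sqoop job, which stores the last checked value in the Sqoop metastore automatically:

```shell
# Create a saved job; Sqoop records the new --last-value after each run.
# Connection details reuse the example values from above; the password is
# read from an HDFS file instead of being stored in the job definition.
sqoop job --create opportunity_incr -- import \
  --connect jdbc:mysql://192.168.0.8:3306/hxy \
  --username root \
  --password-file /user/sqoop/mysql.pwd \
  --table data \
  --target-dir '/soft/hive/warehouse/data' \
  --check-column last_mod \
  --incremental lastmodified \
  --merge-key id \
  -m 1

# Execute the job (e.g. from cron); the stored last-value is used and updated.
sqoop job --exec opportunity_incr
```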
6. Load the data into Hive
load data inpath '/user/sqoop2/opportunity' into table default.opportunity;
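A quick sanity check after the load (an addition, not in the original post) is to count the rows Hive can now see:

```shell
# LOAD DATA moves the files into the table's directory; count the rows
hive -e 'select count(*) from default.opportunity;'
```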
7. Set up the scheduled task (cron job) and restart the cron daemon
/bin/systemctl restart crond.service
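The post does not show the cron entry itself; a minimal sketch (the script path, log path, and schedule are assumptions) that runs the incremental import nightly at 01:00:

```shell
# crontab entry (crontab -e): minute hour day-of-month month day-of-week command
# Runs the incremental Sqoop import every night at 01:00 and appends output to a log.
0 1 * * * /opt/scripts/sqoop_incremental.sh >> /var/log/sqoop_incremental.log 2>&1
```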