Importing Data into Hive and HDFS with Sqoop

阿新 • Published: 2019-02-19

I: Importing data with Sqoop

1. Importing MySQL data into HDFS (direct import)

Step 1: Make sure the MySQL service is running

    service mysql status

Step 2: Create a table in MySQL

    mysql> create database company;
    mysql> use company;
    mysql> create table staff(
               id int(4) primary key not null auto_increment,
               name varchar(255) not null,
               sex varchar(255) not null);
    mysql> insert into staff(name, sex) values('Thomas', 'Male');

Step 3: Import the data into HDFS with Sqoop

1. Import the whole MySQL table into HDFS

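    $ bin/sqoop import \
        --connect jdbc:mysql://hadoop102:3306/company \
        --username root \
        --password 123456 \
        --table staff \
        --target-dir /user/company \
        --delete-target-dir \
        --num-mappers 1 \
        --fields-terminated-by "\t"

--num-mappers 1 can also be written as -m 1. If the command stopped after --table staff, with no --target-dir and no --fields-terminated-by, Sqoop would write to the default location under the current user's HDFS home directory (/user/robot/staff/ in this setup) and separate fields with commas.

2. Import part of the MySQL data with a query

    $ bin/sqoop import \
        --connect jdbc:mysql://hadoop-senior01.itguigu.com:3306/company \
        --username root \
        --password 123456 \
        --target-dir /user/company \
        --delete-target-dir \
        --num-mappers 1 \
        --fields-terminated-by "\t" \
        --query 'select name,sex from staff where id >= 2 and $CONDITIONS;'

The query is wrapped in single quotes so that the shell passes $CONDITIONS to Sqoop literally instead of expanding it as a shell variable.

3. Import selected columns (this also counts as a query import)

    $ bin/sqoop import \
        --connect jdbc:mysql://hadoop-senior01.robot.com:3306/company \
        --username root \
        --password 123456 \
        --target-dir /user/company \
        --delete-target-dir \
        --num-mappers 1 \
        --fields-terminated-by "\t" \
        --columns id,sex \
        --table staff

The column list must not contain spaces: written as "id, sex", the shell would split it into two separate arguments.

4. Import with a row filter

    $ bin/sqoop import \
        --connect jdbc:mysql://hadoop-senior01.robot.com:3306/company \
        --username root \
        --password 123456 \
        --target-dir /user/company \
        --delete-target-dir \
        --num-mappers 1 \
        --fields-terminated-by "\t" \
        --table staff \
        --where "id=3"

2. Importing MySQL data into Hive

Sqoop actually imports the data into HDFS first, then creates a table in Hive (the table does not have to exist beforehand), and finally loads the data into that table. You can also create the table in advance and import into it.

    $ bin/sqoop import \
        --connect jdbc:mysql://hadoop102:3306/company \
        --username root \
        --password 123456 \
        --table staff \
        --target-dir /user/company \
        --hive-import \
        -m 1 \
        --fields-terminated-by "\t" \
        --hive-table company.staff_hive \
        --hive-overwrite

--hive-table names the Hive table to load into, here staff_hive in the company database. It does not matter whether staff_hive has already been created.

How the import proceeds:

1. The underlying MapReduce job first uploads the files to the target HDFS directory, /user/company/ (while the job is running you can see the uploaded files there).
2. Hive then creates the corresponding table based on the source data.
3. The data in HDFS is loaded into the Hive table. (At this point the files are gone from /user/company/ because they have been moved into the Hive table; the new files are under the hive/warehouse directory.)
4. Run bin/hive -e "select * from company.staff_hive" to view the data, or look under the /user/hive/warehouse directory.

II: Exporting data with Sqoop

1. Exporting data from HDFS to an RDBMS

- The target table must already exist in the target database before the export.
- By default, Sqoop inserts the data from the files into the table using INSERT statements.
- In update mode, Sqoop generates UPDATE statements to update the table data.

The procedure is:

1. Create an empty table in the database whose columns match the fields of the HDFS data.
2. Run the export command.
3. Verify the table from the mysql command line.

Example:

0. The data is in the emp_data file under the emp/ directory in HDFS (/user/hadoop/emp/ in the command below).

1. Create the table according to the fields of the HDFS data

    mysql> USE test;
    mysql> CREATE TABLE employee (
               id INT NOT NULL PRIMARY KEY,
               name VARCHAR(20),
               deg VARCHAR(20),
               salary INT,
               dept VARCHAR(10));

2. Run the export command

    $ bin/sqoop export \
        --connect jdbc:mysql://hadoop102/test \
        --username root \
        --password root \
        --table employee \
        --export-dir /user/hadoop/emp/

3. Check the result in MySQL

    mysql> select * from employee;

III: Sqoop script files (*.opt)

An .opt file packages a Sqoop command so that it can be run as a script. Inside the file, every option and every value must be on its own line.

Step 1: Create a .opt file

Step 2: Write the Sqoop script

    export
    --connect
    jdbc:mysql://hadoop-senior01.robot.com:3306/company
    --username
    root
    --password
    123456
    --table
    staff_mysql
    --num-mappers
    1
    --export-dir
    /user/hive/warehouse/company.db/staff_hive
    --input-fields-terminated-by
    "\t"

Step 3: Run the script

    $ bin/sqoop --options-file opt/job_hffs2rdbms.opt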
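Steps 1 and 2 of the .opt workflow above can themselves be scripted. The following is a minimal sketch, assuming a POSIX shell run from the Sqoop installation directory. It writes the options file from a heredoc with one option or value per line. The OPT_DIR variable and the closing option count are illustrative additions for this example, not part of Sqoop; the connection details are the ones used in this article.

```shell
#!/bin/sh
set -e

# Hypothetical override: set OPT_DIR to write somewhere other than ./opt
opt_dir="${OPT_DIR:-opt}"
opt_file="$opt_dir/job_hffs2rdbms.opt"

mkdir -p "$opt_dir"

# The quoted 'EOF' keeps the shell from expanding anything inside the
# heredoc, so "\t" reaches the file exactly as typed.
cat > "$opt_file" <<'EOF'
# Tool name first, then each option and each value on its own line
export
--connect
jdbc:mysql://hadoop-senior01.robot.com:3306/company
--username
root
--password
123456
--table
staff_mysql
--num-mappers
1
--export-dir
/user/hive/warehouse/company.db/staff_hive
--input-fields-terminated-by
"\t"
EOF

# Quick sanity check: report how many option lines were written
echo "wrote $opt_file with $(grep -c '^--' "$opt_file") options"
```

Once the file exists, it is run exactly as in Step 3, with bin/sqoop --options-file "$opt_file". Keeping the file under version control makes it easier to rerun the same export from a scheduled job.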
