Sqoop: importing/exporting tables and data to Hive

Notes:

Copy a relational database table and its data into Hive.
sqoop import: RDBMS ——> Hive
Syntax:
sqoop import --connect jdbc:mysql://IP:PORT/database --username root --password PWD --table tablename --hive-import --hive-table hivetable -m 1
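A variant worth knowing (a sketch using standard Sqoop 1.4.x options, not taken from the run below): -P prompts for the password instead of exposing it on the command line (the warning in the log below flags exactly this), --hive-overwrite replaces the Hive table's contents on re-import, and --hive-drop-import-delims strips Hive's delimiter characters out of imported string fields:

sqoop import --connect jdbc:mysql://IP:PORT/database --username root -P --table tablename --hive-import --hive-table hivetable --hive-overwrite --hive-drop-import-delims -m 1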

[[email protected] src]$ sqoop import --connect jdbc:mysql://192.168.2.251:3306/from_66internet?characterEncoding=UTF-8 --username root --password root --table t_user --hive-import --hive-table hivetest1 -m 1

The run produces the following output:

16/04/22 15:16:48 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
16/04/22 15:16:48 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/04/22 15:16:48 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
16/04/22 15:16:48 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
16/04/22 15:16:48 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/04/22 15:16:48 INFO tool.CodeGenTool: Beginning code generation
16/04/22 15:16:49 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t_user` AS t LIMIT 1
16/04/22 15:16:49 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t_user` AS t LIMIT 1
16/04/22 15:16:49 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hadoop/hadoop/src/hadoop-2.5.0
Note: /tmp/sqoop-hadoop/compile/14d08d5f7ecc890e985eeb17942619db/t_user.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/04/22 15:16:54 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/14d08d5f7ecc890e985eeb17942619db/t_user.jar
16/04/22 15:16:54 WARN manager.MySQLManager: It looks like you are importing from mysql.
16/04/22 15:16:54 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
16/04/22 15:16:54 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
16/04/22 15:16:54 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
16/04/22 15:16:54 INFO mapreduce.ImportJobBase: Beginning import of t_user
16/04/22 15:16:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/04/22 15:16:55 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/04/22 15:16:56 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/04/22 15:16:56 INFO client.RMProxy: Connecting to ResourceManager at nameNode/192.168.2.246:8032
16/04/22 15:17:00 INFO db.DBInputFormat: Using read commited transaction isolation
16/04/22 15:17:00 INFO mapreduce.JobSubmitter: number of splits:1
16/04/22 15:17:01 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1461290770642_0013
16/04/22 15:17:01 INFO impl.YarnClientImpl: Submitted application application_1461290770642_0013
16/04/22 15:17:01 INFO mapreduce.Job: The url to track the job: http://nameNode:8088/proxy/application_1461290770642_0013/
16/04/22 15:17:01 INFO mapreduce.Job: Running job: job_1461290770642_0013
16/04/22 15:17:16 INFO mapreduce.Job: Job job_1461290770642_0013 running in uber mode : false
16/04/22 15:17:16 INFO mapreduce.Job:  map 0% reduce 0%
16/04/22 15:17:25 INFO mapreduce.Job:  map 100% reduce 0%
16/04/22 15:17:25 INFO mapreduce.Job: Job job_1461290770642_0013 completed successfully
16/04/22 15:17:25 INFO mapreduce.Job: Counters: 30
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=118487
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=87
                HDFS: Number of bytes written=1111
                HDFS: Number of read operations=4
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters 
                Launched map tasks=1
                Other local map tasks=1
                Total time spent by all maps in occupied slots (ms)=6570
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=6570
                Total vcore-seconds taken by all map tasks=6570
                Total megabyte-seconds taken by all map tasks=6727680
        Map-Reduce Framework
                Map input records=11
                Map output records=11
                Input split bytes=87
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=95
                CPU time spent (ms)=1420
                Physical memory (bytes) snapshot=107536384
                Virtual memory (bytes) snapshot=2064531456
                Total committed heap usage (bytes)=30474240
        File Input Format Counters 
                Bytes Read=0
        File Output Format Counters 
                Bytes Written=1111
16/04/22 15:17:25 INFO mapreduce.ImportJobBase: Transferred 1.085 KB in 29.4684 seconds (37.7015 bytes/sec)
16/04/22 15:17:25 INFO mapreduce.ImportJobBase: Retrieved 11 records.
16/04/22 15:17:25 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t_user` AS t LIMIT 1
16/04/22 15:17:25 WARN hive.TableDefWriter: Column create_date had to be cast to a less precise type in Hive
16/04/22 15:17:26 INFO hive.HiveImport: Loading uploaded data into Hive
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
16/04/22 15:17:28 INFO hive.HiveImport: 16/04/22 15:17:28 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
16/04/22 15:17:29 INFO hive.HiveImport: 
16/04/22 15:17:29 INFO hive.HiveImport: Logging initialized using configuration in jar:file:/home/hadoop/hadoop/src/hive-0.12.0-bin/lib/hive-common-0.12.0.jar!/hive-log4j.properties
16/04/22 15:17:29 INFO hive.HiveImport: SLF4J: Class path contains multiple SLF4J bindings.
16/04/22 15:17:29 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/home/hadoop/hadoop/src/hadoop-2.5.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
16/04/22 15:17:29 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/home/hadoop/hadoop/src/hive-0.12.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
16/04/22 15:17:29 INFO hive.HiveImport: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
16/04/22 15:17:29 INFO hive.HiveImport: SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/04/22 15:17:39 INFO hive.HiveImport: OK
16/04/22 15:17:39 INFO hive.HiveImport: Time taken: 8.927 seconds
16/04/22 15:17:39 INFO hive.HiveImport: Loading data to table default.hivetest1
16/04/22 15:17:39 INFO hive.HiveImport: Table default.hivetest1 stats: [num_partitions: 0, num_files: 2, num_rows: 0, total_size: 1111, raw_data_size: 0]
16/04/22 15:17:39 INFO hive.HiveImport: OK
16/04/22 15:17:39 INFO hive.HiveImport: Time taken: 0.789 seconds
16/04/22 15:17:40 INFO hive.HiveImport: Hive import complete.
16/04/22 15:17:40 INFO hive.HiveImport: Export directory is empty, removing it.

Output like this means the Sqoop import into Hive succeeded.

Querying Hive to verify the result:

hive> show databases;
OK
default
Time taken: 6.138 seconds, Fetched: 1 row(s)
hive> use default;
OK
Time taken: 0.038 seconds
hive> show tables;
OK
hivetest1
test
Time taken: 0.336 seconds, Fetched: 2 row(s)
hive> select * from hivetest1;
OK
6       2014-01-12 13:36:19.0   [email protected]   admin       f6fdffe48c908deb0f4c3bd36c032e72   1234567890   1   admin
45      2015-04-01 23:55:52.0   [email protected]   internet4   6512bd43d9caa6e02c990b0a82652dca   134567890    1   1
46      2015-04-02 17:10:18.0   [email protected]   internet3   b59c67bf196a4758191e42f76670ceba   1234567890   1   111
47      2015-04-05 00:43:11.0   [email protected]   lm          94a4ebeb834f927228209c2dd71c10e5   1234567890   1   lm
48      2015-04-05 03:39:19.0   [email protected]   internet1   c6c73fee8fd9af7d3f1abd1611a40b32   123456790-   1   internet1
49      2015-04-05 03:39:53.0   [email protected]   internet2   79e5b03556b7e87b508b5977d5f3f7e0   internet2    1   internet2
50      2015-04-05 03:41:50.0   [email protected]   internet5   9ac010daa64dc3065b7b6eb584d231ae   123456780    1   internet5
51      2015-04-05 03:42:24.0   [email protected]   internet6   1614782b1a29db28ce58ce1173d97ee0   123456780    1   internet6
52      2015-04-05 03:42:55.0   [email protected]   internet7   6f3edab07447258b53b2f821187f6f1f   123456780    1   internet7
53      2015-04-05 03:43:21.0   [email protected]   internet8   c7ef3b5c4827d840d0c264a1c06f1717   123456780    1   internet8
54      2015-04-05 03:43:53.0   [email protected]   internet9   33c7785a99eeda46b8feb7a18b8dc45    123456780    1   internet9
Time taken: 3.335 seconds, Fetched: 11 row(s)
hive>

sqoop export: Hive ——> MySQL
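The general shape mirrors the import syntax above (a sketch; the concrete command, and the delimiter pitfall it hits, follow below):

sqoop export --connect jdbc:mysql://IP:PORT/database --username root --password PWD --table tablename --export-dir /path/on/hdfs -m 1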

1. Query the data in the Hive table test:
hive> select * from test;
OK
1       iPhone6s        132G    4700
2       iPhone6s        016G    4000
3       iPhone6s        12G     2400
4       iPhone5s        132G    400
5       iPhone4s        132G    400
6       iPhone7s        132G    400
7       xiaomiNoto      016G    700
8       huaweiX7        400G    7
9       sanxingX7s      100G    2
10      xiaomiH2        132G    5000
2. The Hive table's data file on HDFS (this is the table data itself, not metadata — Hive keeps metadata in the metastore):
[[email protected] ~]$ hadoop fs -cat /user/hive/warehouse/test/ce.txt
16/04/22 15:44:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
1       iPhone6s        132G    4700
2       iPhone6s        016G    4000
3       iPhone6s        12G     2400
4       iPhone5s        132G    400
5       iPhone4s        132G    400
6       iPhone7s        132G    400
7       xiaomiNoto      016G    700
8       huaweiX7        400G    7
9       sanxingX7s      100G    2
10      xiaomiH2        132G    5000
3. Export the Hive table to MySQL.
Run: sqoop export --connect jdbc:mysql://192.168.2.251:3306/from_66internet?characterEncoding=UTF-8 --username root --password root --table test --export-dir /user/hive/warehouse/test/ce.txt

This fails with the following error:

Error: java.io.IOException: Can't export data, please check failed map task logs
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.RuntimeException: Can't parse input data: '1       iPhone6s        132G    4700'
        at test.__loadFromFields(test.java:335)
        at test.parse(test.java:268)
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
        ... 10 more
Caused by: java.lang.NumberFormatException: For input string: "1        iPhone6s        132G    4700"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:580)
        at java.lang.Integer.valueOf(Integer.java:766)
        at test.__loadFromFields(test.java:317)
        ... 12 more

16/04/22 16:04:27 INFO mapreduce.Job: Task Id : attempt_1461290770642_0018_m_000001_0, Status : FAILED
Error: java.io.IOException: Can't export data, please check failed map task logs
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.RuntimeException: Can't parse input data: '6       iPhone7s        132G    400'
        at test.__loadFromFields(test.java:335)
        at test.parse(test.java:268)
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
        ... 10 more
Caused by: java.lang.NumberFormatException: For input string: "6        iPhone7s        132G    400"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:580)
        at java.lang.Integer.valueOf(Integer.java:766)
        at test.__loadFromFields(test.java:317)
        ... 12 more


The cause: by default sqoop export parses input records as comma-separated, but the Hive table file is tab-delimited, so each whole line is read as a single field and parsing the id column as an integer fails. Tell Sqoop the actual delimiter (and quote the JDBC URL so the shell leaves it alone):

sqoop export --connect "jdbc:mysql://192.168.2.251:3306/from_66internet?characterEncoding=UTF-8" --username root --password root --table test --input-fields-terminated-by '\t' --export-dir '/user/hive/warehouse/test/ce.txt' -m 2

Then check whether the data has landed in the MySQL table.
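A minimal check on the MySQL side might look like this (a sketch, assuming the mysql client can reach 192.168.2.251):

mysql -h192.168.2.251 -uroot -p from_66internet
mysql> select count(*) from test;  -- expect 10 if the export succeeded
mysql> select * from test limit 5;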

