
Big Data Collaboration Frameworks: Sqoop

I. Overview:

    1. Sqoop: SQL-to-Hadoop

    2. A bridge between traditional relational databases and Hadoop:

     a. Imports data from a relational database into Hadoop and related systems (such as Hive and HBase)

     b. Extracts data from the Hadoop side and exports it to a relational database

    3. Uses MapReduce to speed up data transfer; a Sqoop job runs map tasks only, with no reduce phase.

II. Installing Sqoop

1. Unpack:

tar -zxvf sqoop-1.4.6-cdh5.14.2.tar.gz -C /opt/cdh5.14.2/

2. Configure

In Sqoop's conf directory, rename sqoop-env-template.sh to sqoop-env.sh.
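For example, from the conf directory:

mv sqoop-env-template.sh sqoop-env.sh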

Then edit sqoop-env.sh:

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# included in all the hadoop scripts with source command
# should not be executable directly
# also should not be passed any arguments, since we need original $*

# Set Hadoop-specific environment variables here.

#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/opt/cdh5.14.2/hadoop-2.6.0

#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/opt/cdh5.14.2/hadoop-2.6.0

#set the path to where bin/hbase is available
#export HBASE_HOME=

#Set the path to where bin/hive is available
export HIVE_HOME=/opt/cdh5.14.2/hive-1.1.0

#Set the path for where zookeper config dir is
#export ZOOCFGDIR=

3. Since Sqoop connects to MySQL, copy the MySQL JDBC driver jar into Sqoop's lib directory:

cp mysql-connector-java-5.1.27-bin.jar /opt/cdh5.14.2/sqoop-1.4.6/lib/

 

III. Operations:

1. list-databases

bin/sqoop list-databases --connect jdbc:mysql://master.cdh.com:3306 --username root --password 123456

Output:

[root@master sqoop-1.4.6]# bin/sqoop list-databases --connect jdbc:mysql://master.cdh.com:3306 --username root --password 123456
Warning: /opt/cdh5.14.2/sqoop-1.4.6/bin/../../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /opt/cdh5.14.2/sqoop-1.4.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/cdh5.14.2/sqoop-1.4.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /opt/cdh5.14.2/sqoop-1.4.6/bin/../../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/07/28 21:53:40 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.14.2
18/07/28 21:53:40 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/07/28 21:53:40 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
cdhmetastore
mysql
test

2. Importing data:

Import data from the database into HDFS.
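The command for this step, as a sketch (the database name test and the target directory are assumptions; tb_user_info is the table that appears in the logs below):

bin/sqoop import \
    --connect jdbc:mysql://master.cdh.com:3306/test \
    --username root \
    --password 123456 \
    --table tb_user_info \
    --target-dir /user/root/sqoop/tb_user_info \
    --num-mappers 1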

The run failed with the following error:

Warning: /opt/cdh5.14.2/sqoop-1.4.6/bin/../../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /opt/cdh5.14.2/sqoop-1.4.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/cdh5.14.2/sqoop-1.4.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /opt/cdh5.14.2/sqoop-1.4.6/bin/../../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/07/30 05:29:44 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.14.2
18/07/30 05:29:44 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/07/30 05:29:44 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/07/30 05:29:44 INFO tool.CodeGenTool: Beginning code generation
18/07/30 05:29:45 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tb_user_info` AS t LIMIT 1
18/07/30 05:29:46 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tb_user_info` AS t LIMIT 1
18/07/30 05:29:46 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cdh5.14.2/hadoop-2.6.0
Note: /tmp/sqoop-root/compile/fd915b0f18440da413291558f8b3fcd5/tb_user_info.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/07/30 05:29:57 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/fd915b0f18440da413291558f8b3fcd5/tb_user_info.jar
18/07/30 05:29:57 WARN manager.MySQLManager: It looks like you are importing from mysql.
18/07/30 05:29:57 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
18/07/30 05:29:57 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
18/07/30 05:29:57 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
18/07/30 05:29:57 INFO mapreduce.ImportJobBase: Beginning import of tb_user_info
Exception in thread "main" java.lang.NoClassDefFoundError: org/json/JSONObject
	at org.apache.sqoop.util.SqoopJsonUtil.getJsonStringforMap(SqoopJsonUtil.java:43)
	at org.apache.sqoop.SqoopOptions.writeProperties(SqoopOptions.java:784)
	at org.apache.sqoop.mapreduce.JobBase.putSqoopOptionsToConfiguration(JobBase.java:392)
	at org.apache.sqoop.mapreduce.JobBase.createJob(JobBase.java:378)
	at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:256)
	at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
	at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:127)
	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:513)
	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
	at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: java.lang.ClassNotFoundException: org.json.JSONObject
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	... 15 more

This is caused by a missing java-json.jar on Sqoop's classpath.

Reference: https://blog.csdn.net/lichangzai/article/details/51601807

Download: http://www.java2s.com/Code/Jar/j/Downloadjavajsonjar.htm

Put the jar into Sqoop's lib directory:

cp java-json.jar /opt/cdh5.14.2/sqoop-1.4.6/lib/

Re-running the import command, the error is gone.

The generated files now show up in the corresponding HDFS directory, and their contents are exactly the imported rows.
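To verify from the shell (paths are the hypothetical ones from the sketch above; part-m-00000 is the first map task's output file):

hdfs dfs -ls /user/root/sqoop/tb_user_info
hdfs dfs -cat /user/root/sqoop/tb_user_info/part-m-00000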

 

3. Importing data with --query:
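A sketch of a free-form query import (column names, database, and target directory are assumptions; note that --query requires the $CONDITIONS placeholder in the WHERE clause and an explicit --target-dir):

bin/sqoop import \
    --connect jdbc:mysql://master.cdh.com:3306/test \
    --username root \
    --password 123456 \
    --query 'SELECT id, name FROM tb_user_info WHERE $CONDITIONS' \
    --split-by id \
    --target-dir /user/root/sqoop/query_import \
    --num-mappers 1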

Checking the result files on HDFS, the output matches the query.

 

 

4. Importing data with --direct:
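The same import with --direct added; this makes Sqoop shell out to mysqldump for a MySQL-specific fast path (table, database, and paths assumed as above):

bin/sqoop import \
    --connect jdbc:mysql://master.cdh.com:3306/test \
    --username root \
    --password 123456 \
    --table tb_user_info \
    --target-dir /user/root/sqoop/direct_import \
    --direct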

This run fails, as shown below:

 

Error: java.io.IOException: Cannot run program "mysqldump": error=2, No such file or directory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
	at java.lang.Runtime.exec(Runtime.java:617)
	at java.lang.Runtime.exec(Runtime.java:485)
	at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:405)
	at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: error=2, No such file or directory
	at java.lang.UNIXProcess.forkAndExec(Native Method)
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
	at java.lang.ProcessImpl.start(ProcessImpl.java:130)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
	... 12 more

18/07/30 11:29:36 INFO mapreduce.Job:  map 25% reduce 0%
18/07/30 11:29:39 INFO mapreduce.Job: Task Id : attempt_1532757172540_0005_m_000000_0, Status : FAILED
Error: java.io.IOException: Cannot run program "mysqldump": error=2, No such file or directory
(the same stack trace repeats for each subsequent failed task attempt)

18/07/30 11:30:15 INFO mapreduce.Job:  map 75% reduce 0%
18/07/30 11:30:16 INFO mapreduce.Job:  map 100% reduce 0%
18/07/30 11:30:16 INFO mapreduce.Job: Job job_1532757172540_0005 failed with state FAILED due to: Task failed task_1532757172540_0005_m_000001
Job failed as tasks failed. failedMaps:1 failedReduces:0

Fix:

Following the hint at http://www.dataguru.cn/thread-346511-1-1.html: in direct mode, each map task invokes mysqldump locally on whichever node it runs, so the binary must be present on all worker nodes. After installing MySQL on the other two machines, the error went away.
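On CentOS, for example, the client package that ships mysqldump can be installed like this (package names vary by distribution and repository):

yum install -y mysql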

 

5. Importing data from an RDBMS into Hive:
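A representative command, as a sketch (the source database and Hive table names are assumptions; tb_user is the table seen in the logs below):

bin/sqoop import \
    --connect jdbc:mysql://master.cdh.com:3306/test \
    --username root \
    --password 123456 \
    --table tb_user \
    --hive-import \
    --hive-table tb_user \
    --num-mappers 1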

The run ended with an error:

18/07/30 16:49:44 INFO mapreduce.ImportJobBase: Transferred 42.3291 KB in 55.0343 seconds (787.5991 bytes/sec)
18/07/30 16:49:44 INFO mapreduce.ImportJobBase: Retrieved 197 records.
18/07/30 16:49:45 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tb_user` AS t LIMIT 1
18/07/30 16:49:45 INFO hive.HiveImport: Loading uploaded data into Hive
18/07/30 16:49:45 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
18/07/30 16:49:45 ERROR tool.ImportTool: Import failed: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
	at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:50)
	at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
	at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
	at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
	at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:530)
	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
	at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:195)
	at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:44)
	... 12 more

This is because Sqoop needs one of Hive's jars on its classpath: copy hive-common-1.1.0-cdh5.14.2.jar from hive/lib into Sqoop's lib directory.
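With the paths used in this install, that is:

cp /opt/cdh5.14.2/hive-1.1.0/lib/hive-common-1.1.0-cdh5.14.2.jar /opt/cdh5.14.2/sqoop-1.4.6/lib/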

Running it again surfaces a different error:

18/07/30 17:06:05 INFO hive.HiveImport: Loading uploaded data into Hive
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/shims/ShimLoader
	at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:370)
	at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:108)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:195)
	at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:44)
	at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
	at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
	at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
	at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:530)
	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
	at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.shims.ShimLoader
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	... 17 more

Copying hive-shims-common-1.1.0-cdh5.14.2.jar from Hive's lib directory into Sqoop's lib directory gets past that, only to hit yet another error:

18/07/30 17:14:14 INFO hive.HiveImport: Loading uploaded data into Hive
Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:108)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:195)
	at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:44)
	at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
	at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
	at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
	at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:530)
	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
	at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: java.lang.RuntimeException: Could not load shims in class org.apache.hadoop.hive.shims.Hadoop23Shims
	at org.apache.hadoop.hive.shims.ShimLoader.createShim(ShimLoader.java:144)
	at org.apache.hadoop.hive.shims.ShimLoader.loadShims(ShimLoader.java:136)
	at org.apache.hadoop.hive.shims.ShimLoader.getHadoopShims(ShimLoader.java:95)
	at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:370)
	... 16 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.shims.Hadoop23Shims
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:195)
	at org.apache.hadoop.hive.shims.ShimLoader.createShim(ShimLoader.java:141)
	... 19 more

All of these errors involve org.apache.hadoop.hive.shims, so the simplest fix is to copy all of the shims jars from Hive's lib directory into Sqoop's lib directory (run from hive/lib):

cp -r hive-shims* /opt/cdh5.14.2/sqoop-1.4.6/lib/

Running the import once more, it finally completes without errors (*^▽^*)

Querying the tb_user table in Hive confirms the data was imported successfully.
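A quick check from the shell (the Hive database is assumed to be default):

hive -e 'SELECT * FROM tb_user LIMIT 5;'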