記錄初學者學習Hive時踩過的坑
- 1. 缺少MySQL驅動包
- 1.1 問題描述
Caused by: org.datanucleus.store.rdbms.connectionpool.DatastoreDriverNotFoundException: The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
at org.datanucleus.store.rdbms.connectionpool.AbstractConnectionPoolFactory.loadDriver(AbstractConnectionPoolFactory.java:58)
at org.datanucleus.store.rdbms.connectionpool.BoneCPConnectionPoolFactory.createConnectionPool(BoneCPConnectionPoolFactory.java:54)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:213)
-
- 1.2. 解決方案
上述問題很可能是缺少mysql的jar包,下載mysql-connector-java-5.1.32.tar.gz,複製到hive的lib目錄下:
[email protected]:~$ cp mysql-connector-java-5.1.34-bin.jar opt/hive-2.1.0/lib/
- 2. 元資料庫mysql初始化
- 2.1 問題描述
執行./hive指令碼時,無法進入,報錯:
Exception in thread "main" java.lang.RuntimeException: Hive metastore database is not initialized. Please use schematool (e.g. ./schematool -initSchema -dbType ...) to create the schema. If needed, don't forget to include the option to auto-create the underlying database in your JDBC connection string (e.g. ?createDatabaseIfNotExist=true for mysql)
-
- 2.2 解決方案
在scripts目錄下執行 schematool -initSchema -dbType mysql命令進行Hive元資料庫的初始化:
[email protected]:~/opt/hive-2.1.0/scripts$ schematool -initSchema -dbType mysql
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/xiaosi/opt/hive-2.1.0/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/xiaosi/opt/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://localhost:3306/hive_meta?createDatabaseIfNotExist=true
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: root
Starting metastore schema initialization to 2.1.0
Initialization script hive-schema-2.1.0.mysql.sql
Initialization script completed
schemaTool completed
- 3. Relative path in absolute URI
- 3.1 問題描述
Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
...
Caused by: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
at java.net.URI.checkPath(URI.java:1823)
at java.net.URI.<init>(URI.java:745)
at org.apache.hadoop.fs.Path.initialize(Path.java:202)
... 12 more
-
- 3.2 解決方案
產生上述問題的原因是使用了沒有配置的變數,解決此問題只需在配置檔案hive-site.xml中配置system:user.name和system:java.io.tmpdir兩個變數,配置檔案中就可以使用這兩個變數:
<property>
<name>system:user.name</name>
<value>xiaosi</value>
</property>
<property>
<name>system:java.io.tmpdir</name>
<value>/home/${system:user.name}/tmp/hive/</value>
</property>
- 4. 拒絕連線
- 4.1 問題描述
on exception: java.net.ConnectException: 拒絕連線; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
...
Caused by: java.net.ConnectException: Call From Qunar/127.0.0.1 to localhost:9000 failed on connection exception: java.net.ConnectException: 拒絕連線; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
...
Caused by: java.net.ConnectException: 拒絕連線
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
... 29 more
-
- 4.2 解決方案
有可能是Hadoop沒有啟動,使用jps檢視一下當前程序發現:
[email protected]:~/opt/hive-2.1.0$ jps
7317 Jps
可以看見,我們確實沒有啟動Hadoop。開啟Hadoop的NameNode和DataNode守護程序
[email protected]:~/opt/hadoop-2.7.3$ ./sbin/start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /home/xiaosi/opt/hadoop-2.7.3/logs/hadoop-xiaosi-namenode-yoona.out
localhost: starting datanode, logging to /home/xiaosi/opt/hadoop-2.7.3/logs/hadoop-xiaosi-datanode-yoona.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/xiaosi/opt/hadoop-2.7.3/logs/hadoop-xiaosi-secondarynamenode-yoona.out
[email protected]:~/opt/hadoop-2.7.3$ jps
8055 Jps
7561 NameNode
7929 SecondaryNameNode
7724 DataNode
- 5. 建立Hive表失敗
- 5.1 問題描述
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:For direct MetaStore DB connections, we don't support retries at the client level.)
-
- 5.2 解決方案
檢視Hive日誌,看到這樣的錯誤日誌:
NestedThrowablesStackTrace:
Could not create "increment"/"table" value-generation container `SEQUENCE_TABLE` since autoCreate flags do not allow it.
org.datanucleus.exceptions.NucleusUserException: Could not create "increment"/"table" value-generation container `SEQUENCE_TABLE` since autoCreate flags do not allow it.
出現上述問題主要因為mysql的bin-log format預設為statement ,在mysql中通過 show variables like 'binlog_format'; 語句檢視bin-log format的配置值
mysql> show variables like 'binlog_format';
+---------------+-----------+
| Variable_name | Value |
+---------------+-----------+
| binlog_format | STATEMENT |
+---------------+-----------+
1 row in set (0.00 sec)
修改bin-log format的預設值,在mysql的配置檔案/etc/mysql/mysql.conf.d/mysqld.cnf中新增 binlog_format="MIXED" ,重啟mysql,再啟動 hive即可。
mysql> show variables like 'binlog_format';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | MIXED |
+---------------+-------+
1 row in set (0.00 sec)
再次執行創表語句:
hive> create table if not exists employees(
> name string comment '姓名',
> salary float comment '工資',
> subordinates array<string> comment '下屬',
> deductions map<string,float> comment '扣除金額',
> address struct<city:string,province:string> comment '家庭住址'
> )
> comment '員工資訊表'
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\t'
> LINES TERMINATED BY '\n'
> STORED AS TEXTFILE;
OK
Time taken: 0.664 seconds
- 6. 載入資料失敗
- 6.1 問題描述
hive> load data local inpath '/home/xiaosi/hive/input/result.txt' overwrite into table recent_attention;
Loading data to table test_db.recent_attention
Failed with exception Unable to move source file:/home/xiaosi/hive/input/result.txt to destination hdfs://localhost:9000/user/hive/warehouse/test_db.db/recent_attention/result.txt
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
檢視Hive日誌,看到這樣的錯誤日誌:
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /home/xiaosi/hive/warehouse/recent_attention/result.txt could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
看到 0 datanodes running 我們猜想可能datanode掛掉了,jps驗證一下,果然我們的datanode沒有啟動起來。
-
- 6.2 問題解決
這個問題是由於datanode沒有啟動導致的,至於datanode為什麼沒有啟動起來,去看另一篇博文:那些年踩過的Hadoop坑(http://blog.csdn.net/sunnyyoona/article/details/51659080)
- 7. Java連線Hive 驅動失敗
- 7.1 問題描述
java.lang.ClassNotFoundException: org.apache.hadoop.hive.jdbc.HiveDriver
at java.net.URLClassLoader.findClass(URLClassLoader.java:381) ~[na:1.8.0_91]
at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[na:1.8.0_91]
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) ~[na:1.8.0_91]
at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[na:1.8.0_91]
at java.lang.Class.forName0(Native Method) ~[na:1.8.0_91]
at java.lang.Class.forName(Class.java:264) ~[na:1.8.0_91]
at com.sjf.open.hive.HiveClient.getConn(HiveClient.java:29) [classes/:na]
at com.sjf.open.hive.HiveClient.run(HiveClient.java:53) [classes/:na]
at com.sjf.open.hive.HiveClient.main(HiveClient.java:77) [classes/:na]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_91]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_91]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_91]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_91]
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144) [idea_rt.jar:na]
-
- 7.2 解決方案
private static String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver";
取代
private static String driverName = "org.apache.hive.jdbc.HiveDriver"
- 8. create table問題
- 8.1 問題描述
create table if not exists employee(
name string comment 'employee name',
salary float comment 'employee salary',
subordinates array<string> comment 'names of subordinates',
deductions map<string,float> comment 'keys are deductions values are percentages',
address struct<street:string, city:string, state:string, zip:int> comment 'home address'
)
comment 'description of the table'
tblproperties ('creator'='yoona','date'='20160719')
location '/user/hive/warehouse/test.db/employee';
錯誤資訊:
FAILED: ParseException line 10:0 missing EOF at 'location' near ')'
-
- 8.2 解決方案
Location放在TBPROPERTIES之前:
create table if not exists employee(
name string comment 'employee name',
salary float comment 'employee salary',
subordinates array<string> comment 'names of subordinates',
deductions map<string,float> comment 'keys are deductions values are percentages',
address struct<street:string, city:string, state:string, zip:int> comment 'home address'
)
comment 'description of the table'
location '/user/hive/warehouse/test.db/employee'
tblproperties ('creator'='yoona','date'='20160719');
create table命令:https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable
- 9. JDBC Hive 拒絕連線
- 9.1 問題描述
15:00:50.815 [main] INFO org.apache.hive.jdbc.Utils - Supplied authorities: localhost:10000
15:00:50.832 [main] INFO org.apache.hive.jdbc.Utils - Resolved authority: localhost:10000
15:00:51.010 [main] DEBUG o.a.thrift.transport.TSaslTransport - opening transport [email protected]
15:00:51.019 [main] WARN org.apache.hive.jdbc.HiveConnection - Failed to connect to localhost:10000
15:00:51.027 [main] ERROR com.sjf.open.hive.HiveClient - Connection error!
java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10000/default: java.net.ConnectException: 拒絕連線
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:219) ~[hive-jdbc-2.1.0.jar:2.1.0]
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:157) ~[hive-jdbc-2.1.0.jar:2.1.0]
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107) ~[hive-jdbc-2.1.0.jar:2.1.0]
at java.sql.DriverManager.getConnection(DriverManager.java:664) ~[na:1.8.0_91]
at java.sql.DriverManager.getConnection(DriverManager.java:247) ~[na:1.8.0_91]
at com.sjf.open.hive.HiveClient.getConn(HiveClient.java:29) [classes/:na]
at com.sjf.open.hive.HiveClient.run(HiveClient.java:52) [classes/:na]
at com.sjf.open.hive.HiveClient.main(HiveClient.java:76) [classes/:na]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_91]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_91]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_91]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_91]
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144) [idea_rt.jar:na]
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: 拒絕連線
at org.apache.thrift.transport.TSocket.open(TSocket.java:226) ~[libthrift-0.9.3.jar:0.9.3]
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:266) ~[libthrift-0.9.3.jar:0.9.3]
at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) ~[libthrift-0.9.3.jar:0.9.3]
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:195) ~[hive-jdbc-2.1.0.jar:2.1.0]
... 12 common frames omitted
Caused by: java.net.ConnectException: 拒絕連線
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:1.8.0_91]
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[na:1.8.0_91]
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[na:1.8.0_91]
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[na:1.8.0_91]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[na:1.8.0_91]
at java.net.Socket.connect(Socket.java:589) ~[na:1.8.0_91]
at org.apache.thrift.transport.TSocket.open(TSocket.java:221) ~[libthrift-0.9.3.jar:0.9.3]
... 15 common frames omitted
-
- 9.2 解決方案
(1) 檢查hive server2是否啟動:
[email protected]:/opt/apache-hive-2.0.0-bin/bin$ sudo netstat -anp | grep 10000
如果沒有啟動hive server2,首先啟動服務:
[email protected]:/opt/apache-hive-2.0.0-bin/conf$ hive --service hiveserver2 >/dev/null 2>/dev/null &
[1] 11978
(2) 檢查配置:
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
<description>Port number of HiveServer2 Thrift interface when hive.server2.transport.mode is 'binary'.</description>
</property>
- 10. User root is not allowed to impersonate anonymous
- 10.1 問題描述
Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User:xiaosiis not allowed to impersonate anonymous
-
- 10.2 解決方案
修改hadoop 配置檔案 etc/hadoop/core-site.xml,加入如下配置項
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
備註
hadoop.proxyuser.XXX.hosts 與 hadoop.proxyuser.XXX.groups 中XXX為異常資訊中User:* 中的使用者名稱部分
<property>
<name>hadoop.proxyuser.xiaosi.hosts</name>
<value>*</value>
<description>The superuser can connect only from host1 and host2 to impersonate a user</description>
</property>
<property>
<name>hadoop.proxyuser.xiaosi.groups</name>
<value>*</value>
<description>Allow the superuser oozie to impersonate any members of the group group1 and group2</description>
</property>
- 11. 安全模式
- 11.1 問題描述
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot create directory /tmp/hive/xiaosi/c2f6130d-3207-4360-8734-dba0462bd76c. Name node is in safe mode.
The reported blocks 22 has reached the threshold 0.9990 of total blocks 22. The number of live datanodes 1 has reached the minimum number 0. In safe mode extension. Safe mode will be turned off automatically in 5 seconds.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1327)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3893)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:983)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:622)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
at org.apache.hadoop.ipc.Client.call(Client.java:1475)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy32.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:558)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy33.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:3000)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2970)
at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1047)
at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1043)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:1043)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:1036)
at org.apache.hadoop.hive.ql.session.SessionState.createPath(SessionState.java:682)
at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:617)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:526)
... 9 more
-
- 11.2 問題分析
hdfs在啟動開始時會進入安全模式,這時檔案系統中的內容不允許修改也不允許刪除,直到安全模式結束。安全模式主要是為了系統啟動的時候檢查各個DataNode上資料塊的有效性,同時根據策略必要的複製或者刪除部分資料塊。執行期通過命令也可以進入安全模式。在實踐過程中,系統啟動的時候去修改和刪除檔案也會有安全模式不允許修改的出錯提示,只需要等待一會兒即可。
-
- 11.3 問題解決
可以等待其自動退出安全模式,也可以使用手動命令來離開安全模式:
[email protected]:~$ hdfs dfsadmin -safemode leave
Safe mode is OFF
轉載出處:https://blog.csdn.net/sunnyyoona/article/details/51648871