
Hive User Interfaces (2): Connecting to Hive with the Hive JDBC Driver

Questions this post answers:

1. Which three user interfaces does Hive provide?
2. When using HiveServer, which service must be started first?
3. What command starts HiveServer?
4. Through which service does HiveServer provide remote JDBC access?
5. How do you change HiveServer's default port?
6. Which jars does a Hive JDBC connection require?
7. How does using HiveServer2 differ from using HiveServer?

        Hive provides three user interfaces: the CLI, HWI, and client access. Client access means using the JDBC driver to operate Hive remotely over Thrift. HWI provides a web interface for remote access to Hive; see my other post, "Hive User Interfaces (1): Using the Hive Web Interface (HWI)". Still, the most common way to use Hive is the CLI. Below I show how to connect to and operate Hive through the JDBC driver; my Hive version is 0.13.1.

         Hive JDBC connections come in two flavors: the older HiveServer and the newer HiveServer2. The former has well-known problems with security, concurrency, and the like; the latter solves them. I will introduce HiveServer first.

I. Start the MetaStore metadata service

        Whatever method you use to connect to Hive, the Hive metastore service must be started first; otherwise HQL statements cannot be executed.

[[email protected] ~]$ hive --service metastore
Starting Hive Metastore Server
15/01/11 20:11:56 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
15/01/11 20:11:56 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
15/01/11 20:11:56 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
15/01/11 20:11:56 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
15/01/11 20:11:56 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
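        For reference, Hive clients locate a remote metastore through the hive.metastore.uris property in hive-site.xml. A minimal sketch follows (the host matches this article's setup; 9083 is the metastore's default Thrift port):

<property>
        <name>hive.metastore.uris</name>
        <value>thrift://192.168.2.133:9083</value>
</property>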
II. Start the HiveServer service

        HiveServer uses a Thrift service to give clients a remote connection port; it must be started before connecting to Hive over JDBC.

[[email protected] ~]$ hive --service hiveserver
Starting Hive Thrift Server
15/01/12 10:22:54 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
15/01/12 10:22:54 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
15/01/12 10:22:54 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
15/01/12 10:22:54 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
15/01/12 10:22:54 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
         HiveServer listens on port 10000 by default; you can change the startup port with hive --service hiveserver -p 10002. This port is also the JDBC connection port.
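         If you start HiveServer on a non-default port, the JDBC URL must use the same port. A minimal sketch, reusing the host, database, user, and driver from the examples later in this article:

// Matches a HiveServer started with: hive --service hiveserver -p 10002
Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
Connection conn = DriverManager.getConnection(
                "jdbc:hive://192.168.2.133:10002/hive", "hadoopUser", "");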

         Note: the hiveserver and hwi services cannot be started and used at the same time.

III. Create a Hive project in an IDE

     We use Eclipse as the development IDE. Create a hive project in Eclipse and import the jars needed for a remote Hive JDBC connection, listed below:

        hive-jdbc-0.13.1.jar
        commons-logging-1.1.3.jar
        hive-exec-0.13.1.jar
        hive-metastore-0.13.1.jar
        hive-service-0.13.1.jar
        libfb303-0.9.0.jar
        slf4j-api-1.6.1.jar
        hadoop-common-2.2.0.jar
        log4j-1.2.16.jar
        slf4j-nop-1.6.1.jar
        httpclient-4.2.5.jar
        httpcore-4.2.5.jar
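        If your project is built with Maven rather than hand-copied jars, a roughly equivalent dependency sketch is shown below; hive-jdbc pulls in hive-service, hive-metastore, libfb303, and the other Hive jars transitively (verify the versions against your own cluster):

<dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-jdbc</artifactId>
        <version>0.13.1</version>
</dependency>
<dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.2.0</version>
</dependency>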
IV. Write the connection and query code
package com.gxnzx.hive;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Connects through HiveServer (v1).
public class HiveServerTest {

        private static Connection conn = null;

        public static void main(String[] args) {
                try {
                        // Register the HiveServer JDBC driver.
                        Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");

                        // jdbc:hive:// is the HiveServer URL scheme; "hive" is the database.
                        conn = DriverManager.getConnection("jdbc:hive://192.168.2.133:10000/hive", "hadoopUser", "");

                        Statement st = conn.createStatement();

                        String sql1 = "select name,age from log";

                        ResultSet rs = st.executeQuery(sql1);

                        // Print each row: column 1 is name, column 2 is age.
                        while (rs.next()) {
                                System.out.println(rs.getString(1) + "     " + rs.getString(2));
                        }

                } catch (ClassNotFoundException e) {
                        e.printStackTrace();
                } catch (SQLException e) {
                        e.printStackTrace();
                }
        }
}
         Here org.apache.hadoop.hive.jdbc.HiveDriver is the HiveServer JDBC driver class name, and the connection is created with DriverManager.getConnection("jdbc:hive://<host>:<port>/<db>", "<user>", ""). The output of running the program:
    Tom     19
    Jack     21
    HaoNing     12
    Hadoop     20
    Rose     23
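        The example above never closes its ResultSet, Statement, or Connection. A minimal cleanup sketch of the same query using try-with-resources (assumes Java 7+, where the java.sql interfaces extend AutoCloseable; the class name HiveServerCleanupTest is mine, not from the original project):

package com.gxnzx.hive;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveServerCleanupTest {

        public static void main(String[] args) throws Exception {
                Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");

                // Resources declared here are closed automatically, in reverse
                // order, even if the query throws.
                try (Connection conn = DriverManager.getConnection(
                                "jdbc:hive://192.168.2.133:10000/hive", "hadoopUser", "");
                     Statement st = conn.createStatement();
                     ResultSet rs = st.executeQuery("select name,age from log")) {

                        while (rs.next()) {
                                System.out.println(rs.getString(1) + "     " + rs.getString(2));
                        }
                }
        }
}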
V. Differences between HiveServer2 and HiveServer

         HiveServer2 improves on HiveServer in security, concurrency, and other respects; the JDBC code is largely the same, with the following main differences:

        1. The service startup is different: you must first start the hiveserver2 service.

[[email protected] ~]$ hive --service hiveserver2
Starting HiveServer2
15/01/12 10:13:42 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
15/01/12 10:13:42 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
15/01/12 10:13:42 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
15/01/12 10:13:42 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
15/01/12 10:13:42 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive

        2. The driver class name is different:

HiveServer—>org.apache.hadoop.hive.jdbc.HiveDriver

HiveServer2—>org.apache.hive.jdbc.HiveDriver
        3. The connection URL is different:
HiveServer—>DriverManager.getConnection("jdbc:hive://<host>:<port>", "<user>", "");

HiveServer2—>DriverManager.getConnection("jdbc:hive2://<host>:<port>", "<user>", "");
        4. A complete example:
package com.gxnzx.hive;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Connects through HiveServer2.
public class HiveJDBCTest {

        private static Connection conn = null;

        public static void main(String[] args) {
                try {
                        // Register the HiveServer2 JDBC driver.
                        Class.forName("org.apache.hive.jdbc.HiveDriver");

                        // jdbc:hive2:// is the HiveServer2 URL scheme; "hive" is the database.
                        conn = DriverManager.getConnection("jdbc:hive2://192.168.2.133:10000/hive", "hadoopUser", "");

                        Statement st = conn.createStatement();

                        String sql1 = "select name,age from log";

                        ResultSet rs = st.executeQuery(sql1);

                        while (rs.next()) {
                                System.out.println(rs.getString(1) + "     " + rs.getString(2));
                        }

                } catch (ClassNotFoundException e) {
                        e.printStackTrace();
                } catch (SQLException e) {
                        e.printStackTrace();
                }
        }
}
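         As a quick sanity check of a HiveServer2 connection, the same Statement API can also run Hive metadata statements such as show tables. A small sketch, reusing the URL and user from the example above:

// Lists the tables in the 'hive' database over HiveServer2.
Class.forName("org.apache.hive.jdbc.HiveDriver");
Connection conn = DriverManager.getConnection(
                "jdbc:hive2://192.168.2.133:10000/hive", "hadoopUser", "");
Statement st = conn.createStatement();
ResultSet rs = st.executeQuery("show tables");
while (rs.next()) {
        System.out.println(rs.getString(1));
}
conn.close();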
Appendix: related exceptions and fixes

       Exception/Error 1

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
Failed to load class org.slf4j.impl.StaticLoggerBinder
     The official fix:
This error is reported when the org.slf4j.impl.StaticLoggerBinder class could not be loaded into memory. This happens when no appropriate SLF4J binding could be found on the class path. Placing one (and only one) of slf4j-nop.jar, slf4j-simple.jar, slf4j-log4j12.jar, slf4j-jdk14.jar or logback-classic.jar on the class path should solve the problem.

SINCE 1.6.0: As of SLF4J version 1.6, in the absence of a binding, SLF4J will default to a no-operation (NOP) logger implementation.
     Import any one of slf4j-nop.jar, slf4j-simple.jar, slf4j-log4j12.jar, slf4j-jdk14.jar, or logback-classic.jar into the project's lib directory. The SLF4J binding jars can be downloaded from: slf4j bindings
    Exception/Error 2
Job Submission failed with exception 'org.apache.hadoop.security.AccessControlException(Permission denied: user=anonymous, access=EXECUTE, inode="/tmp":hadoopUser:supergroup:drwx------
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:234)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:187)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:150)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5185)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5167)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOwner(FSNamesystem.java:5123)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setPermissionInt(FSNamesystem.java:1338)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setPermission(FSNamesystem.java:1317)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setPermission(NameNodeRpcServer.java:528)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setPermission(ClientNamenodeProtocolServerSideTranslatorPB.java:348)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59576)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
       The program reported the error above because my connection call originally looked like the line below, with no user supplied. The user here should not be a Hive user but a Hadoop user:
conn=DriverManager.getConnection("jdbc:hive2://192.168.2.133:10000/hive", "", "");

       The fix:
conn=DriverManager.getConnection("jdbc:hive2://192.168.2.133:10000/hive", "hadoopUser", "");
       hadoopUser is my Hadoop user; after adding it, the connection works normally.