Hive Beeline in Detail

Source: https://www.cnblogs.com/lenmom/p/11218807.html

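Beeline is set to replace the Hive CLI as Hive's client tool, and later releases will deprecate the Hive CLI. Beeline is a command-line client introduced in Hive 0.11; it is a JDBC client based on the SQLLine CLI.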

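Beeline supports an embedded mode and a remote mode. In embedded mode it runs an embedded Hive (similar to the Hive CLI), whereas in remote mode it connects over Thrift to a separate HiveServer2 process. Starting with Hive 0.14, when Beeline works with HiveServer2 it also prints the HiveServer2 log messages to STDERR.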
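The two modes differ in the JDBC URL handed to Beeline. As a minimal sketch: an embedded-mode URL carries no host or port (jdbc:hive2://), while a remote-mode URL names the HiveServer2 host, port, and database. The helper name beeline_url and its defaults (port 10000, database default) are assumptions chosen for illustration.

```shell
#!/bin/sh
# Build a HiveServer2 JDBC URL for Beeline.
# No arguments: embedded-mode URL (no host or port).
# host [port] [database]: remote-mode URL; port and database fall back to
# 10000 and "default", which are assumed defaults for this sketch.
beeline_url() {
  if [ "$#" -eq 0 ]; then
    printf '%s\n' 'jdbc:hive2://'
  else
    printf 'jdbc:hive2://%s:%s/%s\n' "$1" "${2:-10000}" "${3:-default}"
  fi
}

# Usage (requires a Beeline installation):
#   beeline -u "$(beeline_url)"                          # embedded mode
#   beeline -u "$(beeline_url localhost 10000 default)"  # remote mode
```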

1. Common Beeline options

Usage: java org.apache.hive.cli.beeline.BeeLine 
   -u <database url>               the JDBC URL to connect to
   -n <username>                   the username to connect as
   -p <password>                   the password to connect as
   -d <driver class>               the driver class to use
   -i <init file>                  script file for initialization
   -e <query>                      query that should be executed
   -f <exec file>                  script file that should be executed
   -w (or) --password-file <password file>  the password file to read password from
   --hiveconf property=value       Use value for given property
   --hivevar name=value            hive variable name and value
                                   This is Hive specific settings in which variables
                                   can be set at session level and referenced in Hive
                                   commands or queries.
   --color=[true/false]            control whether color is used for display
   --showHeader=[true/false]       show column names in query results
   --headerInterval=ROWS;          the interval between which heades are displayed
   --fastConnect=[true/false]      skip building table/column list for tab-completion
   --autoCommit=[true/false]       enable/disable automatic transaction commit
   --verbose=[true/false]          show verbose error messages and debug info
   --showWarnings=[true/false]     display connection warnings
   --showNestedErrs=[true/false]   display nested errors
   --numberFormat=[pattern]        format numbers using DecimalFormat pattern
   --force=[true/false]            continue running script even after errors
   --maxWidth=MAXWIDTH             the maximum width of the terminal
   --maxColumnWidth=MAXCOLWIDTH    the maximum width to use when displaying columns
   --silent=[true/false]           be more silent
   --autosave=[true/false]         automatically save preferences
   --outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv]  format mode for result display
                                   Note that csv, and tsv are deprecated - use csv2, tsv2 instead
  --truncateTable=[true/false]    truncate table column when it exceeds length
   --delimiterForDSV=DELIMITER     specify the delimiter for delimiter-separated values output format (default: |)
   --isolation=LEVEL               set the transaction isolation level
   --nullemptystring=[true/false]  set to true to get historic behavior of printing null as empty string
   --addlocaldriverjar=DRIVERJARNAME Add driver jar file in the beeline client side
   --addlocaldrivername=DRIVERNAME Add drvier name needs to be supported in the beeline client side
   --help                          display this message
Beeline version 2.3.4.spark2 by Apache Hive

Option details

Each option is listed first, followed by its description.
-u <database URL>

The JDBC URL to connect to. Usage: beeline -u db_URL

-r

Reconnects to the most recently used URL, provided the user previously connected to a URL with !connect and saved it to a beeline.properties file with !save.

Usage: beeline -r

Version: 2.1.0 (HIVE-13670)

-n <username>

The username to connect as. Usage: beeline -n valid_user

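-p <password>

The password to connect with. Usage: beeline -p valid_password
Optional password mode: starting with Hive 2.2.0, the argument to -p is optional.

Usage: beeline -p [valid_password]

If no password follows -p, Beeline prompts for one while initiating the connection. If a password is given, Beeline uses it to initiate the connection without prompting.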

-d <driver class>

The driver class to use. Usage: beeline -d driver_class

-e <query>

The query to execute. Enclose the query string in single or double quotes. This option can be specified multiple times. Usage: beeline -e "query_string"
Multiple SQL statements can be run in a single query string, separated by semicolons (HIVE-9877).
Bug fix (null pointer exception): 0.13.0 (HIVE-5765)
Bug fix (--headerInterval not honored): 0.14.0 (HIVE-7647)
Bug fix (running -e in background): 1.3.0 and 2.0.0 (HIVE-6758); workaround available for earlier versions

-f <file>

The script file to execute. Usage: beeline -f filepath
Version: 0.12.0 (HIVE-4268)
Note: if the script contains tabs, query compilation fails in version 0.12.0. This bug is fixed in version 0.13.0 (HIVE-6359).
Bug fix (running -f in background): 1.3.0 and 2.0.0 (HIVE-6758); workaround available for earlier versions

-i (or) --init <file or files>

The init file or files to run at initialization. Usage: beeline -i /tmp/initfile
Single file: Version: 0.14.0 (HIVE-6561)
Multiple files: Version: 2.1.0 (HIVE-11336)

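-w (or) --password-file <password file>

The file to read the password from. Version: 1.2.0 (HIVE-7175)

-a (or) --authType <auth type>

The authentication type, passed to JDBC as an auth property. Version: 0.13.0 (HIVE-5155)

--property-file <file>

The file to read configuration properties from. Usage: beeline --property-file /tmp/a

Version: 2.2.0 (HIVE-13964)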

--hiveconf property=value

Uses the given value for the given configuration property. Properties listed in hive.conf.restricted.list cannot be reset through --hiveconf (see Restricted List and Whitelist).

Usage: beeline --hiveconf prop1=value1
Version: 0.13.0 (HIVE-6173)

--hivevar name=value

A Hive variable name and value. This is a Hive-specific setting: variables can be set at the session level and referenced in Hive commands or queries.

Usage: beeline --hivevar var1=value1

--color=[true/false]

Controls whether color is used for display. Default is false. Usage: beeline --color=true

(Not supported for the separated-value output formats. See HIVE-9770.)

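--showHeader=[true/false]

Shows column names in query results (true) or not (false). Default is true. Usage: beeline --showHeader=false

--headerInterval=ROWS

The interval, in rows, at which column headers are redisplayed when the output format is table. Default is 100. Usage: beeline --headerInterval=50
(Not supported for the separated-value output formats. See HIVE-9770.)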

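--fastConnect=[true/false]

When connecting, skips building the list of all tables and columns used for tab-completion of HiveQL statements (true) or builds the list (false). Default is true. Usage: beeline --fastConnect=false

--autoCommit=[true/false]

Enables or disables automatic transaction commit. Default is false. Usage: beeline --autoCommit=true

--verbose=[true/false]

Shows verbose error messages and debug information (true) or not (false). Default is false. Usage: beeline --verbose=true

--showWarnings=[true/false]

Displays warnings reported on the connection after any HiveQL command is issued. Default is false. Usage: beeline --showWarnings=true

--showDbInPrompt=[true/false]

Displays the current database name in the prompt. Default is false. Usage: beeline --showDbInPrompt=true

Version: 2.2.0 (HIVE-14123)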

--showNestedErrs=[true/false]

Displays nested errors. Default is false. Usage: beeline --showNestedErrs=true

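--numberFormat=[pattern]

Formats numbers using a DecimalFormat pattern. Usage: beeline --numberFormat="#,###,##0.00"

--force=[true/false]

Continues running a script after errors (true) or stops (false). Default is false. Usage: beeline --force=true

--maxWidth=MAXWIDTH

The maximum width to display before truncating data when the output format is table. The default is to query the terminal for its current width and fall back to 80. Usage: beeline --maxWidth=150

--maxColumnWidth=MAXCOLWIDTH

The maximum column width when the output format is table. Default is 50 from Hive 2.2.0 onward and 15 in earlier versions. Usage: beeline --maxColumnWidth=25

--silent=[true/false]

Reduces the amount of informational messages displayed (true) or not (false). It also stops the display of HiveServer2 log messages for queries (Hive 0.14 and later) and for HiveQL commands (Hive 1.2.0 and later). Default is false. Usage: beeline --silent=true

--autosave=[true/false]

Automatically saves preferences (true) or does not (false). Default is false. Usage: beeline --autosave=true

--outputformat=[table/vertical/csv/tsv/dsv/csv2/tsv2]

The format mode for result display. Default is table. See Separated-Value Output Formats in the Hive documentation for details and the recommended options. Usage: beeline --outputformat=tsv

Version: dsv/csv2/tsv2 added in 0.14.0 (HIVE-8615)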

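--truncateTable=[true/false]

If true, truncates table columns in the console when the table exceeds the console width. Version: 0.14.0 (HIVE-6928)

--delimiterForDSV=DELIMITER

The delimiter for the delimiter-separated values (dsv) output format. Default is '|'. Version: 0.14.0 (HIVE-7390)

--isolation=LEVEL

Sets the transaction isolation level to TRANSACTION_READ_COMMITTED or TRANSACTION_SERIALIZABLE. See the "Field Detail" section of the Java Connection documentation.

Usage: beeline --isolation=TRANSACTION_SERIALIZABLE

--nullemptystring=[true/false]

Uses the historic behavior of printing null as an empty string (true) or the current behavior of printing null as NULL (false). Default is false. Usage: beeline --nullemptystring=false

Version: 0.13.0 (HIVE-4485)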

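--incremental=[true/false]

Defaults to true from Hive 2.3 onward and to false before that. When set to false, the entire result set is fetched and buffered before it is displayed, which gives optimal column sizing. When set to true, rows are displayed as soon as they are fetched, giving lower latency and memory usage at the cost of extra padding in the displayed columns. Setting --incremental=true is recommended if the client runs out of memory because the fetched result set is very large.

--incrementalBufferRows=NUMROWS

The number of rows to buffer when printing rows to standard output. Default is 1000. Applies only when --incremental=true and --outputformat=table. Usage: beeline --incrementalBufferRows=1000

Version: 2.3.0 (HIVE-14170)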

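--maxHistoryRows=NUMROWS

The maximum number of rows to store in the Beeline history. Version: 2.3.0 (HIVE-15166)

--delimiter=;

Sets the delimiter for queries written in Beeline. Multi-character delimiters are allowed, but quotation marks, slashes, and -- are not. Default is the semicolon (;). Usage: beeline --delimiter=$$

Version: 3.0.0 (HIVE-10865)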

--convertBinaryArrayToString=[true/false]

Displays binary column data as a string (true) or as a byte array (false). Usage: beeline --convertBinaryArrayToString=true

Version: 3.0.0 (HIVE-14786)

--help

Displays a usage message. Usage: beeline --help
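Several of the display options above combine naturally for non-interactive use. The following is a minimal sketch, not a definitive recipe: it runs one query with -e and emits csv2 output with a header row and without informational messages. The function name export_csv and the BEELINE environment override are hypothetical, introduced only so the command can be swapped for a stub during testing.

```shell
#!/bin/sh
# Run a single query and write the result to standard output as CSV.
# $1 = JDBC URL, $2 = user name, $3 = query string
# BEELINE is a hypothetical override for the executable (defaults to "beeline").
export_csv() {
  # --silent=true       reduces informational messages in the output
  # --outputformat=csv2 selects the non-deprecated CSV format
  # --showHeader=true   keeps the column names as the first row
  "${BEELINE:-beeline}" -u "$1" -n "$2" \
    --silent=true --outputformat=csv2 --showHeader=true \
    -e "$3"
}

# Usage (requires a reachable HiveServer2):
#   export_csv 'jdbc:hive2://localhost:10000/default' lenmom 'show databases;' > databases.csv
```

Because --outputformat=csv2 is a separated-value format, --color and --headerInterval do not apply to it, as noted in their descriptions.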

2. Beeline usage examples

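2.1 Connecting without a configured username and password

Beeline can connect under several authentication modes, set by hive.server2.authentication in hive-site.xml; the default is NONE.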

<property>
    <name>hive.server2.authentication</name>
    <value>NONE</value>
    <description>
      Expects one of [nosasl, none, ldap, kerberos, pam, custom].
      Client authentication types.
        NONE: no authentication check
        LDAP: LDAP/AD based authentication
        KERBEROS: Kerberos/GSSAPI authentication
        CUSTOM: Custom authentication provider
                (Use with property hive.server2.custom.authentication.class)
        PAM: Pluggable authentication module
        NOSASL:  Raw transport
    </description>
  </property>

In this mode, connect as follows:

lenmom@Mi1701 ~$ beeline
Beeline version 1.2.1.spark2 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000/default
Connecting to jdbc:hive2://localhost:10000/default
Enter username for jdbc:hive2://localhost:10000/default:
Enter password for jdbc:hive2://localhost:10000/default:
...
Connected to: Apache Hive (version 2.3.4)
Driver: Hive JDBC (version 1.2.1.spark2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:10014/default> show databases;
+----------------+--+
| database_name  |
+----------------+--+
| default        |
| orc            |
+----------------+--+

2.2 Logging in with a username and password

hive-site.xml configuration:

<property>
    <name>hive.server2.thrift.client.user</name>
    <value>lenmom</value>
    <description>Username to use against thrift client</description>
</property>
<property>
    <name>hive.server2.thrift.client.password</name>
    <value>123456</value>
    <description>Password to use against thrift client</description>
 </property>

Connect with the password:

beeline> !connect jdbc:hive2://localhost:10000/default
Connecting to jdbc:hive2://localhost:10000/default
Enter username for jdbc:hive2://localhost:10000/default: lenmom
Enter password for jdbc:hive2://cdh-server2:10000/default: *****
Connected to: Apache Hive (version 2.3.4)
Driver: Hive JDBC (version 2.3.4)
Transaction isolation: TRANSACTION_REPEATABLE_READ

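Note that the user configured here needs execute permission on the HDFS directory /tmp/hive (inode="/tmp/hive"); otherwise the connection fails with the following error: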

Connecting to jdbc:hive2://localhost:10000/default
Enter username for jdbc:hive2://localhost:10000/default: lenmom
Enter password for jdbc:hive2://localhost:10000/default: **
Error: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.security.AccessControlException: Permission denied: user=lenmom, access=EXECUTE, inode="/tmp/hive":root:supergroup:drwxrwx---
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:259)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:205)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1698)
    at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getFileInfo(FSDirStatAndListingOp.java:108)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3817)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1005)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:843)
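One possible remedy, sketched here under assumptions, is to inspect the directory and then widen its mode from an account allowed to change it (the directory's owner or an HDFS superuser). The mode 733, the function name fix_scratch_dir, and the HDFS environment override are assumptions for illustration; whether this directory may be opened to other users depends on the cluster's security policy.

```shell
#!/bin/sh
# Show the current permissions of the Hive scratch directory, then change its mode.
# $1 = directory (assumed default: /tmp/hive), $2 = mode (assumed default: 733)
# HDFS is a hypothetical override for the executable (defaults to "hdfs").
fix_scratch_dir() {
  dir=${1:-/tmp/hive}
  mode=${2:-733}
  # -ls -d lists the directory entry itself rather than its contents
  "${HDFS:-hdfs}" dfs -ls -d "$dir" &&
    "${HDFS:-hdfs}" dfs -chmod "$mode" "$dir"
}

# Usage (run as an account permitted to chmod the directory):
#   fix_scratch_dir /tmp/hive 733
```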

2.3 Connecting Beeline to HiveServer2 with a username and password on the command line

beeline -u "jdbc:hive2://localhost:10000"  -n lenmom -p  123456
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
scan complete in 2ms
Connecting to jdbc:hive2://localhost:10000
Connected to: Apache Hive (version 2.3.4)
Driver: Hive JDBC (version 2.3.4)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 2.3.4 by Apache Hive
0: jdbc:hive2://localhost:10000>
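Passing the password with -p leaves it visible in the shell history and the process list. As a hedged alternative, the -w (--password-file) option from the usage listing reads it from a file instead. The sketch below writes the password to a file readable only by its owner and then connects; the function names and the BEELINE override are hypothetical, and the file is written without a trailing newline on the assumption that this is the safest form across Beeline versions.

```shell
#!/bin/sh
# Store a password read from standard input in a file only the owner can read.
# $1 = target path
write_password_file() {
  if ! IFS= read -r password && [ -z "$password" ]; then
    echo "no password on standard input" >&2
    return 1
  fi
  # umask 077 covers a newly created file; chmod covers a pre-existing one
  ( umask 077 && printf '%s' "$password" > "$1" ) && chmod 600 "$1"
}

# Connect using the password file instead of -p.
# $1 = JDBC URL, $2 = user name, $3 = password file
# BEELINE is a hypothetical override for the executable (defaults to "beeline").
connect_with_password_file() {
  "${BEELINE:-beeline}" -u "$1" -n "$2" -w "$3"
}

# Usage:
#   write_password_file "$HOME/.beeline_pw"      # then type the password and press Enter
#   connect_with_password_file 'jdbc:hive2://localhost:10000' lenmom "$HOME/.beeline_pw"
```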

2.4 Running script files, as with the Hive CLI

# partitionDate, fileLocDate, logdir and curDate must be set by the calling script
nohup beeline -u jdbc:hive2://127.0.0.1:10000 -n lenmom -p 123456 --color=true --silent=false \
  --hivevar p_date="${partitionDate}" --hivevar f_date="${fileLocDate}" \
  -f hdfs_add_partition_dmp_clearlog.hql >> "$logdir/load_${curDate}.log" 2>&1 &
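The article does not show the contents of the script passed to -f. As an illustration only, the sketch below writes a hypothetical HQL file that consumes the two variables p_date and f_date through ${hivevar:...} references; the table name dmp_clearlog, the partition column dt, and the location path are all invented for this example.

```shell
#!/bin/sh
# Write a hypothetical HQL script that references Hive variables.
# $1 = output path for the HQL file
write_partition_script() {
  # The quoted heredoc delimiter stops the shell from expanding ${hivevar:...};
  # Hive substitutes the values supplied with --hivevar when the script runs.
  cat > "$1" <<'EOF'
-- Hypothetical table, partition column and location, for illustration only.
ALTER TABLE dmp_clearlog ADD IF NOT EXISTS
  PARTITION (dt = '${hivevar:p_date}')
  LOCATION '/data/dmp/clearlog/${hivevar:f_date}';
EOF
}

# Usage (the date values are placeholders):
#   write_partition_script add_partition.hql
#   beeline -u jdbc:hive2://127.0.0.1:10000 -n lenmom \
#     --hivevar p_date=2019-07-20 --hivevar f_date=20190720 -f add_partition.hql
```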

3. Commands supported by Beeline

Type !help in the Beeline terminal:

!help

Output:

!addlocaldriverjar  Add driver jar file in the beeline client side.
!addlocaldrivername Add driver name that needs to be supported in the beeline
                    client side.
!all                Execute the specified SQL against all the current connections
!autocommit         Set autocommit mode on or off
!batch              Start or execute a batch of statements
!brief              Set verbose mode off
!call               Execute a callable statement
!close              Close the current connection to the database
!closeall           Close all current open connections
!columns            List all the columns for the specified table
!commit             Commit the current transaction (if autocommit is off)
!connect            Open a new connection to the database.
!dbinfo             Give metadata information about the database
!describe           Describe a table
!dropall            Drop all tables in the current database
!exportedkeys       List all the exported keys for the specified table
!go                 Select the current connection
!help               Print a summary of command usage
!history            Display the command history
!importedkeys       List all the imported keys for the specified table
!indexes            List all the indexes for the specified table
!isolation          Set the transaction isolation for this connection
!list               List the current connections
!manual             Display the BeeLine manual
!metadata           Obtain metadata information
!nativesql          Show the native SQL for the specified statement
!nullemptystring    Set to true to get historic behavior of printing null as
                    empty string. Default is false.
!outputformat       Set the output format for displaying results
                    (table,vertical,csv2,dsv,tsv2,xmlattrs,xmlelements, and
                    deprecated formats(csv, tsv))
!primarykeys        List all the primary keys for the specified table
!procedures         List all the procedures
!properties         Connect to the database specified in the properties file(s)
!quit               Exits the program
!reconnect          Reconnect to the database
!record             Record all output to the specified file
!rehash             Fetch table and column names for command completion
!rollback           Roll back the current transaction (if autocommit is off)
!run                Run a script from the specified file
!save               Save the current variabes and aliases
!scan               Scan for installed JDBC drivers
!script             Start saving a script to a file
!set                Set a beeline variable
!sh                 Execute a shell command
!sql                Execute a SQL command
!tables             List all the tables in the database
!typeinfo           Display the type map for the current connection
!verbose            Set verbose mode on

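4. Frequently used commands

  • !connect url – connects to a different HiveServer2 instance
  • !exit – exits the shell (equivalent to !quit)
  • !help – displays the full list of commands
  • !verbose – turns on verbose mode, which shows additional detail for queries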