Hive Learning Path (18): Hive Shell Operations

阿新 • Published: 2018-04-15

I. The Hive Command Line

1. Commands Supported by Hive

Command                   Description
quit                      Use quit or exit to leave the interactive shell.
set key=value             Sets the value of a particular configuration variable. Note that if you misspell the variable name, the CLI will not show an error.
set                       Prints the list of configuration variables that are overridden by the user or by Hive.
set -v                    Prints all Hadoop and Hive configuration variables.
add FILE [file] [file]*   Adds a file to the list of resources.
add jar jarname           Adds a jar to the list of resources.
list FILE                 Lists all the files added to the distributed cache.
list FILE [file]*         Checks whether the given resources are already added to the distributed cache.
! [cmd]                   Executes a shell command from the Hive shell.
dfs [dfs cmd]             Executes a dfs command from the Hive shell.
[query]                   Executes a Hive query and prints the results to standard output.
source FILE               Executes a script file inside the CLI.
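As a rough illustration of how several of these commands fit together, the sketch below writes a small script and runs it non-interactively. The file name session.hql and the choice of hive.cli.print.header as the variable are assumptions made for this example, and the hive call is skipped when the CLI is not on the PATH.

```shell
# Write a script that exercises a few CLI commands from the table above.
# session.hql is a hypothetical file name chosen for this sketch.
cat > session.hql <<'EOF'
-- show the current value of one configuration variable
set hive.cli.print.header;
-- override it for this session
set hive.cli.print.header=true;
-- run a local shell command from the Hive shell
!pwd;
-- run a dfs command from the Hive shell
dfs -ls /;
EOF

# Run the script only if the hive CLI is installed.
if command -v hive >/dev/null 2>&1; then
  hive -f session.hql
fi
```

From an interactive session, the same file could be run with source session.hql; as listed in the table.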

2. Syntax

hive [-hiveconf x=y]* [<-i filename>]* [<-f filename>|<-e query-string>] [-S]

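Notes:

1. -i: initialize the session by running HQL from a file
2. -e: execute the HQL given on the command line
3. -f: execute an HQL script file
4. -v: echo the executed HQL statements to the console
5. -p: connect to a Hive server on the given port number
6. -hiveconf x=y: set a Hive or Hadoop configuration variable
7. -S: silent mode; run the command without printing log output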

3. Examples

(1) Run a query

[hadoop@hadoop3 ~]$ hive -e "select * from cookie.cookie1;"

(2) Run a file

Write the HQL statements into a file named hive.sql, then run that file with the -f option.
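The original screenshots for this step are not available, so the following is only a minimal sketch of what it might look like. It assumes the script contains the same query as example (1) and that hive is on the PATH; the hive call is skipped otherwise.

```shell
# Write a one-statement HQL script; the query reuses the table from example (1).
cat > hive.sql <<'EOF'
select * from cookie.cookie1;
EOF

# Run the script with -f (skipped when the hive CLI is not installed).
if command -v hive >/dev/null 2>&1; then
  hive -f hive.sql
fi
```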

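(3) Run with a parameter file

Start Hive from a configuration file and load the configuration parameters defined in it (the -i option listed above).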
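A possible shape for this step, offered as a sketch rather than a reproduction of the lost screenshot: the file name hive-init.sql and the two parameters in it are assumptions, and the query again borrows the table from example (1).

```shell
# Hypothetical init file: each statement runs before the query below.
cat > hive-init.sql <<'EOF'
set hive.cli.print.header=true;
set mapreduce.job.reduces=3;
EOF

if command -v hive >/dev/null 2>&1; then
  # -i loads the init file first, then -e runs the query in that session.
  hive -i hive-init.sql -e "select * from cookie.cookie1;"
fi
```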

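II. Ways to Configure Hive Parameters

1. Complete List of Hive Parameters

https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties

2. How Hive Parameters Are Set

When developing Hive applications, you inevitably need to set Hive parameters. Setting them can tune the execution efficiency of HQL code or help locate problems. In practice, however, a common question is: why did the parameter I set have no effect? This is usually caused by setting it the wrong way.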

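For ordinary parameters, there are three ways to set them:

1. Configuration files (effective globally)
2. Command-line parameters (effective for the Hive instance being started)
3. Parameter declarations (effective for the Hive connection session)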

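(1) Configuration files

Hive's configuration files include:

　　A. User-defined configuration file: $HIVE_CONF_DIR/hive-site.xml
　　B. Default configuration file: $HIVE_CONF_DIR/hive-default.xml

The user-defined configuration overrides the default configuration.

In addition, Hive also reads Hadoop's configuration, because Hive starts as a Hadoop client; Hive's configuration overrides Hadoop's configuration.

Settings in the configuration files apply to every Hive process started on the local machine.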

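(2) Command-line parameters

When starting Hive (in client or server mode), you can add -hiveconf param=value on the command line to set a parameter.

This setting applies to the session started by that invocation (or, when Hive is started in server mode, to the sessions of all requests).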

(3) Parameter declarations

You can set parameters inside HQL with the SET keyword. The scope of such a setting is also session-level.

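set hive.exec.reducers.bytes.per.reducer=
The average amount of data each reduce task handles. Hive estimates the total data size and divides it by this value to obtain the number of reduce tasks to run.

set hive.exec.reducers.max=
The upper limit on the number of reduce tasks.

set mapreduce.job.reduces=
A fixed number of reduce tasks.

However, Hive ignores this last parameter when necessary, that is, when the logic of the query means only one reduce task can be used. For example, even with set mapreduce.job.reduces = 3, an HQL statement that uses order by ignores the setting.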
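The estimate described above (total data size divided by the bytes-per-reducer value, limited by the maximum) can be sketched as plain shell arithmetic. The function name estimate_reducers and the input sizes are made up for illustration; this models the rule as the text states it and is not Hive's own code.

```shell
# Estimate the reducer count as described in the text:
# ceil(total_bytes / bytes_per_reducer), at least 1, capped at max_reducers.
estimate_reducers() {
  total_bytes=$1
  bytes_per_reducer=$2
  max_reducers=$3
  # Integer ceiling division.
  n=$(( (total_bytes + bytes_per_reducer - 1) / bytes_per_reducer ))
  if [ "$n" -lt 1 ]; then n=1; fi
  if [ "$n" -gt "$max_reducers" ]; then n=$max_reducers; fi
  echo "$n"
}

# Illustrative values: 1 GiB of input, 256 MiB per reducer, at most 1009 reducers.
estimate_reducers 1073741824 268435456 1009   # prints 4
```

Setting mapreduce.job.reduces to a fixed number bypasses this estimate, subject to the order by exception noted above.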

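The priority of the three methods above increases in order: parameter declarations override command-line parameters, and command-line parameters override configuration-file settings. Note that some system-level parameters, such as log4j-related settings, must be set with one of the first two methods, because those parameters are read before the session is established.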
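One way to observe this precedence is to set the same parameter at two levels and print it before and after. This is a sketch only: the file name override.hql and the values 5 and 2 are arbitrary, and the hive call is skipped when the CLI is not installed. If the precedence holds as described, the first statement should report the command-line value and the last should report the value given with SET.

```shell
# override.hql is a hypothetical script: print the value, override it, print again.
cat > override.hql <<'EOF'
set mapreduce.job.reduces;
set mapreduce.job.reduces=2;
set mapreduce.job.reduces;
EOF

if command -v hive >/dev/null 2>&1; then
  # -hiveconf sets the value for this invocation; SET inside the script
  # then overrides it for the session.
  hive -hiveconf mapreduce.job.reduces=5 -f override.hql
fi
```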
