Installing and configuring Hive 2.1.1 on Hadoop 2.7.3.
Environment: Ubuntu 16.04; jdk1.8.0_111; apache-hadoop-2.7.3; apache-hive-2.1.1.
Only the Hive installation is recorded here. Hive needs to be installed on just one node; in this setup it is installed on the NameNode.
First, download the version you need from the official website. I downloaded apache-hive-2.1.1-bin.tar.gz and placed it in the user's home directory.
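For reference, the release tarball can also be fetched from the Apache archive (URL given as an assumption; any mirror carrying Hive 2.1.1 works):
$wget https://archive.apache.org/dist/hive/hive-2.1.1/apache-hive-2.1.1-bin.tar.gz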
(1) Extract the archive:
$tar -zxvf apache-hive-2.1.1-bin.tar.gz
(2) Enter the conf directory:
$cd apache-hive-2.1.1-bin/conf
$ls
You will see the following files:
beeline-log4j2.properties.template hive-exec-log4j2.properties.template llap-cli-log4j2.properties.template
hive-default.xml.template hive-log4j2.properties.template llap-daemon-log4j2.properties.template
hive-env.sh.template ivysettings.xml parquet-logging.properties
Then, in the conf directory, run the following commands:
$cp hive-default.xml.template hive-default.xml
$cp hive-env.sh.template hive-env.sh
$cp hive-default.xml hive-site.xml
(3) Add the MySQL driver:
Download the mysql-connector-java-x.y.z-bin.jar file and place it in the apache-hive-2.1.1-bin/lib directory.
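A minimal sketch of that copy step, assuming the jar was downloaded into the current directory (the x.y.z version placeholder stays as-is; substitute whatever version you actually downloaded):
$cp mysql-connector-java-*-bin.jar ~/apache-hive-2.1.1-bin/lib/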
(4) Move the installation into place and set environment variables:
$sudo mv apache-hive-2.1.1-bin /usr/local/
Add HIVE_HOME to /etc/profile and append $HIVE_HOME/bin to PATH.
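A minimal sketch of the lines to append to /etc/profile, assuming the /usr/local install path from the previous step:
export HIVE_HOME=/usr/local/apache-hive-2.1.1-bin
export PATH=$PATH:$HIVE_HOME/bin
Then reload the profile: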
source /etc/profile
(5) Edit the hive-site.xml and hive-env.sh configuration
Modify the contents of hive-site.xml as shown below:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hive.querylog.location</name>
<value>/user/hivetmp</value>
<description>Location of Hive run time structured log file</description>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://192.168.244.3:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false&amp;useUnicode=true&amp;characterEncoding=UTF-8</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://192.168.244.3:9083</value>
</property>
<property>
<name>hive.metastore.local</name>
<value>false</value>
</property>
<property>
<name>hive.cli.print.header</name>
<value>true</value>
</property>
</configuration>
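Note that hive.metastore.warehouse.dir above points at the HDFS path /user/hive, which is assumed to exist and be writable before Hive is used; a minimal sketch of creating it (assuming HDFS is running and the current user may call hdfs dfs):
$hdfs dfs -mkdir -p /user/hive
$hdfs dfs -chmod g+w /user/hive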
Modify the hive-env.sh file as shown below:
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Set Hive and Hadoop environment variables here. These variables can be used
# to control the execution of Hive. It should be used by admins to configure
# the Hive installation (so that users do not have to set environment variables
# or set command line parameters to get correct behavior).
#
# The hive service being invoked (CLI/HWI etc.) is available via the environment
# variable SERVICE
# Hive Client memory usage can be an issue if a large number of clients
# are running at the same time. The flags below have been useful in
# reducing memory usage:
#
# if [ "$SERVICE" = "cli" ]; then
# if [ -z "$DEBUG" ]; then
# export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParNewGC -XX:-UseGCOverheadLimit"
# else
# export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:-UseGCOverheadLimit"
# fi
# fi
# The heap size of the jvm started by hive shell script can be controlled via:
#
# export HADOOP_HEAPSIZE=1024
export HADOOP_HEAPSIZE=1024
#
# Larger heap size may be required when running queries over large number of files or partitions.
# By default hive shell scripts use a heap size of 256 (MB). Larger heap size would also be
# appropriate for hive server (hwi etc).
# Set HADOOP_HOME to point to a specific hadoop install directory
# HADOOP_HOME=${bin}/../../hadoop
HADOOP_HOME=/usr/local/hadoop # set this to your own Hadoop installation path
# Hive Configuration Directory can be controlled by:
# export HIVE_CONF_DIR=
export HIVE_CONF_DIR=/usr/local/apache-hive-2.1.1-bin/conf
# Folder containing extra libraries required for hive compilation/execution can be controlled by:
# export HIVE_AUX_JARS_PATH=
export HIVE_AUX_JARS_PATH=/usr/local/apache-hive-2.1.1-bin/lib
(6) Create a hive user in MySQL and grant it sufficient privileges
$mysql -u root -p
mysql> create user 'hive' identified by '123456';
Query OK, 0 rows affected (0.00 sec)
mysql> grant all privileges on *.* to 'hive' with grant option;
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.01 sec)
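To double-check the new account, you can log in as hive and list the visible databases (a quick sanity check, assuming MySQL runs on this host):
$mysql -u hive -p -e "show databases;"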
(7) Initialize the metastore database
$schematool -initSchema -dbType mysql
When you see the "completed" message, the metastore database has been initialized successfully.
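Optionally, you can confirm that schematool created the metastore tables in the hive database configured above (a sketch assuming the same connection settings as in hive-site.xml):
$mysql -u hive -p -e "use hive; show tables;"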
(8) Change where logs are stored
Hive produces system logs and job logs. The job log location was already configured via the hive.querylog.location property in hive-site.xml, so only the system log location needs to be configured now.
In the conf directory there is a hive-log4j2.properties.template file; make a copy of it:
$cp hive-log4j2.properties.template hive-log4j2.properties
Then modify the following line in hive-log4j2.properties:
property.hive.log.dir = <the path you want>
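For example, to keep the logs under the Hive installation directory (the path here is only an illustrative choice):
property.hive.log.dir = /usr/local/apache-hive-2.1.1-bin/logs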
After completing all of the configuration above, the metastore service needs to be started on this node:
$hive --service metastore &
If there is no output for a while, press Enter; you can use the jobs command to check whether it started successfully.
Once it is running, you can execute the hive command.
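As a quick smoke test of the whole setup, you can open the CLI and run a few simple statements (the table name below is arbitrary):
$hive
hive> show databases;
hive> create table test_tbl (id int);
hive> show tables;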