Hive on Spark Installation
1. Download apache-hive-2.0.0-bin.tar.gz and install it. (Try to use Spark and Hadoop versions that match this Hive release.)
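A minimal sketch of the download and unpack step. The install directory /home/hadoop/app is taken from the paths used later in hive-env.sh; the Apache archive URL is an assumption about where this release is mirrored, so adjust both to your environment.

```shell
# Assumed install directory (matches the paths used in hive-env.sh below).
cd /home/hadoop/app

# Assumed download location for this release on the Apache archive.
wget https://archive.apache.org/dist/hive/hive-2.0.0/apache-hive-2.0.0-bin.tar.gz

# Unpack; this creates apache-hive-2.0.0-bin/
tar -xzf apache-hive-2.0.0-bin.tar.gz
```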
2. Build Spark (without Hive support):
./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.4,parquet-provided"
Copy the spark-assembly jar from the lib directory of the built Spark into Hive's lib directory.
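The copy step above might look like the following. The built Spark directory name and the exact assembly jar version are assumptions (they depend on which Spark release you built); only the Hive path is taken from the configuration below.

```shell
# Hypothetical path to the Spark distribution produced by make-distribution.sh;
# the wildcard matches whichever spark-assembly-*.jar your build produced.
cp spark-*-bin-hadoop2-without-hive/lib/spark-assembly-*.jar \
   /home/hadoop/app/apache-hive-2.0.0-bin/lib/
```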
3. hive-env.sh configuration:
export HIVE_AUX_JARS_PATH=/home/hadoop/app/apache-hive-2.0.0-bin/lib
export HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh5.6.0
export HIVE_CONF_DIR=/home/hadoop/app/apache-hive-2.0.0-bin/conf
export HIVE_HOME=/home/hadoop/app/apache-hive-2.0.0-bin
export JAVA_HOME=/usr/java/jdk1.7.0_79
4. hive-site.xml configuration:
<configuration>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>
  <!-- The metastore is local by default; uncomment to make it non-local.
  <property>
    <name>hive.metastore.local</name>
    <value>false</value>
  </property>
  -->
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://bihdp01:9083</value>
    <description>Thrift uri for the remote metastore. Used by metastore client to connect to remote metastore.</description>
  </property>
  <!-- Hive data directory on HDFS; it must be created manually after starting Hadoop -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/hive/warehouse</value>
  </property>
  <!-- JDBC connection to the hive database in MySQL -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://bihdp01:3306/hiveto?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <!-- MySQL JDBC driver -->
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <!-- MySQL username -->
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>
  <!-- MySQL password -->
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>*********</value>
    <description>password to use against metastore database</description>
  </property>
  <!-- When set to false, queries run as the user that runs the hiveserver2 process -->
  <property>
    <name>hive.server2.enable.doAs</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>bihdp01</value>
  </property>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
  </property>
  <property>
    <name>hive.exec.parallel</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.exec.dynamic.partition.mode</name>
    <value>strict</value>
  </property>
  <property>
    <name>hive.exec.compress.intermediate</name>
    <value>true</value>
  </property>
  <!-- Hive web interface (HWI): listen host, port, and path to the war file -->
  <property>
    <name>hive.hwi.listen.host</name>
    <value>bihdp01</value>
  </property>
  <property>
    <name>hive.hwi.listen.port</name>
    <value>9999</value>
  </property>
  <property>
    <name>hive.hwi.war.file</name>
    <value>lib/hive-hwi-1.2.1.war</value>
  </property>
  <property>
    <name>spark.eventLog.enabled</name>
    <value>true</value>
  </property>
  <!-- This HDFS directory must exist -->
  <property>
    <name>spark.eventLog.dir</name>
    <value>hdfs:///hive_on_sparklogs</value>
  </property>
  <property>
    <name>spark.executor.memory</name>
    <value>512m</value>
  </property>
  <property>
    <name>spark.serializer</name>
    <value>org.apache.spark.serializer.KryoSerializer</value>
  </property>
</configuration>
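The configuration comments above note that the warehouse directory and the Spark event-log directory must exist on HDFS before use. A sketch of creating them, assuming the paths from hive-site.xml:

```shell
# Create the HDFS directories referenced in hive-site.xml.
hdfs dfs -mkdir -p /hive/warehouse      # hive.metastore.warehouse.dir
hdfs dfs -mkdir -p /hive_on_sparklogs   # spark.eventLog.dir
```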
5. Start Hive and switch the execution engine:
set hive.execution.engine=spark;
set spark.master=yarn-cluster; (currently the only mode verified to work in this setup)
(Both settings can also be placed in hive-site.xml.)
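Putting step 5 together: since hive-site.xml points at a remote metastore (thrift://bihdp01:9083), that service has to be running before the CLI starts. A sketch, where the table name in the commented query is a placeholder:

```shell
# Start the remote metastore service configured in hive-site.xml.
nohup hive --service metastore > metastore.log 2>&1 &

# Open the Hive CLI and switch the execution engine to Spark on YARN.
hive <<'EOF'
set hive.execution.engine=spark;
set spark.master=yarn-cluster;
-- Any query that launches a job now runs on Spark, e.g.:
-- select count(*) from your_table;
EOF
```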
Issues encountered along the way: http://91r.net/ask/31228420.html