Workflow for Operating Hive from Spark
1. Install MySQL on Ubuntu.
2. Log in to MySQL.
3. mysql> create database hive;  (this database will hold the metadata for the databases and tables you later create in Hive; it stores no actual table data)
4. mysql> grant all on *.* to hive@localhost identified by 'hive';  # grant the hive user all privileges on every table of every database; the password 'hive' must match the connection password configured in hive-site.xml
5. mysql> flush privileges;  # reload MySQL's privilege tables
To start Hive you must first start Hadoop: Hive is a data warehouse built on top of Hadoop, and every HiveQL query is ultimately compiled by Hive into MapReduce jobs that Hadoop executes. So bring up Hadoop first, then Hive:
start-dfs.sh  (Hadoop)
hive  (works directly in the shell if you added Hive to your PATH in ~/.bashrc)
6. If everything worked, create a database in Hive:
hive> create database if not exists hive;
hive> use hive;
7. Create a table in Hive:
hive> create table if not exists student(
> id int,
> name string,
> gender string,
> age int);
8. Insert a row: insert into student values(1,'xiaodou','B',28);
9. select * from student;
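Since the metastore lives in the MySQL `hive` database created in step 3, you can confirm from MySQL that only metadata was recorded there. A sketch, assuming the standard Hive metastore schema (table names `DBS` and `TBLS`; columns can vary slightly across Hive versions):

```shell
# Inspect the Hive metastore stored in MySQL. DBS/TBLS come from the
# standard metastore schema; only metadata lives here, never table rows.
mysql -u hive -phive hive -e "select NAME from DBS; select TBL_NAME from TBLS;"
```

You should see the `hive` database and the `student` table listed, while the actual rows stay in HDFS.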
10. Connecting to Hive from Spark to read and write data
11. cd /usr/local2/spark/conf
vim spark-env.sh:
export SPARK_DIST_CLASSPATH=$(/usr/local2/hadoop/bin/hadoop classpath)
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export CLASSPATH=$CLASSPATH:/usr/local2/hive/lib
export SCALA_HOME=/usr/local/scala
export HADOOP_CONF_DIR=/usr/local2/hadoop/etc/hadoop
export HIVE_CONF_DIR=/usr/local2/hive/conf
export SPARK_CLASSPATH=$SPARK_CLASSPATH:/usr/local2/hive/lib/mysql-connector-java-5.1.40-bin.jar  # this line did not take effect for me; cause still to be investigated
12. To let Spark access Hive, copy Hive's configuration file hive-site.xml into Spark's conf directory.
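The copy itself is a single command (paths assumed from the /usr/local2 layout used throughout this guide):

```shell
# Copy the metastore config so Spark uses the same MySQL-backed
# metastore as Hive; adjust paths if your install locations differ.
cp /usr/local2/hive/conf/hive-site.xml /usr/local2/spark/conf/
```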
hive-site.xml:
(this file is one you create yourself in Hive's conf directory; it does not ship with the source. Also remember to rename hive-default.xml.template to hive-default.xml)
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
    <description>password to use against metastore database</description>
  </property>
</configuration>
13. You can now operate Hive from spark-shell (make sure the connector jar version below matches the one actually present in /usr/local2/hive/lib):
./spark-shell --driver-class-path /usr/local2/hive/lib/mysql-connector-java-5.1.44-bin.jar
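As a quick smoke test, spark-shell also accepts statements on stdin, so the table created above can be queried non-interactively. A sketch, assuming Spark 2.x, where spark-shell pre-creates a `spark` SparkSession with Hive support once hive-site.xml is in Spark's conf directory:

```shell
# Pipe a HiveQL query through spark-shell; `spark` is the session
# object pre-built by spark-shell (Spark 2.x), reading the MySQL-backed
# metastore configured in hive-site.xml.
echo 'spark.sql("use hive"); spark.sql("select * from student").show()' | \
  ./spark-shell --driver-class-path /usr/local2/hive/lib/mysql-connector-java-5.1.44-bin.jar
```

If everything is wired up, the row inserted in step 8 appears in the printed table.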