Hive 1.2.1 Installation and Testing
1 Install MySQL
Install the server:
sudo apt-get install mysql-server
(Note: during installation you will be asked to set a password for the root user.)
Install the client:
sudo apt-get install mysql-client
sudo apt-get install libmysqlclient-dev
Check whether the MySQL service is running; if output like the following appears, the installation succeeded:
sudo netstat -tap|grep mysql
Check the database character sets:
show variables like 'character%';
We can see that the client uses utf8 while the server uses latin1 (the system is also utf8). Since the development environment usually uses utf8, we change the server's encoding here.
To set the database encoding, open the configuration file /etc/mysql/mysql.conf.d/mysqld.cnf
and add or modify the following:
[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock
# add the client character encoding
default-character-set=utf8
[mysqld]
# add the server-side character encoding
character-set-server=utf8
collation-server=utf8_general_ci
# comment out bind-address
# bind-address = 127.0.0.1
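After saving the file, restart MySQL so the new settings take effect (on Ubuntu; on systemd-based systems sudo systemctl restart mysql works as well):
sudo service mysql restart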
Check the encoding again:
show variables like 'character%';
2 Create a hive database in MySQL and create a user table inside it
Log in to the database (as root):
mysql -u root -p
Create the hive database:
create database hive;
Switch to the hive database:
use hive;
Create the user table inside the hive database:
create table user(Host char(20),User char(10),Password char(20));
3 Log in to MySQL and grant privileges (change the password to your own)
mysql -u root -p
insert into user(Host,User,Password) values("localhost","hive","hive");
FLUSH PRIVILEGES;
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'localhost' IDENTIFIED BY 'hive';
FLUSH PRIVILEGES;
4 After extracting the Hive package, configure the environment variables
#hive
export HIVE_HOME=/home/sendi/apache-hive-1.2.1-bin
export PATH=$PATH:$HIVE_HOME/bin
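These exports typically go into the shell profile (for example ~/.bashrc); reload it so the new variables take effect:
source ~/.bashrc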
5 Copy the template files under hive/conf
cp hive-env.sh.template hive-env.sh
cp hive-default.xml.template hive-site.xml
6 Edit hive-env.sh and set HADOOP_HOME
vim hive-env.sh
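A minimal sketch of the line to add in hive-env.sh (the path matches the Hadoop install used in step 8; adjust it to your own layout):
export HADOOP_HOME=/home/hjr/develop/cluster/hadoop-2.6.5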
7 Edit hive-site.xml and set the MySQL JDBC driver, database name, user name and password
vim hive-site.xml
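A sketch of the four metastore connection properties to change (user name and password match the GRANT statement above; note that & must be written as &amp; inside XML values):
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive</value>
</property>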
8 Edit hive-config.sh under hive/bin and set JAVA_HOME, HADOOP_HOME and HIVE_HOME
# Hadoop/Java Home --- 2016-12-23 hjr
export JAVA_HOME=/home/hjr/develop/java8/jdk1.8.0_101
export HADOOP_HOME=/home/hjr/develop/cluster/hadoop-2.6.5
export HIVE_HOME=/home/hjr/develop/apache-hive-1.2.1-bin
9 Download mysql-connector-java-5.1.40-bin.jar and put it into the $HIVE_HOME/lib directory
Verify that the jar is in place:
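For example (assuming HIVE_HOME is exported as in step 4):
ls $HIVE_HOME/lib | grep mysql-connector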
10 Create /tmp and /user/hive/warehouse in HDFS and set their permissions
hadoop fs -mkdir -p /tmp
hadoop fs -mkdir -p /user/hive/warehouse
hadoop fs -chmod g+w /tmp
hadoop fs -chmod g+w /user/hive/warehouse
11 Start Hive
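With the environment variables from step 4 in place, the CLI can be launched directly from the shell:
hive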
12 On startup you may run into the following problem
Classic error: Found class jline.Terminal
[ERROR] Terminal initialization failed; falling back to unsupported
java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected
at jline.TerminalFactory.create(TerminalFactory.java:101)
at jline.TerminalFactory.get(TerminalFactory.java:158)
at jline.console.ConsoleReader.<init>(ConsoleReader.java:229)
at jline.console.ConsoleReader.<init>(ConsoleReader.java:221)
at jline.console.ConsoleReader.<init>(ConsoleReader.java:209)
at org.apache.hadoop.hive.cli.CliDriver.setupConsoleReader(CliDriver.java:787)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:721)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected
at jline.console.ConsoleReader.<init>(ConsoleReader.java:230)
at jline.console.ConsoleReader.<init>(ConsoleReader.java:221)
at jline.console.ConsoleReader.<init>(ConsoleReader.java:209)
at org.apache.hadoop.hive.cli.CliDriver.setupConsoleReader(CliDriver.java:787)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:721)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
13 Found class jline.Terminal: the cause is an old jline version under the Hadoop directory
Fix:
(1) Go into Hive's lib directory and copy the newer jline jar into the following Hadoop directory:
hjr@hjr:~/develop/apache-hive-1.2.1-bin/lib$ cp jline-2.12.jar /home/hjr/develop/cluster/hadoop-2.6.5/share/hadoop/yarn/lib
(2) Delete Hadoop's old jline jar (see the example below).
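For example (the old jar name is an assumption; Hadoop 2.6.x typically ships jline-0.9.94.jar under share/hadoop/yarn/lib):
rm /home/hjr/develop/cluster/hadoop-2.6.5/share/hadoop/yarn/lib/jline-0.9.94.jar
Alternatively, exporting HADOOP_USER_CLASSPATH_FIRST=true before starting Hive lets the newer jline on Hive's classpath take precedence.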
14 Start Hive again; this time it starts successfully
Problem (1):
Fri Dec 23 19:13:35 CST 2016 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
This warning means that newer MySQL versions (5.5.45+, 5.6.26+ and 5.7.6+) expect SSL connections by default…
Modify Hive's hive-site.xml and append &characterEncoding=UTF-8&useSSL=false to the connection URL so that SSL is not used (this also sets the connection string's character encoding). Note that inside the XML value the & must be escaped as &amp;:
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8&amp;useSSL=false</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
15 Hive test case
1 Test data (each line of the MovieLens ml-1m users.dat file has the format UserID::Gender::Age::Occupation::Zip-code):
1::F::1::10::48067
2::M::56::16::70072
3::M::25::15::55117
4::M::45::7::02460
5::M::25::20::55455
6::F::50::9::55117
7::M::35::1::06810
8::M::25::12::11413
9::M::25::17::61614
10::F::35::1::95370
11::F::25::1::04093
12::M::25::12::32793
13::M::45::1::93304
14::M::35::0::60126
15::M::25::7::22903
16::F::35::0::20670
17::M::50::1::95350
18::F::18::3::95825
19::M::1::10::48073
20::M::25::14::55113
21::M::18::16::99353
22::M::18::15::53706
23::M::35::0::90049
24::F::25::7::10023
25::M::18::4::01609
26::M::25::7::23112
27::M::25::11::19130
28::F::25::1::14607
29::M::35::7::33407
30::F::35::7::19143
31::M::56::7::06840
32::F::25::0::19355
33::M::45::3::55421
34::F::18::0::02135
35::M::45::1::02482
36::M::25::3::94123
37::F::25::9::66212
38::F::18::4::02215
39::M::18::4::61820
40::M::45::0::10543
41::F::18::4::15116
42::M::25::8::24502
43::M::25::12::60614
44::M::45::17::98052
45::F::45::16::94110
46::M::18::19::75602
47::M::18::4::94305
48::M::25::4::92107
49::M::18::12::77084
50::F::25::2::98133
51::F::1::10::10562
52::M::18::4::72212
53::M::25::0::96931
54::M::50::1::56723
55::F::35::12::55303
56::M::35::20::60440
57::M::18::19::30350
58::M::25::2::30303
59::F::50::1::55413
60::M::50::1::72118
61::M::25::17::95122
62::F::35::3::98105
63::M::18::4::54902
64::M::18::1::53706
65::M::35::12::55803
66::M::25::18::57706
67::F::50::5::60181
68::M::18::4::53706
69::F::25::1::02143
70::M::18::4::53703
71::M::25::14::95008
72::F::45::0::55122
73::M::18::4::53706
74::M::35::14::94530
75::F::1::10::01748
76::M::35::7::55413
77::M::18::4::15321
78::F::45::1::98029
79::F::45::0::98103
80::M::56::1::49327
81::F::25::0::60640
82::M::25::17::48380
83::F::25::2::94609
84::M::18::4::53140
85::M::18::4::94945
86::F::1::10::54467
87::M::25::14::48360
88::F::45::1::02476
89::F::56::9::85749
90::M::56::13::85749
Create the users table structure:
CREATE TABLE users(UserID BigInt,Gender String,Age Int,Occupation String,Zipcode String) PARTITIONED BY (dt String) ROW FORMAT DELIMITED FIELDS TERMINATED BY '::';
load data inpath '/moviesDataSet/ml-1m/users.dat' into table users PARTITION(dt=20161223);
select count(1) from users;
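Once the partition is loaded, ordinary HiveQL queries run against it; for instance (an illustrative query, not part of the original test):
select gender, count(*) from users where dt='20161223' group by gender;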
Appendix
Name node is in safe mode.
Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot create directory /tmp/hive/hjr/0cf97b04-c1d7-474b-9a0f-78dba3168259. Name node is in safe mode.
The reported blocks 104 has reached the threshold 0.9990 of total blocks 104. The number of live datanodes 1 has reached the minimum number 0. In safe mode extension. Safe mode will be turned off automatically in 12 seconds.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1366)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4258)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4233)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:853)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:975)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2036)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2034)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot create directory /tmp/hive/hjr/0cf97b04-c1d7-474b-9a0f-78dba3168259. Name node is in safe mode.
The reported blocks 104 has reached the threshold 0.9990 of total blocks 104. The number of live datanodes 1 has reached the minimum number 0. In safe mode extension. Safe mode will be turned off automatically in 12 seconds.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1366)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4258)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4233)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:853)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:975)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2036)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2034)
at org.apache.hadoop.ipc.Client.call(Client.java:1470)
at org.apache.hadoop.ipc.Client.call(Client.java:1401)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy15.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:539)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy16.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2742)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2713)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:870)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:866)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem
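As the message itself notes, the NameNode leaves safe mode automatically after a short wait; if you do not want to wait, it can be turned off manually with the standard HDFS admin command:
hdfs dfsadmin -safemode leave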