Integrating HUE with HDFS and MapReduce
阿新 • Published: 2018-12-24
HUE (Hadoop User Experience) is an open-source Hadoop UI and management tool built on a Python web framework. Through HUE we can interact with a Hadoop cluster from a browser-based web console to analyze and process data.
Official download page: http://gethue.com/category/release/
Environment and software
OS: CentOS 6.5, three machines, running a Hadoop cluster
Software: hue-3.7.0-cdh5.3.6.tar.gz
| mini01 | mini02 | mini03 |
|---|---|---|
| NameNode | SecondaryNameNode | |
| DataNode | DataNode | DataNode |
| ResourceManager | JobHistoryServer | |
| NodeManager | NodeManager | NodeManager |
| HUE | | |
1. Prepare the dependencies
The required packages are collected below and can be installed directly with yum:
[hadoop@mini01 ~]$ sudo yum install -y ant asciidoc cyrus-sasl-devel cyrus-sasl-gssapi cyrus-sasl-plain gcc gcc-c++ krb5-devel libffi-devel libxml2-devel libxslt-devel make mysql mysql-devel openldap-devel python-devel sqlite-devel gmp-devel
2. Extract HUE
[hadoop@mini01 tools]$ tar -zxvf hue-3.7.0-cdh5.3.6.tar.gz -C ../install
3. Build HUE
[hadoop@mini01 tools]$ cd ../install/hue-3.7.0-cdh5.3.6/
[hadoop@mini01 hue-3.7.0-cdh5.3.6]$ make apps
4. Configure HUE
Edit the hue.ini file
Path: /home/hadoop/install/hue-3.7.0-cdh5.3.6/desktop/conf/hue.ini
Change the following settings:
secret_key=jFE93j;2[290-eiw.KEiwN2s3['d;/.q[eIW^y#e=+Iei*@Mn<qW5o
http_host=mini01
http_port=8888
time_zone=Asia/Shanghai
# Webserver runs as this user
server_user=hue
server_group=hue
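The secret_key above is only an example; any random string of 30 to 60 characters will do, and it should be kept private. A minimal way to generate one (a sketch; it assumes the openssl CLI is available, which it is on a stock CentOS install):

```shell
# Generate a random 50-character hex secret_key for hue.ini
# (hex output avoids any ini-escaping surprises).
openssl rand -hex 25
```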
Integrate with HDFS using the following configuration:
[[hdfs_clusters]]
# HA support by using HttpFs
#If HDFS is configured for high availability, you must go through HttpFs
[[[default]]] #I did not configure HA, so the NameNode port is 9000; with HA it would be 8020
# Enter the filesystem uri
fs_defaultfs=hdfs://mini01:9000
# NameNode logical name.
## logical_name=
# Use WebHdfs/HttpFs as the communication mechanism.
# Domain should be the NameNode or HttpFs host.
# Default port is 14000 for HttpFs.
## webhdfs_url=http://localhost:50070/webhdfs/v1
#With HA configured, this would point at HttpFs on port 14000
webhdfs_url=http://mini01:50070/webhdfs/v1
# Change this if your HDFS cluster is Kerberos-secured
##security_enabled=false
# Default umask for file and directory creation, specified in an octal value.
## umask=022
# Directory of the Hadoop configuration
## hadoop_conf_dir=$HADOOP_CONF_DIR when set or '/etc/hadoop/conf'
#Paths to the Hadoop configuration files and installation
hadoop_conf_dir=/home/hadoop/install/hadoop-2.5.0-cdh5.3.6/etc/hadoop
hadoop_hdfs_home=/home/hadoop/install/hadoop-2.5.0-cdh5.3.6
hadoop_bin=/home/hadoop/install/hadoop-2.5.0-cdh5.3.6/bin
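Before restarting HUE it is worth checking that the WebHDFS endpoint configured above is the right one. A small sketch that just prints the URL HUE will call (the host and port match this non-HA cluster; with HttpFs the port would be 14000):

```shell
# LISTSTATUS is the WebHDFS REST operation HUE uses to browse a directory.
WEBHDFS_URL="http://mini01:50070/webhdfs/v1"
echo "${WEBHDFS_URL}/?op=LISTSTATUS&user.name=hue"
```

Fetching the printed URL with curl should return a JSON FileStatuses document once dfs.webhdfs.enabled is set to true (step 5.2).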
Integrate with YARN using the following configuration:
[[yarn_clusters]]
[[[default]]]
# Enter the host on which you are running the ResourceManager
## resourcemanager_host=localhost
resourcemanager_host=mini01
# The port where the ResourceManager IPC listens on
## resourcemanager_port=8032
resourcemanager_port=8032
# Whether to submit jobs to this cluster
submit_to=True
# Resource Manager logical name (required for HA)
## logical_name=
# Change this if your YARN cluster is Kerberos-secured
## security_enabled=false
# URL of the ResourceManager API
## resourcemanager_api_url=http://localhost:8088
resourcemanager_api_url=http://mini01:8088
# URL of the ProxyServer API
## proxy_api_url=http://localhost:8088
# URL of the HistoryServer API
## history_server_api_url=http://localhost:19888
#JobHistoryServer endpoint
history_server_api_url=http://mini02:19888
# In secure mode (HTTPS), if SSL certificates from Resource Manager's
# Rest Server have to be verified against certificate authority
## ssl_cert_ca_verify=False
# HA support by specifying multiple clusters
# e.g.
#HA configuration
# [[[ha]]]
# Resource Manager logical name (required for HA)
## logical_name=my-rm-name
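HUE talks to the ResourceManager and the JobHistoryServer through their REST APIs, so the endpoints configured above can be spot-checked directly. A sketch that prints the two info URLs (the hosts match this cluster's layout):

```shell
# Cluster-info endpoints of the YARN and MapReduce history REST APIs;
# fetch them with curl once the daemons are running.
echo "http://mini01:8088/ws/v1/cluster/info"    # ResourceManager
echo "http://mini02:19888/ws/v1/history/info"   # JobHistoryServer
```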
5. Edit the Hadoop configuration files
5.1 core-site.xml
<property>
<!--Allow the hue user to proxy from any host-->
<name>hadoop.proxyuser.hue.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hue.groups</name>
<value>*</value>
</property>
5.2 hdfs-site.xml
<!--Enable WebHDFS (REST API) in Namenodes and Datanodes-->
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<!--Disable HDFS permission checking-->
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
5.3 httpfs-site.xml
<property>
<name>httpfs.proxyuser.hue.hosts</name>
<value>*</value>
</property>
<property>
<name>httpfs.proxyuser.hue.groups</name>
<value>*</value>
</property>
<!--The two properties above are required when the HUE service does not run on the same node as the Hadoop services.
Note:
* Without NameNode HA, HUE can manage HDFS via WebHDFS
* With NameNode HA, HUE must use HttpFS to manage HDFS-->
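The proxyuser entries exist because HUE authenticates to Hadoop as the hue user and then impersonates whoever is logged in to the web UI. With pseudo authentication, that impersonation is expressed through the standard doas query parameter; a sketch of the kind of request involved (admin is a hypothetical web-UI user, 14000 is the default HttpFs port):

```shell
# List / as user 'admin' while authenticating as the 'hue' proxy user.
echo "http://mini01:14000/webhdfs/v1/?op=LISTSTATUS&user.name=hue&doas=admin"
```

Without the proxyuser properties above, such a request would be rejected with an impersonation error.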
5.4 Distribute the Hadoop configuration files
xsync install/hadoop-2.5.0-cdh5.3.6/etc/hadoop
xsync is a small rsync wrapper; the script is as follows:
#!/bin/bash
#1 Get the number of arguments; exit immediately if there are none
pcount=$#
if((pcount==0));then
echo no args;
exit;
fi
#2 Get the file name
p1=$1
fname=`basename $p1`
echo fname=$fname
#3 Resolve the parent directory to an absolute path
pdir=`cd -P $(dirname $p1); pwd`
echo pdir=$pdir
#4 Get the current user name
user=`whoami`
#5 Loop over the hosts
for((host=1; host<4; host++)); do
echo $pdir/$fname $user@mini0$host:$pdir
echo --------------- mini0$host ----------------
rsync -rvl $pdir/$fname $user@mini0$host:$pdir
done
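The path handling in steps 2 and 3 of the script is easy to check in isolation; for a hypothetical argument like /tmp/xsync-demo/etc/hadoop it resolves as follows:

```shell
# Worked example of xsync's basename/dirname logic (demo path, not a real one).
p1="/tmp/xsync-demo/etc/hadoop"
mkdir -p "$p1"                      # create it so `cd -P` succeeds
fname=`basename $p1`                # -> hadoop
pdir=`cd -P $(dirname $p1); pwd`    # absolute path of the parent directory
echo fname=$fname
echo pdir=$pdir
```

This is why xsync can be given either a relative or an absolute path: rsync always receives the absolute parent directory plus the leaf name.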
5.5 Start the HttpFs service
[hadoop@mini01 install]$ ~/install/hadoop-2.5.0-cdh5.3.6/sbin/httpfs.sh start
6. Test
6.1 Start HDFS
[hadoop@mini01 install]$ start-dfs.sh
6.2 Start YARN
[hadoop@mini01 install]$ start-yarn.sh
6.3 Start the HUE service
[hadoop@mini01 install]$ ~/install/hue-3.7.0-cdh5.3.6/build/env/bin/supervisor
7. Results
[hadoop@mini01 sbin]$ xcall.sh jps
============= mini01 jps =============
1344 DataNode
1602 ResourceManager
1250 NameNode
1701 NodeManager
2045 Jps
============= mini02 jps =============
1635 NodeManager
1848 Jps
1530 DataNode
============= mini03 jps =============
1302 NodeManager
1448 Jps
1197 SecondaryNameNode
1134 DataNode
[hadoop@mini01 sbin]$
If the following output appears when the HUE service starts, it started successfully:
[INFO] Not running as root, skipping privilege drop
starting server with options {'ssl_certificate': None, 'workdir': None, 'server_name': 'localhost', 'host': '192.168.13.128', 'daemonize': False, 'threads': 10, 'pidfile': None, 'ssl_private_key': None, 'server_group': 'hue', 'ssl_cipher_list': 'DEFAULT:!aNULL:!eNULL:!LOW:!EXPORT:!SSLv2', 'port': 8888, 'server_user': 'hue'}
To test, open the web UI at mini01:8888.
The first login asks you to create an account; we create admin with password admin.
After logging in, the interface looks like this:
Choose File Browser in the upper right corner to manage files on HDFS: you can upload, download, view file contents, and so on.
YARN jobs are managed under Job Browser.