Hadoop Notes, Part 13: Installing Hue, with Examples
1. Installing Hue
1) Check network connectivity

    $ ping www.baidu.com
    PING www.a.shifen.com (115.239.210.27) 56(84) bytes of data.
    64 bytes from 115.239.210.27: icmp_seq=1 ttl=128 time=6.49 ms
    64 bytes from 115.239.210.27: icmp_seq=2 ttl=128 time=6.49 ms
2) Install the build dependencies

    yum -y install ant asciidoc cyrus-sasl-devel cyrus-sasl-gssapi gcc gcc-c++ krb5-devel libtidy libxml2-devel libxslt-devel openldap-devel python-devel sqlite-devel openssl-devel mysql-devel gmp-devel mysql-server
(Both the rpm and yum commands install RPM packages; yum repository definitions live under /etc/yum.repos.d/.)
Install MySQL:

    # yum -y install mysql mysql-devel mysql-server

(mysql provides the basic client commands; mysql-server provides the MySQL server daemon.)
3) Install Hue

Unpack the tarball, then build:

    $ make apps

Edit hue.ini:

    [desktop]
    secret_key=jFE93j;2[290-eiw.KEiwN2s3['d;/.q[eIW^y#e=+Iei*@Mn<qW5o
    http_host=master
    http_port=8888
    time_zone=Asia/Shanghai
4) Start the Hue server

    $ build/env/bin/supervisor

To stop Hue:

    kill -9 `ps -ef|grep supervisor |grep -v 'grep' | awk '{print $2}'`
    kill -9 `netstat -antp|grep 8888|awk '{print $7}' |awk -F'/' '{print $1}'`
5) Web UI address: http://master:8888
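As a quick sanity check, the UI endpoint can be probed from the shell before opening a browser (a sketch; assumes the supervisor above is already running on master):

```shell
# Expect 200 (or a 302 redirect to the login page) from the Hue web UI;
# host and port match the [desktop] settings in hue.ini above.
curl -s -o /dev/null -w '%{http_code}\n' http://master:8888/
```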
2. Integrating Hue with Hadoop
In Hadoop's core-site.xml:

    <!-- Hue -->
    <property>
        <name>hadoop.proxyuser.hue.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hue.groups</name>
        <value>*</value>
    </property>

In Hadoop's hdfs-site.xml:

    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
Restart the HDFS service so both changes take effect.
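With dfs.webhdfs.enabled on and HDFS restarted, the WebHDFS endpoint that Hue will use can be exercised directly (a sketch; the user.name parameter is an assumption matching the proxy user configured above):

```shell
# A JSON FileStatuses listing of / confirms WebHDFS is reachable on the
# NameNode; this is the same URL that goes into webhdfs_url in hue.ini.
curl -s 'http://master:50070/webhdfs/v1/?op=LISTSTATUS&user.name=hue'
```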
Configure hue.ini:
[hadoop]
    # Configuration for HDFS NameNode
    # ------------------------------------------------------------------------
    [[hdfs_clusters]]
      # HA support by using HttpFs
      [[[default]]]
        # Enter the filesystem uri
        fs_defaultfs=hdfs://master:8020

        # NameNode logical name.
        ## logical_name=

        # Use WebHdfs/HttpFs as the communication mechanism.
        # Domain should be the NameNode or HttpFs host.
        # Default port is 14000 for HttpFs.
        webhdfs_url=http://master:50070/webhdfs/v1

        # Change this if your HDFS cluster is Kerberos-secured
        ## security_enabled=false

        # Default umask for file and directory creation, specified in an octal value.
        ## umask=022

        # Directory of the Hadoop configuration
        hadoop_conf_dir=/opt/modules/hadoop-2.5.0-cdh5.3.6/etc/hadoop
        hadoop_hdfs_home=/opt/modules/hadoop-2.5.0-cdh5.3.6/
        hadoop_bin=/opt/modules/hadoop-2.5.0-cdh5.3.6/bin/
    [[yarn_clusters]]
      [[[default]]]
        # Enter the host on which you are running the ResourceManager
        resourcemanager_host=master

        # The port where the ResourceManager IPC listens on
        resourcemanager_port=8032

        # Whether to submit jobs to this cluster
        submit_to=True

        # Resource Manager logical name (required for HA)
        ## logical_name=

        # Change this if your YARN cluster is Kerberos-secured
        ## security_enabled=false

        # URL of the ResourceManager API
        resourcemanager_api_url=http://master:8088

        # URL of the ProxyServer API
        proxy_api_url=http://master:8088

        # URL of the HistoryServer API
        history_server_api_url=http://master:19888
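The REST URLs above can be verified against the live ResourceManager before restarting Hue (a sketch, assuming the cluster from this walkthrough):

```shell
# A JSON clusterInfo payload confirms resourcemanager_api_url is correct;
# http://master:19888/ws/v1/history/info does the same for the HistoryServer.
curl -s http://master:8088/ws/v1/cluster/info
```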
Manually kill the Hue process (start it again afterwards so the new configuration is picked up):

    kill -9 `ps -ef|grep supervisor |grep -v 'grep' | awk '{print $2}'`
    kill -9 `netstat -antp|grep 8888|awk '{print $7}' |awk -F'/' '{print $1}'`
3. Integrating Hue with MySQL
    [[[mysql]]]
      # Name to show in the UI.
      nice_name="My SQL DB"

      # For MySQL and PostgreSQL, name is the name of the database.
      # For Oracle, Name is instance of the Oracle server. For express edition
      # this is 'xe' by default.
      ## name=mysqldb

      # Database backend to use. This can be:
      # 1. mysql
      # 2. postgresql
      # 3. oracle
      engine=mysql

      # IP or hostname of the database to connect to.
      host=master

      # Port the database server is listening to. Defaults are:
      # 1. MySQL: 3306
      # 2. PostgreSQL: 5432
      # 3. Oracle Express Edition: 1521
      port=3306

      # Username to authenticate with when connecting to the database.
      user=root

      # Password matching the username to authenticate with when
      # connecting to the database.
      password=123456
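Before restarting Hue, it is worth confirming that the credentials above actually work from the Hue host (a sketch using the same settings; a password on the command line is acceptable only for testing):

```shell
# Connect with the exact host/port/user/password Hue will use; an
# "Access denied" error here would break Hue's DB query app in the same way.
mysql -h master -P 3306 -u root -p123456 -e 'SELECT VERSION();'
```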
4. Integrating Hue with Hive
hive-site.xml:

    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://master:9083</value>
    </property>
    <property>
        <name>hive.server2.long.polling.timeout</name>
        <value>5000</value>
    </property>
Start the services:

    $ bin/hive --service metastore &
    $ bin/hive --service hiveserver2 &
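Both services listen on TCP ports (9083 for the metastore, 10000 for HiveServer2 by default), so a quick port check shows whether they came up (a sketch; the flags assume Linux netstat):

```shell
# Expect one LISTEN line each for the metastore (9083) and HiveServer2 (10000).
netstat -tln | grep -E ':(9083|10000) '
```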
Edit hue.ini:

    [beeswax]
      # Host where HiveServer2 is running.
      # If Kerberos security is enabled, use fully-qualified domain name (FQDN).
      hive_server_host=master

      # Port where HiveServer2 Thrift server runs on.
      hive_server_port=10000

      # Hive configuration directory, where hive-site.xml is located
      hive_conf_dir=/opt/modules/hive-0.13.1-cdh5.3.6/conf

      # Timeout in seconds for thrift calls to Hive service
      server_conn_timeout=120
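Independently of Hue, the HiveServer2 endpoint configured above can be exercised with Beeline (a sketch; the login user name here is an assumption):

```shell
# Connect over JDBC exactly as Hue's Beeswax app will, and run a trivial
# query to prove the Thrift endpoint at master:10000 is serving.
beeline -u jdbc:hive2://master:10000 -n beifeng -e 'SHOW DATABASES;'
```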
5. Integrating Hue with Oozie
hue.ini:

    [liboozie]
      # The URL where the Oozie service runs on. This is required in order for
      # users to submit jobs. Empty value disables the config check.
      oozie_url=http://master:11000/oozie

      # Requires FQDN in oozie_url if enabled
      ## security_enabled=false

      # Location on HDFS where the workflows/coordinator are deployed when submitted.
      # ("deployement" is the actual, misspelled key name used by hue.ini.)
      remote_deployement_dir=/user/beifeng/oozie-apps
    [oozie]
      # Location on local FS where the examples are stored.
      local_data_dir=/opt/modules/oozie-4.0.0-cdh5.3.6/oozie-apps

      # Location on local FS where the data for the examples is stored.
      sample_data_dir=/opt/modules/oozie-4.0.0-cdh5.3.6/oozie-apps

      # Location on HDFS where the oozie examples and workflows are stored.
      remote_data_dir=/user/beifeng/oozie-apps

      # Maximum of Oozie workflows or coordinators to retrieve in one API call.
      oozie_jobs_count=100

      # Use Cron format for defining the frequency of a Coordinator instead of
      # the old frequency number/unit.
      enable_cron_scheduling=true

oozie-site.xml:

    <property>
        <name>oozie.service.ProxyUserService.proxyuser.hue.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>oozie.service.ProxyUserService.proxyuser.hue.groups</name>
        <value>*</value>
    </property>
    <property>
        <name>oozie.processing.timezone</name>
        <value>UTC</value>
    </property>
Start the Oozie server:

    $ bin/oozied.sh start
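Once the server is up, its health can be checked against the same URL the [liboozie] section points at (a sketch using the Oozie client from the install's bin/ directory):

```shell
# "System mode : NORMAL" in the output confirms the Oozie server behind
# oozie_url is up and accepting jobs.
bin/oozie admin -oozie http://master:11000/oozie -status
```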
Open up permissions on HDFS /tmp (used as scratch/staging space by Hive and submitted jobs):

    $ bin/hdfs dfs -chmod -R 777 /tmp
Exercise: submit a WordCount.jar job.
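One way to complete the exercise is a one-action Oozie workflow wrapping the jar. The sketch below is illustrative only: the WordCount.jar would sit in a lib/ directory next to this workflow.xml under the remote_deployement_dir configured above, and the mapper/reducer class names and input/output paths are hypothetical placeholders.

```xml
<workflow-app xmlns="uri:oozie:workflow:0.5" name="wordcount-wf">
    <start to="mr-node"/>
    <action name="mr-node">
        <map-reduce>
            <job-tracker>master:8032</job-tracker>
            <name-node>hdfs://master:8020</name-node>
            <configuration>
                <!-- Use the new MapReduce API; classes come from lib/WordCount.jar -->
                <property>
                    <name>mapred.mapper.new-api</name>
                    <value>true</value>
                </property>
                <property>
                    <name>mapred.reducer.new-api</name>
                    <value>true</value>
                </property>
                <property>
                    <name>mapreduce.job.map.class</name>
                    <value>com.example.WordCountMapper</value>
                </property>
                <property>
                    <name>mapreduce.job.reduce.class</name>
                    <value>com.example.WordCountReducer</value>
                </property>
                <property>
                    <name>mapreduce.input.fileinputformat.inputdir</name>
                    <value>/user/beifeng/wordcount/input</value>
                </property>
                <property>
                    <name>mapreduce.output.fileoutputformat.outputdir</name>
                    <value>/user/beifeng/wordcount/output</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>WordCount failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

The workflow could then be uploaded to HDFS and run with `oozie job -oozie http://master:11000/oozie -config job.properties -run`, or built and submitted interactively in Hue's workflow editor.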