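Building a CDH6 cluster from scratch
This article draws on:
http://blog.51cto.com/pizibaidu/2174297
Official reference documentation:
https://www.cloudera.com/documentation/enterprise/6/6.0/topics/installation.html
I. Environment preparation
You need hosts with enough memory and disk. If a single node has less than 10 GB of memory, this build is not recommended. The CDH cluster here has 4 nodes, each with 64 GB of memory and 1 TB of disk, running CentOS 7.4. You also need a fast network connection, otherwise you will find out what pain really means!
II. Package download locations
1. Cloudera Manager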
https://archive.cloudera.com/cm6/6.0.0/redhat7/yum/RPMS/x86_64/
cloudera-manager-agent-6.0.0-530873.el7.x86_64.rpm
cloudera-manager-daemons-6.0.0-530873.el7.x86_64.rpm
cloudera-manager-server-6.0.0-530873.el7.x86_64.rpm
cloudera-manager-server-db-2-6.0.0-530873.el7.x86_64.rpm
oracle-j2sdk1.8-1.8.0+update141-1.x86_64.rpm
2. CDH parcel package
https://archive.cloudera.com/cdh6/6.0.0/parcels/
CDH-6.0.0-1.cdh6.0.0.p0.537114-el7.parcel
CDH-6.0.0-1.cdh6.0.0.p0.537114-el7.parcel.sha256
manifest.json
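III. Preparation
1. Create the virtual machines and install the operating system
For this build I used a single physical host with 256 GB of memory and 8 TB of attached disk. The plan is to create 4 virtual machines with the vSphere Client tool.
2. Virtual machine node layout
Hostname | IP | Installed components |
cdh1 | 172.16.x.x | Cloudera Manager Server, Cloudera Manager Agent, time synchronization service, JDK |
cdh2 | 172.16.x.x | Cloudera Manager Agent, time synchronization service, JDK |
cdh3 | 172.16.x.x | Cloudera Manager Agent, time synchronization service, JDK |
cdh4 | 172.16.x.x | Cloudera Manager Agent, time synchronization service, JDK |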
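3. Configure a static IP address
vi /etc/sysconfig/network-scripts/ifcfg-ens160
Note: the network interface name varies with the operating system version, so adjust the file name to match. The final configuration is as follows: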
TYPE=Ethernet
BOOTPROTO=static
DEFROUTE=yes
PEERDNS=yes
PEERROUTES=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=ens160
UUID=909980b6-7394-4cc1-bd97-bc2e16acb43b
DEVICE=ens160
ONBOOT=yes
# Add the following settings (VMware network configuration)
# Virtual machine IP
IPADDR=172.16.0.***
# Subnet mask
NETMASK=255.255.255.0
# Gateway
GATEWAY=172.16.0.254
# DNS server
DNS1=114.114.114.114
NM_CONTROLLED=yes
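Once the configuration is done, restart the network service:
a. Restart the service
systemctl restart network.service
b. Or reboot the host
Run ifconfig to check that the change took effect.
Tests:
a. ping the VM's own IP
b. ping the gateway
c. ping the physical host
d. ping an external address
If the external address answers, the IP configuration is working.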
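4. Set the hostname (all nodes)
Connect to the virtual machines with Xshell.
Temporary change:
#hostname cdh1
Permanent change:
#hostnamectl set-hostname cdh1
Verify: #hostname
5. Map IP addresses to hostnames (all nodes)
Add the following to the /etc/hosts file:
#vi /etc/hosts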
172.16.x.x cdh1
172.16.x.x cdh2
172.16.x.x cdh3
172.16.x.x cdh4
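As a rough sketch of how this mapping step could be made repeatable on four nodes, the helper below appends an entry only when the hostname is not mapped yet, so running it twice does not duplicate lines. The name add_host_entry is my own invention, not part of any tool, and the usage comment reads the addresses from hypothetical variables because the real IPs are elided in this post.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME
# Appends "IP NAME" to FILE unless NAME is already mapped there.
# Hypothetical helper for illustration only.
add_host_entry() {
    _file=$1 _ip=$2 _name=$3
    # Look for NAME in the hostname columns of non-comment lines.
    if awk -v n="$_name" '
            $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$_file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$_ip" "$_name" >> "$_file"
}

# Usage (as root, on every node; CDH1_IP etc. are placeholders you set yourself):
#   add_host_entry /etc/hosts "$CDH1_IP" cdh1
#   add_host_entry /etc/hosts "$CDH2_IP" cdh2
```

The check compares whole fields, so an existing cdh1 entry does not block cdh10.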
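6. Disable SELinux and the firewall (all nodes)
Disable SELinux:
Temporarily:
#setenforce 0
Permanently (takes effect after a reboot):
#Edit /etc/selinux/config and set SELINUX=disabled
Check the SELinux status:
#/usr/sbin/sestatus -v or getenforce
Stop the firewall:
#systemctl stop firewalld
#systemctl disable firewalld
Check the firewall status:
#firewall-cmd --state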
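If you would rather not edit the file by hand on every node, the permanent SELinux change might be scripted roughly as follows. set_selinux_disabled is a hypothetical helper name; the sketch assumes the usual SELINUX=... line already exists in the file and rewrites only that line.

```shell
#!/bin/sh
# set_selinux_disabled [FILE]
# Rewrites the active SELINUX= line to SELINUX=disabled.
# SELINUXTYPE= and comment lines are left untouched.
set_selinux_disabled() {
    _conf=${1:-/etc/selinux/config}
    _scratch=$(mktemp) || return 1
    # Write through a scratch file, then copy back so the original
    # file keeps its inode and permissions.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$_conf" > "$_scratch" &&
        cat "$_scratch" > "$_conf"
    _rc=$?
    rm -f "$_scratch"
    return "$_rc"
}

# Usage (as root): set_selinux_disabled
# The change still needs a reboot; setenforce 0 covers the running system.
```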
7. Passwordless SSH
On all nodes:
#cd
#mkdir -p .ssh
#ssh-keygen -t rsa (just press Enter at each prompt)
#ssh-keygen -t dsa (just press Enter at each prompt)
On the cdh1 node:
#cd
#cat .ssh/id_rsa.pub >>.ssh/authorized_keys
#cat .ssh/id_dsa.pub >>.ssh/authorized_keys
#ssh cdh2 cat .ssh/id_rsa.pub >>.ssh/authorized_keys
#ssh cdh2 cat .ssh/id_dsa.pub >>.ssh/authorized_keys
#ssh cdh3 cat .ssh/id_rsa.pub >>.ssh/authorized_keys
#ssh cdh3 cat .ssh/id_dsa.pub >>.ssh/authorized_keys
#ssh cdh4 cat .ssh/id_rsa.pub >>.ssh/authorized_keys
#ssh cdh4 cat .ssh/id_dsa.pub >>.ssh/authorized_keys
#scp .ssh/authorized_keys cdh2:~/.ssh
#scp .ssh/authorized_keys cdh3:~/.ssh
#scp .ssh/authorized_keys cdh4:~/.ssh
Verify the trust relationship:
Run the following on every node:
#ssh cdh1 date
#ssh cdh2 date
#ssh cdh3 date
#ssh cdh4 date
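The four checks above can be wrapped in a loop. This is only a sketch: check_ssh_trust and the SSH_CMD override are names I made up for illustration, and BatchMode is used so that a host without working key authentication fails immediately instead of asking for a password.

```shell
#!/bin/sh
# check_ssh_trust HOST...
# Runs "date" on each host over ssh and reports which hosts accept
# key-based login. Returns 1 if any host fails.
# SSH_CMD is a hypothetical hook that lets you swap the ssh command.
check_ssh_trust() {
    _failed=0
    for _host in "$@"; do
        if ${SSH_CMD:-ssh -o BatchMode=yes -o ConnectTimeout=5} "$_host" date >/dev/null 2>&1; then
            printf '%s: ok\n' "$_host"
        else
            printf '%s: FAILED\n' "$_host"
            _failed=1
        fi
    done
    return "$_failed"
}

# Usage (on every node): check_ssh_trust cdh1 cdh2 cdh3 cdh4
```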
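8. Configure time synchronization
Configure the NTP service on all nodes:
All hosts in the cluster must keep their clocks in sync; large differences cause all kinds of problems. The approach is:
The master node acts as the NTP server and syncs with an external time source, then serves time to all the datanode nodes.
All datanode nodes sync their time from the master node.
Install the required package on all nodes: yum install ntp
Start the service: systemctl start ntpd
Enable it at boot: systemctl enable ntpd
Configure the NTP server: cdh1 (172.16.0.111)
Edit /etc/ntp.conf and add: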
restrict 172.16.0.0 mask 255.255.255.0
restrict 172.16.0.111 nomodify notrap noquery
server 127.127.1.0
fudge 127.127.1.0 stratum 1
server 172.16.0.111 prefer
NTP client configuration: cdh2, cdh3, cdh4
Edit /etc/ntp.conf and add:
restrict 172.16.0.111 nomodify notrap noquery
server 172.16.0.111
fudge 127.127.1.0 stratum 1
Before relying on the server, sync the clock once by hand with ntpdate: ntpdate -u cdh1 (the master node's NTP server).
Run on the NTP clients:
#ntpdate -u cdh1
9. Prepare the parcels used to install CDH6
Put the CDH6 parcel files in the /opt/cloudera/parcel-repo/ directory on the master node; create the directory if it does not exist.
CDH-6.0.0-1.cdh6.0.0.p0.537114-el7.parcel
CDH-6.0.0-1.cdh6.0.0.p0.537114-el7.parcel.sha256
manifest.json
Note: as the last step, rename CDH-6.0.0-1.cdh6.0.0.p0.537114-el7.parcel.sha256 to CDH-6.0.0-1.cdh6.0.0.p0.537114-el7.parcel.sha. Do not skip this, otherwise the system will download the CDH-6.0.0-1.cdh6.0.0.p0.537114-el7.parcel file again.
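The rename is easy to forget, so here is a minimal sketch of it. rename_parcel_checksums is a hypothetical helper name; it simply strips the trailing "256" from every *.parcel.sha256 file in the repository directory.

```shell
#!/bin/sh
# rename_parcel_checksums [DIR]
# Renames every *.parcel.sha256 file in DIR to *.parcel.sha.
rename_parcel_checksums() {
    _repo=${1:-/opt/cloudera/parcel-repo}
    for _f in "$_repo"/*.parcel.sha256; do
        # When nothing matches, the glob stays literal; skip it.
        [ -e "$_f" ] || continue
        # ${_f%256} drops the trailing "256", leaving ...parcel.sha
        mv -- "$_f" "${_f%256}"
    done
}

# Usage (on the master node): rename_parcel_checksums
```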
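IV. Installation
Tip: before installing, take a snapshot of the base environment so you can roll back if the installation fails.
1. Install the repo
#yum -y install wget (run this if the VM does not have wget)
#wget https://archive.cloudera.com/cm6/6.0.0/redhat7/yum/cloudera-manager.repo -P /etc/yum.repos.d/
2. Import the GPG key
#rpm --import https://archive.cloudera.com/cm6/6.0.0/redhat7/yum/RPM-GPG-KEY-cloudera
3. Install JDK
#yum install oracle-j2sdk1.8
4. Install CM with yum
#yum install cloudera-manager-server (this is the only package needed at this stage)
The CM packages come to about 1 GB, so make sure your connection is fast enough.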
5. Installing the MySQL Server
#wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
#rpm -ivh mysql-community-release-el7-5.noarch.rpm
#yum update
#yum install mysql-server
#systemctl start mysqld
#systemctl enable mysqld
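Cloudera Manager requires the InnoDB engine; with MyISAM the service will not start.
Tip: take a snapshot before initializing MySQL.
Initialize MySQL
#/usr/bin/mysql_secure_installation
After running it, answer the prompts as follows: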
[…]
Enter current password for root (enter for none):
OK, successfully used password, moving on…
[…]
Set root password? [Y/n] Y
New password:
Re-enter new password:
Remove anonymous users? [Y/n] Y
[…]
Disallow root login remotely? [Y/n] N
[…]
Remove test database and access to it [Y/n] Y
[…]
Reload privilege tables now? [Y/n] Y
All done!
Edit the /etc/my.cnf file
[mysqld]
character-set-server=utf8
lower_case_table_names=1
max_connections = 550
log_bin=/var/lib/mysql/mysql_binary_log
binlog_format = mixed
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
After reinstalling, I used the officially recommended configuration:
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
transaction-isolation = READ-COMMITTED
#Disabling symbolic-links is recommended to prevent assorted security risks;
#to do so, uncomment this line:
symbolic-links = 0
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M
#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
#system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log
#In later versions of MySQL, if you enable the binary log and do not set
#a server_id, MySQL will not start. The server_id must be unique within
#the replicating group.
server_id=1
binlog_format = mixed
read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
#InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
sql_mode=STRICT_ALL_TABLES
6. Installing the MySQL JDBC Driver
#wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.46.tar.gz
#tar zxvf mysql-connector-java-5.1.46.tar.gz
#mkdir -p /usr/share/java/
#cd mysql-connector-java-5.1.46
#cp mysql-connector-java-5.1.46-bin.jar /usr/share/java/mysql-connector-java.jar
7. Creating Databases for Cloudera Software
The databases to create are:
scm, amon, rman, hue, metastore, sentry, nav, navms, oozie
Run the following at the mysql command line, replacing <password> with your own password:
CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY '<password>';
CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY '<password>';
CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY '<password>';
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY '<password>';
CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON metastore.* TO 'metastore'@'%' IDENTIFIED BY '<password>';
CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON sentry.* TO 'sentry'@'%' IDENTIFIED BY '<password>';
CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON nav.* TO 'nav'@'%' IDENTIFIED BY '<password>';
CREATE DATABASE navms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON navms.* TO 'navms'@'%' IDENTIFIED BY '<password>';
CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY '<password>';
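Since the nine statement pairs differ only in the database name, they can be generated rather than typed. The sketch below prints the SQL for one shared password; print_cdh_db_sql is a hypothetical helper name, and it assumes, like the statements above, that each database user has the same name as its database.

```shell
#!/bin/sh
# print_cdh_db_sql PASSWORD
# Prints a CREATE DATABASE and a GRANT statement for each Cloudera database.
# The password is inserted verbatim, so it must not contain a single quote.
print_cdh_db_sql() {
    _pw=$1
    for _db in scm amon rman hue metastore sentry nav navms oozie; do
        printf 'CREATE DATABASE %s DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;\n' "$_db"
        # %% prints a literal % for the "any host" wildcard.
        printf "GRANT ALL ON %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" "$_db" "$_db" "$_pw"
    done
}

# Usage: review the output first, then pipe it into the client, e.g.
#   print_cdh_db_sql 'your-password' | mysql -u root -p
```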
8. Set up the Cloudera Manager Database
Running #/opt/cloudera/cm/schema/scm_prepare_database.sh displays its parameters.
When it finishes, it generates the database configuration file:
#/etc/cloudera-scm-server/db.properties
9. Local installation, with CM and the database on the same virtual machine
#/opt/cloudera/cm/schema/scm_prepare_database.sh mysql scm scm
Enter SCM password: (the password set when the database was created)
10. Start the CM service
systemctl start cloudera-scm-server
11. Check the log
tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
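V. Installing the other services
1. Log in to the CM web UI
Open http://<host-ip>:7180/cmf/login to reach CM
Username: admin
Password: admin
Problem encountered: the service on port 7180 had not started
Resolution:
Checking the port showed that nothing was listening on it.
The CM service itself reported a normal start.
Right after I started the service, port 7180 was not up and I could not find the cause. I left it overnight, and when I checked the port the next day it was listening. Most likely the CM service has a lot to bring up and the host simply needed time.
Started successfully!
Note:
Checking ports on Linux: https://www.cnblogs.com/Archmage/p/7570716.html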
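Instead of checking the port by hand, one option is to wait until the server log reports that the web server is up. This is a sketch under assumptions: wait_for_cm_ready is a made-up helper, and the default pattern "Started Jetty server" is the readiness message as I remember it from Cloudera's installation guide, so verify it against your own log and pass a different pattern if needed.

```shell
#!/bin/sh
# wait_for_cm_ready [LOGFILE] [TIMEOUT_SECONDS] [PATTERN]
# Polls LOGFILE once per second until PATTERN appears.
# Returns 0 when found, 1 when the timeout expires.
# Note: a log left over from an earlier run may already contain the pattern.
wait_for_cm_ready() {
    _log=${1:-/var/log/cloudera-scm-server/cloudera-scm-server.log}
    _left=${2:-600}
    _pattern=${3:-Started Jetty server}   # assumed readiness message
    while :; do
        if grep -q -- "$_pattern" "$_log" 2>/dev/null; then
            return 0
        fi
        [ "$_left" -gt 0 ] || return 1
        sleep 1
        _left=$((_left - 1))
    done
}

# Usage:
#   systemctl start cloudera-scm-server
#   wait_for_cm_ready && echo "CM web UI should be reachable on port 7180"
```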
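2. Installation steps in detail
WELCOME
Accept License
Select Edition
Choose the free edition; it is sufficient.
Welcome (Add Cluster - Installation)
Specify Hosts
These are the hosts on which you planned to install the agent.
Select Repository
JDK installation options
Enter Login Credentials
Install Agents
This is where your network speed is really put to the test. The page refreshes itself with JavaScript, so never refresh it manually. If you do, the hosts that had already succeeded disappear from the installation list and only the unfinished ones are shown. Even if those then install successfully, CM cannot manage the hosts that had succeeded but vanished after the refresh, and when you create the cluster you can only select the hosts shown this time (cause unknown). If the network is too slow the installation fails, so wait patiently and do not perform unrelated operations.
Retry on failure until it succeeds. Once again: be patient.
After n failures, the installation finally succeeded!
Install Parcels
Installed successfully! It took nearly half a day...
Inspect Hosts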
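Troubleshooting:
The clock synchronization warning appeared because I had restored a snapshot once; syncing manually again fixed it.
Start the NTP server (it had not come up after the snapshot restore)
Sync each node
Swappiness setting:
Cloudera recommends setting /proc/sys/vm/swappiness to a maximum of 10. The current setting is 30. Use the sysctl command to change the setting at runtime and edit /etc/sysctl.conf to preserve it after a reboot. You can continue with the installation, but Cloudera Manager may report that your hosts are unhealthy because of swapping. The following hosts are affected:
View details
cdh[171-174]
Fix:
Temporary fix
Running echo 10 > /proc/sys/vm/swappiness resolves it for the current boot.
Permanent fix
sysctl -w vm.swappiness=10
echo vm.swappiness = 10 >> /etc/sysctl.conf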
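Appending with >> adds a duplicate line every time the fix is repeated, for example after restoring a snapshot. A possible idempotent variant is sketched below; persist_sysctl is a hypothetical helper that replaces an existing setting or appends a new one.

```shell
#!/bin/sh
# persist_sysctl FILE KEY VALUE
# Sets "KEY = VALUE" in FILE, replacing an existing line for KEY if present.
# The dots in KEY are treated as regex wildcards, which is harmless here.
persist_sysctl() {
    _file=$1 _key=$2 _val=$3
    if grep -q "^[[:space:]]*$_key[[:space:]]*=" "$_file" 2>/dev/null; then
        _scratch=$(mktemp) || return 1
        sed "s|^[[:space:]]*$_key[[:space:]]*=.*|$_key = $_val|" "$_file" > "$_scratch" &&
            cat "$_scratch" > "$_file"
        _rc=$?
        rm -f "$_scratch"
        return "$_rc"
    fi
    printf '%s = %s\n' "$_key" "$_val" >> "$_file"
}

# Usage (as root, on every node):
#   sysctl -w vm.swappiness=10
#   persist_sysctl /etc/sysctl.conf vm.swappiness 10
```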
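Transparent huge pages:
Transparent huge page compaction is enabled and can cause significant performance problems. Run "echo never > /sys/kernel/mm/transparent_hugepage/defrag" and "echo never > /sys/kernel/mm/transparent_hugepage/enabled" to disable it, then add the same commands to an init script such as /etc/rc.local so that the setting is applied when the system restarts. The following hosts are affected:
View details
cdh[171-174]
Fix:
Disable transparent huge pages (on CentOS 7, /etc/rc.d/rc.local must be made executable with chmod +x before commands added there run at boot)
echo never>/sys/kernel/mm/transparent_hugepage/defrag
echo never>/sys/kernel/mm/transparent_hugepage/enabled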
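To confirm the change took effect, note that the kernel marks the active value in these files with square brackets, for example "always madvise [never]". The small sketch below extracts that value; thp_mode is a hypothetical helper name.

```shell
#!/bin/sh
# thp_mode FILE
# Prints the active setting, i.e. the word enclosed in [brackets].
thp_mode() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

# Usage: both lines should report "never" after the fix.
#   for f in enabled defrag; do
#       echo "$f: $(thp_mode /sys/kernel/mm/transparent_hugepage/$f)"
#   done
```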
Upgrade the software dependency version
Starting with CDH 6, PostgreSQL-backed Hue requires the Psycopg2 version to be at least 2.5.4, see the documentation for more information. This warning can be ignored if hosts will not run CDH 6, or will not run Hue with PostgreSQL. The following hosts have an incompatible Psycopg2 version of '2.5.1':
View details
cdh[171-174]
Ignored for this installation.
All issues handled.
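VI. Installing the big data components
Take a snapshot before the actual installation.
Select Services
Customize role assignments
Database setup
I spent a whole night testing this step...
Review changes (you can change the directories yourself; I used the defaults)
Command details
Summary
The CDH management UI, all done:
On first login many services were shown in red; on inspection, the agents had lost their connection to CM.
#ntpstat (all nodes)
Only the CM node was still running it; on the rest it had stopped for no apparent reason.
#systemctl start ntpd (on the stopped nodes)
#ntpdate -u cdh1 (the NTP server)
On the nodes that lost the connection, run:
service cloudera-scm-agent restart (on the stopped nodes)
After the restart everything starts normally without errors.
Note:
service cloudera-scm-agent status (check the agent status)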