Hadoop High-Availability Cluster Setup
Tags: big data, hadoop, hdfs, mapreduce, zookeeper
1. HDFS-HA Cluster Configuration
1.1 Configuring the HDFS-HA Cluster
1. Official site: http://hadoop.apache.org/
2. HDFS-HA cluster plan. Before starting, a fully distributed Hadoop cluster (the NameNode may be left unformatted) and a fully distributed ZooKeeper ensemble must already be installed.
hadoop102          hadoop103          hadoop104
NameNode           NameNode
ResourceManager    ResourceManager
ZKFC               ZKFC
DataNode           DataNode           DataNode
JournalNode        JournalNode        JournalNode
ZooKeeper          ZooKeeper          ZooKeeper
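Once the base cluster and ZooKeeper are running, the plan above can be spot-checked with `jps` on each host. Below is a hypothetical helper sketch; the host names and per-host daemon sets are just this tutorial's plan, and ZooKeeper shows up in `jps` as QuorumPeerMain:

```shell
# Expected daemons per host, mirroring the plan table above (assumption:
# this tutorial's three-node layout; ZooKeeper's jps name is QuorumPeerMain).
expected_daemons() {
  case "$1" in
    hadoop102|hadoop103) echo "NameNode ZKFC ResourceManager DataNode JournalNode QuorumPeerMain" ;;
    hadoop104)           echo "DataNode JournalNode QuorumPeerMain" ;;
  esac
}

# check_host HOST "JPS_OUTPUT" -> prints OK, or lists the daemons not found
check_host() {
  missing=""
  for d in $(expected_daemons "$1"); do
    echo "$2" | grep -qw "$d" || missing="$missing $d"
  done
  if [ -z "$missing" ]; then echo "OK"; else echo "missing:$missing"; fi
}

# On a live cluster:  check_host hadoop104 "$(ssh hadoop104 jps)"
```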
3. On hadoop102, configure core-site.xml:

<!-- Point the default filesystem at the nameservice -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<!-- Directory for files Hadoop generates at runtime -->
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/install/hadoop/data/tmp</value>
</property>

4. On hadoop102, configure hdfs-site.xml:

<!-- Logical name of the nameservice -->
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<!-- The NameNodes in the cluster: nn1 and nn2 -->
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<!-- RPC address of nn1 -->
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>hadoop102:9000</value>
</property>
<!-- RPC address of nn2 -->
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>hadoop103:9000</value>
</property>
<!-- HTTP address of nn1 -->
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>hadoop102:50070</value>
</property>
<!-- HTTP address of nn2 -->
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>hadoop103:50070</value>
</property>
<!-- Where NameNode metadata is stored on the JournalNodes -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop102:8485;hadoop103:8485;hadoop104:8485/mycluster</value>
</property>
<!-- Fencing methods, so that only one NameNode serves clients at a time; one method per line -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
</property>
<!-- sshfence requires passwordless SSH -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
<!-- JournalNode storage directory -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/opt/install/hadoop/data/jn</value>
</property>
<!-- Disable permission checking -->
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
<!-- Proxy provider the client uses to locate the Active NameNode, enabling automatic failover on the client side -->
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

5. Copy the finished configuration to the other nodes:
scp core-site.xml [email protected]:$PWD
scp hdfs-site.xml [email protected]:$PWD

1.2 Starting the HDFS-HA Cluster
1. On each JournalNode node, start the journalnode service:
$HADOOP_HOME/sbin/hadoop-daemon.sh start journalnode
2. On [nn1], format the NameNode and start it. (If it was formatted before, reformatting here makes the clusterID in the NameNode's VERSION file differ from the DataNodes', and the DataNodes will fail to start. Fix: change the DataNodes' clusterID to match the NameNode's.)
$HADOOP_HOME/bin/hdfs namenode -format
$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
3. On [nn2], sync nn1's metadata:
$HADOOP_HOME/bin/hdfs namenode -bootstrapStandby
4. Start [nn2]:
$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
5. Check the NameNode web UIs.
6. On [nn1], start all DataNodes:
$HADOOP_HOME/sbin/hadoop-daemons.sh start datanode
7. Transition [nn1] to Active:
$HADOOP_HOME/bin/hdfs haadmin -transitionToActive nn1
8. Check whether it is Active:
$HADOOP_HOME/bin/hdfs haadmin -getServiceState nn1
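The startup steps above can be collected into one script. A minimal sketch, assuming passwordless SSH from the admin machine, the host names from the plan, and $HADOOP_HOME set both locally and on the remote hosts; by default it only prints each command (set DRY_RUN=0 to actually execute):

```shell
# Print each command; execute it too only when DRY_RUN=0.
run() { echo "+ $*"; if [ "${DRY_RUN:-1}" = "0" ]; then eval "$*"; fi; }

start_hdfs_ha() {
  # (1) start journalnode on every JournalNode host
  for host in hadoop102 hadoop103 hadoop104; do
    run ssh "$host" '$HADOOP_HOME/sbin/hadoop-daemon.sh start journalnode'
  done
  # (2) format and start nn1 -- format only on first setup
  run ssh hadoop102 '$HADOOP_HOME/bin/hdfs namenode -format'
  run ssh hadoop102 '$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode'
  # (3)-(4) bootstrap and start nn2
  run ssh hadoop103 '$HADOOP_HOME/bin/hdfs namenode -bootstrapStandby'
  run ssh hadoop103 '$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode'
  # (6)-(8) start DataNodes, promote nn1, check its state
  run ssh hadoop102 '$HADOOP_HOME/sbin/hadoop-daemons.sh start datanode'
  run ssh hadoop102 '$HADOOP_HOME/bin/hdfs haadmin -transitionToActive nn1'
  run ssh hadoop102 '$HADOOP_HOME/bin/hdfs haadmin -getServiceState nn1'
}

start_hdfs_ha
```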
1.3 Configuring HDFS-HA Automatic Failover
1. Configuration
(1) Add to hdfs-site.xml:
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
(2) Add to core-site.xml:
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop102:2181,hadoop103:2181,hadoop104:2181</value>
</property>
2. Startup
(1) Stop all HDFS services:
stop-dfs.sh
(2) Start the ZooKeeper ensemble (on each ZooKeeper node):
zkServer.sh start
(3) Initialize the HA state in ZooKeeper:
hdfs zkfc -formatZK
(4) Start the HDFS services:
start-dfs.sh
(5) Start the DFSZKFailoverController on each NameNode host; whichever machine starts it first gets the Active NameNode. (start-dfs.sh already starts zkfc automatically; the command below starts it manually.)
hadoop-daemon.sh start zkfc
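The startup sequence above can likewise be scripted. A hedged sketch (host names are this tutorial's; commands are only printed unless DRY_RUN=0, since they require a live cluster):

```shell
# Print each command; execute it too only when DRY_RUN=0.
run() { echo "+ $*"; if [ "${DRY_RUN:-1}" = "0" ]; then eval "$*"; fi; }

enable_auto_failover() {
  run stop-dfs.sh                            # (1) stop all HDFS services
  for host in hadoop102 hadoop103 hadoop104; do
    run ssh "$host" zkServer.sh start        # (2) start the ZooKeeper ensemble
  done
  run hdfs zkfc -formatZK                    # (3) initialize HA state in ZooKeeper
  run start-dfs.sh                           # (4) start HDFS; zkfc comes up automatically
}

enable_auto_failover
```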
3. Verification
(1) Kill the Active NameNode process:
kill -9 <NameNode process id>
(2) Disconnect the Active NameNode's machine from the network:
service network stop
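After either test, automatic failover succeeded if the other NameNode now reports `active`. A small hypothetical helper for interpreting the two states returned by `hdfs haadmin -getServiceState` (the nn1/nn2 ids come from the configuration above):

```shell
# pick_active STATE_OF_NN1 STATE_OF_NN2 -> prints which NameNode is Active
pick_active() {
  [ "$1" = "active" ] && { echo nn1; return; }
  [ "$2" = "active" ] && { echo nn2; return; }
  echo none
}

# On a live cluster (assumes $HADOOP_HOME is set):
# pick_active "$($HADOOP_HOME/bin/hdfs haadmin -getServiceState nn1)" \
#             "$($HADOOP_HOME/bin/hdfs haadmin -getServiceState nn2)"
```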
2. YARN-HA Configuration
2.1 Configuring the YARN-HA Cluster
1. Cluster plan
hadoop102          hadoop103
ResourceManager    ResourceManager
2. Configuration
(1) yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- Enable ResourceManager HA -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- Declare the addresses of the two ResourceManagers -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>cluster-yarn1</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>hadoop102</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>hadoop103</value>
</property>
<!-- Address of the ZooKeeper ensemble -->
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>hadoop102:2181,hadoop103:2181,hadoop104:2181</value>
</property>
<!-- Enable automatic recovery -->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<!-- Store ResourceManager state in the ZooKeeper cluster -->
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
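With yarn-site.xml in place on both ResourceManager hosts, the two ResourceManagers can be brought up and checked. A hedged sketch using the Hadoop 2.x-style scripts matching the hadoop-daemon.sh commands above (rm1/rm2 ids and host names come from the configuration; commands are only printed unless DRY_RUN=0):

```shell
# Print each command; execute it too only when DRY_RUN=0.
run() { echo "+ $*"; if [ "${DRY_RUN:-1}" = "0" ]; then eval "$*"; fi; }

start_yarn_ha() {
  run ssh hadoop102 start-yarn.sh                          # starts rm1 and the NodeManagers
  run ssh hadoop103 yarn-daemon.sh start resourcemanager   # rm2 is not started by start-yarn.sh
  run yarn rmadmin -getServiceState rm1                    # one of rm1/rm2 should report active
  run yarn rmadmin -getServiceState rm2
}

start_yarn_ha
```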