Setting up a multi-node Hadoop 2.8 cluster on CentOS 7
1. Cluster IPs
192.168.2.218 hadoop-slave-1
192.168.2.4 hadoop-master
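The hostnames above are assumed to resolve on every node; a minimal sketch of the matching /etc/hosts entries (identical on master and slave, added as root):

```
192.168.2.218   hadoop-slave-1
192.168.2.4     hadoop-master
```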
2. Java: use the bundled OpenJDK 1.7.0.
For which Java versions pair with which Hadoop versions, see the official Hadoop wiki:
https://wiki.apache.org/hadoop/HadoopJavaVersions
3. Configure /etc/profile
JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk
PATH=$JAVA_HOME/bin:$PATH
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JAVA_HOME PATH CLASSPATH
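A quick way to check the exports before touching the system file is to stage them in a temp file and source it; this is only a sketch (on the real machine you append the lines to /etc/profile as root and run `source /etc/profile`), and the JAVA_HOME path is the one from the text:

```shell
# Stage the step-3 exports in a temp file, source it, and verify JAVA_HOME.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk
PATH=$JAVA_HOME/bin:$PATH
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JAVA_HOME PATH CLASSPATH
EOF
. "$tmp"
echo "$JAVA_HOME"
```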
4. Create a hadoop user and set its password
useradd -m hadoop
passwd hadoop
5. Configure passwordless SSH login
Switch to the hadoop user and run:
ssh-keygen -t rsa
pressing Enter through all the prompts.
A .ssh/ directory appears under hadoop's home directory (/home/hadoop), containing files such as:
config id_rsa
id_rsa.pub known_hosts
Then run:
cat id_rsa.pub >> authorized_keys
chmod 644 authorized_keys
Finally, copy .ssh/ to the data-node home directories on the slaves (hadoop-slave-1, hadoop-slave-2), and verify that passwordless login works.
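The key setup above can be sketched as follows; this demo runs in a throwaway directory so it is safe to try anywhere (on the real cluster the files live in /home/hadoop/.ssh), and `ssh-copy-id` is the usual alternative to copying the whole .ssh directory, since it pushes only the public key:

```shell
# Step-5 sketch in an isolated directory: generate a key pair without a
# passphrase, authorize it, and restrict permissions.
demo=$(mktemp -d)
ssh-keygen -q -t rsa -N '' -f "$demo/id_rsa"      # no passphrase, no prompts
cat "$demo/id_rsa.pub" >> "$demo/authorized_keys"
chmod 600 "$demo/authorized_keys"
# On the real cluster (network access assumed):
#   ssh-copy-id hadoop@hadoop-slave-1
#   ssh hadoop@hadoop-slave-1 hostname   # should not prompt for a password
ls "$demo"
```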
6. Download hadoop-2.8.0 into the home directory
7. Extract it in place: tar -zxvf hadoop-2.8.0.tar.gz
8. Edit the configuration files:
9. Configure core-site.xml as follows:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.2.4:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hadoop/tmp</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <!-- dfs.permissions is an HDFS property; it is still picked up here because
       every daemon loads core-site.xml, but it normally lives in hdfs-site.xml -->
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
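A malformed config file makes the daemons fail at startup with an unhelpful stack trace, so it is worth sanity-checking the XML before deploying. A sketch (python3 assumed available; the heredoc here holds only a trimmed-down example, not the full file above):

```shell
# Write a config file via heredoc, then verify it is well-formed XML.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.2.4:9000</value>
  </property>
</configuration>
EOF
python3 -c 'import xml.etree.ElementTree as ET, sys; ET.parse(sys.argv[1]); print("well-formed")' "$conf"
```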
10. Configure hdfs-site.xml as follows:
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop/tmp</value>
  </property>
  <property>
    <!-- the datanodes have multiple disks; list each disk's path here -->
    <name>dfs.datanode.data.dir</name>
    <value>file:/data0,file:/data1,file:/data2,file:/data3,file:/data4,file:/data5,file:/data6,file:/data7,file:/data8,file:/data9,file:/data10,file:/data11,file:/data12,file:/data13,file:/data14,file:/data15,file:/data16,file:/data17,file:/data18,file:/data19,file:/data20,file:/data21,file:/data22,file:/data23,file:/data24,file:/data25,file:/data26,file:/data27,file:/data28,file:/data29,file:/data30,file:/data31,file:/data32,file:/data33,file:/data34,file:/data35</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>192.168.2.4:9001</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
    <value>false</value>
  </property>
</configuration>
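The 36-disk value for dfs.datanode.data.dir above is easier to generate than to type by hand; a small sketch:

```shell
# Build the comma-separated file:/data0 .. file:/data35 list for
# dfs.datanode.data.dir.
dirs=""
for i in $(seq 0 35); do
  dirs="${dirs:+$dirs,}file:/data$i"   # append with a comma after the first entry
done
echo "$dirs"
```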
11. Configure the slaves file as follows:
[[email protected] hadoop]$ cat slaves
192.168.2.218
12. Configure mapred-site.xml
<configuration>
  <!-- mapred.job.tracker is a Hadoop 1.x (JobTracker) property and is ignored
       when MapReduce runs on YARN, as it does in Hadoop 2.8 -->
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.2.4:9001</value>
  </property>
</configuration>
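Since step 16 checks the ResourceManager UI on port 8088, the cluster is evidently meant to run MapReduce on YARN; a sketch of the Hadoop-2-style settings that would normally be used instead (hostname taken from step 1, values are standard Hadoop 2 properties):

```xml
<!-- mapred-site.xml -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

<!-- yarn-site.xml -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>hadoop-master</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
```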
13. Configure hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk
export HADOOP_CONF_DIR=/home/hadoop/hadoop-2.8.0/etc/hadoop
export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true -Dsun.security.krb5.debug=true -Dsun.security.spnego.debug"
14. Format the NameNode
bin/hdfs namenode -format
(the older bin/hadoop namenode -format also works but is deprecated in 2.x)
15. Start the cluster:
./sbin/start-all.sh
To stop the cluster, use ./sbin/stop-all.sh
16. Check cluster status in a browser
http://192.168.2.4:8088 (YARN ResourceManager UI)
http://192.168.2.4:50070 (HDFS NameNode UI)
17. Test by copying a file into HDFS, for example:
bin/hdfs dfs -mkdir -p /user/hadoop
bin/hdfs dfs -put etc/hadoop/core-site.xml /user/hadoop/
bin/hdfs dfs -ls /user/hadoop