Installing pseudo-distributed Hadoop on an Alibaba Cloud server
CentOS 7.3, 64-bit
jdk 1.8.0_40
hadoop 2.6.5
1. Install the JDK on the server
Download jdk-8u40-linux-x64.gz, extract it, and then configure the Java environment variables:
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_40
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=$JAVA_HOME/bin:$PATH
Add these lines to /etc/profile and reload it: source /etc/profile
If java -version prints the Java version, the installation succeeded.
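A quick sanity check for the JDK step, as a minimal sketch. The helper name check_java_home is hypothetical; it only tests that the given directory contains an executable bin/java, which is what JAVA_HOME is expected to point at.

```shell
# Hypothetical helper: succeed only if "$1" looks like a JDK home,
# i.e. it is non-empty and contains an executable bin/java.
check_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Example use after `source /etc/profile`:
#   check_java_home "$JAVA_HOME" && "$JAVA_HOME/bin/java" -version
```

If the check fails, the usual suspects are a wrong extraction directory or a JAVA_HOME value that does not match it.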
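2. Install Hadoop on the server. Download and extract Hadoop, then configure the Hadoop environment variables (/opt/hadoop is the directory Hadoop was extracted to):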
export HADOOP_HOME=/opt/hadoop
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=${HADOOP_HOME}/lib:${HADOOP_HOME}/lib/native"
export PATH=$PATH:$HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Add these lines to /etc/profile and reload it: source /etc/profile
If hadoop version prints the Hadoop version, the installation succeeded.
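Because a misspelled variable name in a PATH export expands to an empty string without any error, it can help to confirm that the Hadoop directories really ended up on PATH. A minimal sketch; path_has is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if directory "$1" is one of the
# colon-separated entries of $PATH (exact match, not a prefix match).
path_has() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use after `source /etc/profile`:
#   path_has "$HADOOP_HOME/bin"  || echo "hadoop bin is not on PATH"
#   path_has "$HADOOP_HOME/sbin" || echo "hadoop sbin is not on PATH"
```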
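3. Configure Hadoop
If you want HDFS to be reachable from outside the server, do not use localhost in the configuration; use the server's internal IP instead. A setup that uses localhost will still start successfully, but it cannot be accessed from the external network, and a Java program written on Windows will not be able to reach HDFS either.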
Edit core-site.xml (xxx.xxx.xxx.xxx is your server's internal IP):
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://xxx.xxx.xxx.xxx:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/hadoop/tmp</value>
  </property>
</configuration>
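To avoid editing the placeholder by hand, the same file can be generated from the IP. This is only a sketch: render_core_site is a hypothetical helper, and 10.0.0.5 in the example is a made-up internal address.

```shell
# Hypothetical helper: print a core-site.xml whose fs.defaultFS uses the
# address given as "$1". The other values are the ones used in this article.
render_core_site() {
    cat <<EOF
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://$1:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/hadoop/tmp</value>
  </property>
</configuration>
EOF
}

# Example (10.0.0.5 is a made-up internal IP):
#   render_core_site 10.0.0.5 > "$HADOOP_HOME/etc/hadoop/core-site.xml"
```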
Edit hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/hadoop/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/hadoop/hdfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>xxx.xxx.xxx.xxx:9001</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
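The configuration files point at three local directories under /hadoop. Creating them up front is an assumption of this sketch rather than a step the article requires; it mainly makes permission problems show up before the daemons start. make_hadoop_dirs is a hypothetical helper name.

```shell
# Hypothetical helper: create the storage directories referenced by
# hadoop.tmp.dir, dfs.namenode.name.dir and dfs.datanode.data.dir
# under the root "$1" ("/hadoop" in this article).
make_hadoop_dirs() {
    mkdir -p "$1/tmp" "$1/hdfs/name" "$1/hdfs/data"
}

# Example: make_hadoop_dirs /hadoop
```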
Edit mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>xxx.xxx.xxx.xxx:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>xxx.xxx.xxx.xxx:19888</value>
  </property>
</configuration>
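Hadoop 2.x distributions normally ship only mapred-site.xml.template in etc/hadoop, so mapred-site.xml may need to be created before it can be edited. A minimal sketch, assuming that layout; ensure_mapred_site is a hypothetical helper name.

```shell
# Hypothetical helper: create mapred-site.xml in config dir "$1" from the
# bundled template, but leave an existing mapred-site.xml untouched.
ensure_mapred_site() {
    if [ ! -f "$1/mapred-site.xml" ]; then
        cp "$1/mapred-site.xml.template" "$1/mapred-site.xml"
    fi
}

# Example: ensure_mapred_site "$HADOOP_HOME/etc/hadoop"
```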
Edit yarn-site.xml:
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>xxx.xxx.xxx.xxx:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>xxx.xxx.xxx.xxx:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>xxx.xxx.xxx.xxx:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>xxx.xxx.xxx.xxx:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>xxx.xxx.xxx.xxx:8088</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>6078</value>
  </property>
</configuration>
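The placeholder xxx.xxx.xxx.xxx appears many times across the XML files, so one way to fill it in consistently is a single substitution pass. This is a sketch under assumptions: fill_ip is a hypothetical helper, it expects the files to still contain the literal placeholder, and the IP in the example is made up.

```shell
# Hypothetical helper: in every *.xml file directly under "$1", replace the
# literal placeholder xxx.xxx.xxx.xxx with the address given as "$2".
fill_ip() {
    for f in "$1"/*.xml; do
        # Skip the unexpanded glob when the directory has no XML files.
        [ -f "$f" ] || continue
        sed "s/xxx\.xxx\.xxx\.xxx/$2/g" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
    done
}

# Example (10.0.0.5 is a made-up internal IP):
#   fill_ip "$HADOOP_HOME/etc/hadoop" 10.0.0.5
```

Writing to a temporary file and moving it into place avoids relying on the in-place option of any particular sed implementation.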
In the slaves file, delete localhost and add the internal IP xxx.xxx.xxx.xxx.
Add the following to hadoop-env.sh:
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_40
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=${HADOOP_HOME}/lib/native"
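The configuration is complete.
Format the NameNode: hadoop namenode -format
From the hadoop/sbin directory, start everything: sh start-all.sh
Check the running processes with jps: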
26277 DataNode
26696 NodeManager
26409 SecondaryNameNode
30109 Jps
26606 ResourceManager
26191 NameNode
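The five daemons listed above are what a working pseudo-distributed setup should show. A small sketch for checking this automatically; missing_daemons is a hypothetical helper that assumes jps prints one "PID Name" pair per line.

```shell
# Hypothetical helper: read `jps` output on stdin and print the name of each
# expected daemon that is absent. The second column is compared exactly, so
# SecondaryNameNode does not count as NameNode.
missing_daemons() {
    input=$(cat)
    for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        printf '%s\n' "$input" \
            | awk -v name="$d" '$2 == name { found = 1 } END { exit !found }' \
            || echo "$d"
    done
}

# Example: jps | missing_daemons
```

An empty result means all five daemons are running; any name that is printed points at the log file to look at first.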