
Big Data Overview: Hadoop Configuration

NSD ARCHITECTURE DAY05

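  1. Case 1: Installing Hadoop
  2. Case 2: Installing and Configuring Hadoop

1 Case 1: Installing Hadoop

1.1 Problem

This case installs Hadoop in standalone mode:

  • Install Hadoop in standalone mode
  • Install the Java environment
  • Set the environment variables, then start and run Hadoop

1.2 Steps

Follow the steps below to complete this case.

Step 1: Prepare the environment

1) Set the hostname to nn01 and the IP address to 192.168.1.21, and configure the yum repository (the system repository).

Note: these settings were covered in earlier cases and are not repeated here. Refer to those cases if needed.

2) Install the Java environment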

  [root@nn01 ~]# yum -y install java-1.8.0-openjdk-devel
  [root@nn01 ~]# java -version
  openjdk version "1.8.0_131"
  OpenJDK Runtime Environment (build 1.8.0_131-b12)
  OpenJDK 64-Bit Server VM (build 25.131-b12, mixed mode)
  [root@nn01 ~]# jps
  1235 Jps

3) Install Hadoop

  [root@nn01 ~]# tar -xf hadoop-2.7.6.tar.gz
  [root@nn01 ~]# mv hadoop-2.7.6 /usr/local/hadoop
  [root@nn01 ~]# cd /usr/local/hadoop/
  [root@nn01 hadoop]# ls
  bin  include  libexec      NOTICE.txt  sbin
  etc  lib      LICENSE.txt  README.txt  share
  [root@nn01 hadoop]# ./bin/hadoop    # fails: JAVA_HOME cannot be found
  Error: JAVA_HOME is not set and could not be found.
  [root@nn01 hadoop]#
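
The error above means the launcher could not determine where Java is installed. As a rough sketch (the function name `java_home_from_binary` is hypothetical, and the code assumes a `readlink` that supports `-f`, as on GNU/Linux), one way to find a candidate directory is to resolve the `java` symlink and strip the trailing `/bin/java`:

```shell
# Print a JAVA_HOME candidate for the given java binary.
# Assumption: the real binary lives at <home>/bin/java.
java_home_from_binary() (
    java_bin="$1"
    # Follow symlinks such as /usr/bin/java -> /etc/alternatives/java -> real path
    real="$(readlink -f "$java_bin")" || exit 1
    # Remove the trailing /bin/java component
    printf '%s\n' "${real%/bin/java}"
)
```

On nn01 this could be called as `java_home_from_binary "$(command -v java)"`, and the result compared with the value written to hadoop-env.sh in the next step.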

4) Resolve the error

  [root@nn01 hadoop]# rpm -ql java-1.8.0-openjdk    # list package files to locate the JRE path
  [root@nn01 hadoop]# cd ./etc/hadoop/
  [root@nn01 hadoop]# vim hadoop-env.sh    # edit lines 25 and 33
  export JAVA_HOME="/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-11.b12.el7.x86_64/jre"
  export HADOOP_CONF_DIR="/usr/local/hadoop/etc/hadoop"
  [root@nn01 hadoop]# cd /usr/local/hadoop/
  [root@nn01 hadoop]# ./bin/hadoop
  Usage: hadoop [--config confdir] [COMMAND | CLASSNAME]
    CLASSNAME            run the class named CLASSNAME
   or
    where COMMAND is one of:
    fs                   run a generic filesystem user client
    version              print the version
    jar <jar>            run a jar file
                         note: please use "yarn jar" to launch
                               YARN applications, not this command.
    checknative [-a|-h]  check native hadoop and compression libraries availability
    distcp <srcurl> <desturl> copy file or directories recursively
    archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
    classpath            prints the class path needed to get the
    credential           interact with credential providers
                         Hadoop jar and the required libraries
    daemonlog            get/set the log level for each daemon
    trace                view and modify Hadoop tracing settings
  Most commands print help when invoked w/o parameters.
  [root@nn01 hadoop]# mkdir /usr/local/hadoop/aa
  [root@nn01 hadoop]# ls
  bin etc include lib libexec LICENSE.txt NOTICE.txt aa README.txt sbin share
  [root@nn01 hadoop]# cp *.txt /usr/local/hadoop/aa
  # wordcount is the example program to run: it counts the words in directory aa
  # and writes the result to directory bb. bb must not exist beforehand; the job
  # fails if it does, which prevents existing data from being overwritten.
  [root@nn01 hadoop]# ./bin/hadoop jar \
      share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.6.jar wordcount aa bb
  [root@nn01 hadoop]# cat bb/part-r-00000    # view the result
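
To make the job's output easier to interpret, the following sketch approximates what the wordcount example computes, using ordinary shell tools. It is not how Hadoop implements the job; the function name `count_words` is made up for illustration, and it assumes words are separated by whitespace and that every input file ends with a newline.

```shell
# Count word occurrences across all files in a directory and print
# "word<TAB>count" lines sorted by word, the same shape as part-r-00000.
count_words() (
    dir="$1"
    cat "$dir"/* |
        tr -s '[:space:]' '\n' |   # one word per line
        grep -v '^$' |             # drop empty lines
        LC_ALL=C sort |            # group identical words (byte order)
        uniq -c |                  # prefix each word with its count
        awk '{ print $2 "\t" $1 }' # reorder to word<TAB>count
)
```

Running `count_words /usr/local/hadoop/aa` on a small input gives something to compare against `cat bb/part-r-00000`.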

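2 Case 2: Installing and Configuring Hadoop

2.1 Problem

This case requires the following:

  • Prepare three additional virtual machines and install Hadoop
  • Make sure all nodes can ping each other, and configure SSH trust
  • Verify the nodes

2.2 Plan

Four virtual machines are needed. One was prepared earlier, so only three new ones are required. Install Hadoop, make sure all nodes can ping each other, and configure SSH trust, as shown in Figure 1: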

Figure 1

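2.3 Steps

Follow the steps below to complete this case.

Step 1: Prepare the environment

1) Set the hostnames of the three machines to node1, node2, and node3, and configure their IP addresses (as shown in Figure 1) and the yum repository (the system repository).

2) Edit /etc/hosts (do the same on all four hosts; nn01 is shown as the example)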

  [root@nn01 ~]# vim /etc/hosts
  192.168.1.21 nn01
  192.168.1.22 node1
  192.168.1.23 node2
  192.168.1.24 node3
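
Because the same entries must exist on all four hosts, a quick check can catch a host that was missed. The sketch below is one possible approach; `missing_hosts` is a hypothetical helper name, and it only looks for the name as a whole field on a non-comment line.

```shell
# Print every expected hostname that does not appear in the given hosts file.
missing_hosts() (
    hosts_file="$1"
    shift
    for name in "$@"; do
        awk -v n="$name" '
            /^[ \t]*#/ { next }    # skip comment lines
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }    # exit 0 only if the name was seen
        ' "$hosts_file" || echo "$name"
    done
)
```

For example, `missing_hosts /etc/hosts nn01 node1 node2 node3` prints nothing when all four names are present.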

3) Install the Java environment on node1, node2, and node3 (node1 is shown as the example)

  [root@node1 ~]# yum -y install java-1.8.0-openjdk-devel

4) Set up SSH trust

  [root@nn01 ~]# vim /etc/ssh/ssh_config    # skip the yes/no host-key prompt on first login
  Host *
  GSSAPIAuthentication yes
  StrictHostKeyChecking no
  [root@nn01 .ssh]# ssh-keygen
  Generating public/private rsa key pair.
  Enter file in which to save the key (/root/.ssh/id_rsa):
  Enter passphrase (empty for no passphrase):
  Enter same passphrase again:
  Your identification has been saved in /root/.ssh/id_rsa.
  Your public key has been saved in /root/.ssh/id_rsa.pub.
  The key fingerprint is:
  SHA256:Ucl8OCezw92aArY5+zPtOrJ9ol1ojRE3EAZ1mgndYQM root@nn01
  The key's randomart image is:
  +---[RSA 2048]----+
  | o*E*=. |
  | +XB+. |
  | ..=Oo. |
  | o.+o... |
  | .S+.. o |
  | + .=o |
  | o+oo |
  | o+=.o |
  | o==O. |
  +----[SHA256]-----+
  [root@nn01 .ssh]# for i in 21 22 23 24 ; do ssh-copy-id 192.168.1.$i; done
  # deploy the public key to nn01, node1, node2, and node3
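
The interactive prompts above can be avoided when scripting. The sketch below only prints the commands by default so they can be reviewed first. `RUN`, `ensure_key`, and `distribute_key` are hypothetical names; the `root` account and the RSA key path are taken from the transcript above.

```shell
# RUN=echo previews commands; set RUN= (empty) to actually execute them.
RUN="${RUN-echo}"

# Generate an RSA key pair with an empty passphrase unless one already exists.
ensure_key() {
    [ -f "$HOME/.ssh/id_rsa" ] ||
        $RUN ssh-keygen -t rsa -b 2048 -N '' -f "$HOME/.ssh/id_rsa"
}

# Copy the public key to each host, given a network prefix and last octets.
distribute_key() (
    prefix="$1"
    shift
    for octet in "$@"; do
        $RUN ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "root@${prefix}.${octet}"
    done
)
```

Calling `distribute_key 192.168.1 21 22 23 24` in preview mode lists the four `ssh-copy-id` commands without contacting any host.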

5) Test the trust relationship

  [root@nn01 .ssh]# ssh node1
  Last login: Fri Sep 7 16:52:00 2018 from 192.168.1.21
  [root@node1 ~]# exit
  logout
  Connection to node1 closed.
  [root@nn01 .ssh]# ssh node2
  Last login: Fri Sep 7 16:52:05 2018 from 192.168.1.21
  [root@node2 ~]# exit
  logout
  Connection to node2 closed.
  [root@nn01 .ssh]# ssh node3

Step 2: Configure Hadoop

1) Edit the slaves file

  [root@nn01 ~]# cd /usr/local/hadoop/etc/hadoop
  [root@nn01 hadoop]# vim slaves
  node1
  node2
  node3

2) Edit core-site.xml, Hadoop's core configuration file

  [root@nn01 hadoop]# vim core-site.xml
  <configuration>
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://nn01:9000</value>
    </property>
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/var/hadoop</value>
    </property>
  </configuration>
  [root@nn01 hadoop]# mkdir /var/hadoop    # Hadoop's data root directory
  [root@nn01 hadoop]# ssh node1 mkdir /var/hadoop
  [root@nn01 hadoop]# ssh node2 mkdir /var/hadoop
  [root@nn01 hadoop]# ssh node3 mkdir /var/hadoop

3) Configure hdfs-site.xml

  [root@nn01 hadoop]# vim hdfs-site.xml
  <configuration>
    <property>
      <name>dfs.namenode.http-address</name>
      <value>nn01:50070</value>
    </property>
    <property>
      <name>dfs.namenode.secondary.http-address</name>
      <value>nn01:50090</value>
    </property>
    <property>
      <name>dfs.replication</name>
      <value>2</value>
    </property>
  </configuration>
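
A typo in a property name is silently ignored by Hadoop, so it can help to read the values back after editing. The following is a minimal sketch with a hypothetical helper `get_property`. It is not a real XML parser: it assumes each `<name>` and `<value>` element sits on its own line, with the name before its value, as in the files above.

```shell
# Print the value of a named property from a Hadoop *-site.xml file.
get_property() (
    file="$1"
    key="$2"
    awk -v key="$key" '
        /<name>/ {
            n = $0
            sub(/.*<name>/, "", n)      # strip everything up to the opening tag
            sub(/<\/name>.*/, "", n)    # strip the closing tag and the rest
        }
        /<value>/ {
            v = $0
            sub(/.*<value>/, "", v)
            sub(/<\/value>.*/, "", v)
            if (n == key) print v       # value belongs to the last name seen
        }
    ' "$file"
)
```

For instance, `get_property hdfs-site.xml dfs.replication` should print 2 for the file above, meaning each block is stored on two DataNodes. On a working installation, `./bin/hdfs getconf -confKey dfs.replication` reports the value Hadoop actually resolved.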

4) Sync the configuration to node1, node2, and node3

  [root@nn01 hadoop]# yum -y install rsync    # rsync must be installed on every host being synced
  [root@nn01 hadoop]# for i in 22 23 24 ; do rsync -aSH --delete /usr/local/hadoop/ \
      192.168.1.$i:/usr/local/hadoop/ -e 'ssh' & done
  [1] 23260
  [2] 23261
  [3] 23262
5) Verify that the sync succeeded

  [root@nn01 hadoop]# ssh node1 ls /usr/local/hadoop/
  bin
  etc
  include
  lib
  libexec
  LICENSE.txt
  NOTICE.txt
  bb
  README.txt
  sbin
  share
  aa
  [root@nn01 hadoop]# ssh node2 ls /usr/local/hadoop/
  bin
  etc
  include
  lib
  libexec
  LICENSE.txt
  NOTICE.txt
  bb
  README.txt
  sbin
  share
  aa
  [root@nn01 hadoop]# ssh node3 ls /usr/local/hadoop/
  bin
  etc
  include
  lib
  libexec
  LICENSE.txt
  NOTICE.txt
  bb
  README.txt
  sbin
  share
  aa

Step 3: Format the NameNode and start HDFS

  [root@nn01 hadoop]# cd /usr/local/hadoop/
  [root@nn01 hadoop]# ./bin/hdfs namenode -format    # format the NameNode
  [root@nn01 hadoop]# ./sbin/start-dfs.sh            # start HDFS
  [root@nn01 hadoop]# jps                            # verify the roles on nn01
  23408 NameNode
  23700 Jps
  23591 SecondaryNameNode
  [root@nn01 hadoop]# ./bin/hdfs dfsadmin -report    # check that the cluster formed
  Live datanodes (3):                                # three DataNodes are live
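
The report is long, so a script that needs only the number of live DataNodes can extract it from the line shown above. This is a small sketch; `live_datanodes` is a hypothetical name, and it relies on the `Live datanodes (N):` line format seen in this transcript.

```shell
# Read "hdfs dfsadmin -report" output on stdin and print the live DataNode count.
live_datanodes() {
    sed -n 's/^Live datanodes (\([0-9][0-9]*\)).*/\1/p'
}
```

Typical use would be `./bin/hdfs dfsadmin -report | live_datanodes`, comparing the result with the three hosts listed in the slaves file.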
