Setting up a single-node Hadoop environment on a Mac with Docker
1. Download Docker.dmg
2. Run docker pull to fetch the CentOS image, then start a container
docker pull centos:centos7
docker run -it centos:centos7 /bin/bash
3. Create the hadoop user
4. Remember to install wget, vim, sudo, telnet, openssl, the SSH server and client, and initscripts
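Steps 3 and 4 can be sketched as one small script. This is only a sketch under assumptions: openssh-server and openssh-clients are the usual CentOS 7 package names for the SSH server and client, and the `run` helper and the `DRY_RUN` variable are names invented here so that the script prints the commands by default instead of executing them.

```shell
#!/bin/sh
# Print each command by default; set DRY_RUN=0 (as root, inside the container) to execute.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_prereqs() {
    # Tools used in the later steps, plus initscripts.
    run yum install -y wget vim sudo telnet openssl \
        openssh-server openssh-clients initscripts
    # Dedicated user that will own the Hadoop installation.
    run useradd -m hadoop
}

setup_prereqs
```

With DRY_RUN left unset the script only lists what it would do, which makes it safe to paste and review first.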
5. Download Hadoop
wget http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.9.1.tar.gz
6. Download JDK 8
wget --no-check-certificate --no-cookies --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u91-b14/jdk-8u91-linux-x64.tar.gz
7. After unpacking and installing, format HDFS:
The first problem I ran into:
[hadoop]# bin/hadoop namenode-format
Error: Could not find or load main class namenode-format
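The cause is that there has to be a space after namenode: the command is bin/hadoop namenode -format. After that, ssh could not connect, so I installed openssl with yum, as well as
yum install initscripts # fixes the "functions" file not being found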
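For reference, a sketch of the corrected format command wrapped in a small function. It assumes the Hadoop tarball was unpacked into the directory held in HADOOP_HOME, falling back to the current directory; the function name `format_namenode` is invented for this example. In Hadoop 2.x the `hdfs` launcher is the preferred entry point for this subcommand, while `bin/hadoop namenode -format` still works but prints a deprecation notice.

```shell
#!/bin/sh
# Format HDFS once, before the first start. Formatting again discards existing HDFS metadata.
format_namenode() {
    hadoop_home="${HADOOP_HOME:-$PWD}"   # assumed install location
    if [ ! -x "$hadoop_home/bin/hdfs" ]; then
        echo "bin/hdfs not found under $hadoop_home" >&2
        return 1
    fi
    # "namenode" is the subcommand and "-format" is its option: two separate words.
    "$hadoop_home/bin/hdfs" namenode -format
}

# Usage (from the unpacked Hadoop directory): format_namenode
```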
8. Configure passwordless SSH login
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
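A non-interactive, repeatable variant of the same step, as a sketch. The function name `setup_passwordless_ssh` is made up for this example; it assumes the default key location ~/.ssh/id_rsa and an empty passphrase, and it leaves an existing key pair untouched.

```shell
#!/bin/sh
# setup_passwordless_ssh [SSH_DIR]: authorize this user's own key for logins to localhost.
setup_passwordless_ssh() {
    ssh_dir="${1:-$HOME/.ssh}"
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
    # Generate a key pair only if none exists yet (-N '' means no passphrase).
    if [ ! -f "$ssh_dir/id_rsa" ]; then
        ssh-keygen -t rsa -N '' -f "$ssh_dir/id_rsa" -q
    fi
    # Append the public key once, so repeated runs do not duplicate it.
    touch "$ssh_dir/authorized_keys"
    if ! grep -qxF "$(cat "$ssh_dir/id_rsa.pub")" "$ssh_dir/authorized_keys"; then
        cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys"
    fi
    # sshd (with StrictModes) ignores key files that other users can write to.
    chmod 600 "$ssh_dir/authorized_keys"
}

# Usage: setup_passwordless_ssh && ssh localhost true
```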
9. Edit the configuration files.
Edit /etc/hosts so that the hostname yarn001 used below resolves.
Edit slaves:
localhost
yarn001
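The exact /etc/hosts line is not given above. As a minimal sketch, the helper below appends a mapping for yarn001 only if the name is not mapped yet; `add_host_entry` is a hypothetical name, and which address to use (here the container's own address from `hostname -i`) is an assumption.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME: append "IP NAME" unless NAME is already mapped.
add_host_entry() {
    hosts_file="$1"; ip="$2"; name="$3"
    # Look for NAME as a whole field on any non-comment line.
    if awk -v n="$name" '$1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                         END { exit !found }' "$hosts_file"; then
        return 0
    fi
    printf '%s\t%s\n' "$ip" "$name" >> "$hosts_file"
}

# Usage inside the container (as root):
#   add_host_entry /etc/hosts "$(hostname -i)" yarn001
```

Alternatively, starting the container with docker run --hostname yarn001 makes Docker write the entry for that hostname itself.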
Edit yarn-site.xml:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
Edit hadoop-env.sh:
export JAVA_HOME=/home/hadoop/jdk1.8.0_91
Edit mapred-site.xml:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>hdfs://yarn001:9001</value>
</property>
Edit core-site.xml:
<property>
<name>fs.default.name</name>
<value>hdfs://yarn001:8020</value>
</property>
Edit hdfs-site.xml:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://yarn001:8020</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/etc/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/etc/hadoop/dfs/data</value>
</property>
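The snippets above are only the <property> elements; in each file they belong inside the top-level <configuration> element. As a sketch, the following writes a complete core-site.xml with the value from this step. The function name `write_core_site` and its directory argument are assumptions for illustration; in a Hadoop 2.x tarball the configuration directory is typically etc/hadoop under the installation directory.

```shell
#!/bin/sh
# write_core_site CONF_DIR [NAMENODE_URI]: write a complete core-site.xml.
write_core_site() {
    conf_dir="$1"
    uri="${2:-hdfs://yarn001:8020}"
    mkdir -p "$conf_dir"
    cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <!-- fs.default.name still works in Hadoop 2.x; fs.defaultFS is its newer name. -->
  <property>
    <name>fs.default.name</name>
    <value>$uri</value>
  </property>
</configuration>
EOF
}

# Usage (from the Hadoop installation directory): write_core_site etc/hadoop
```

The other files follow the same pattern: one <configuration> root, with the listed properties inside it.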
10. Start HDFS and YARN
Make sure sshd is already running
sbin/start-all.sh
[hadoop]# jps
1536 SecondaryNameNode
2356 Jps
1189 NameNode
2281 DataNode
1771 NodeManager
779 ResourceManager
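Checking the jps list by eye is easy to get wrong, so here is a small sketch that reports which of the five daemons from the listing above are missing. The function name `missing_daemons` is invented for this example; it only parses text on standard input and does not call Hadoop itself.

```shell
#!/bin/sh
# Read `jps` output on stdin and print the name of every expected daemon that is absent.
missing_daemons() {
    awk '
        BEGIN {
            n = split("NameNode SecondaryNameNode DataNode ResourceManager NodeManager", want, " ")
        }
        { seen[$2] = 1 }          # jps prints "<pid> <main class>"
        END {
            for (i = 1; i <= n; i++) if (!(want[i] in seen)) print want[i]
        }
    '
}

# Usage: jps | missing_daemons    (no output means all five daemons are up)
```

In Hadoop 2.x, sbin/start-all.sh is a deprecated wrapper; running sbin/start-dfs.sh followed by sbin/start-yarn.sh starts the same daemons.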
11. Test a MapReduce job
[hadoop]# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0-cdh5.9.1.jar pi 2 100
Number of Maps = 2
Samples per Map = 100
17/01/30 16:21:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Wrote input for Map #0
Wrote input for Map #1
Starting Job
17/01/30 16:21:46 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/01/30 16:21:47 INFO input.FileInputFormat: Total input paths to process : 2
17/01/30 16:21:47 INFO mapreduce.JobSubmitter: number of splits:2
17/01/30 16:21:47 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1485792264282_0001
17/01/30 16:21:48 INFO impl.YarnClientImpl: Submitted application application_1485792264282_0001
17/01/30 16:21:48 INFO mapreduce.Job: The url to track the job: http://58d9fc9eb3de:8088/proxy/application_1485792264282_0001/
17/01/30 16:21:48 INFO mapreduce.Job: Running job: job_1485792264282_0001
17/01/30 16:21:56 INFO mapreduce.Job: Job job_1485792264282_0001 running in uber mode : false
17/01/30 16:21:56 INFO mapreduce.Job: map 0% reduce 0%
17/01/30 16:22:03 INFO mapreduce.Job: map 50% reduce 0%
17/01/30 16:22:04 INFO mapreduce.Job: map 100% reduce 0%
17/01/30 16:22:12 INFO mapreduce.Job: map 100% reduce 100%
17/01/30 16:22:12 INFO mapreduce.Job: Job job_1485792264282_0001 completed successfully
17/01/30 16:22:12 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=50
FILE: Number of bytes written=352557
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=522
HDFS: Number of bytes written=215
HDFS: Number of read operations=11
HDFS: Number of large read operations=0
HDFS: Number of write operations=3
Job Counters
Launched map tasks=2
Launched reduce tasks=1
Data-local map tasks=2
Total time spent by all maps in occupied slots (ms)=10732
Total time spent by all reduces in occupied slots (ms)=5086
Total time spent by all map tasks (ms)=10732
Total time spent by all reduce tasks (ms)=5086
Total vcore-seconds taken by all map tasks=10732
Total vcore-seconds taken by all reduce tasks=5086
Total megabyte-seconds taken by all map tasks=10989568
Total megabyte-seconds taken by all reduce tasks=5208064
Map-Reduce Framework
Map input records=2
Map output records=4
Map output bytes=36
Map output materialized bytes=56
Input split bytes=286
Combine input records=0
Combine output records=0
Reduce input groups=2
Reduce shuffle bytes=56
Reduce input records=4
Reduce output records=0
Spilled Records=8
Shuffled Maps =2
Failed Shuffles=0
Merged Map outputs=2
GC time elapsed (ms)=247
CPU time spent (ms)=1700
Physical memory (bytes) snapshot=611356672
Virtual memory (bytes) snapshot=7846543360
Total committed heap usage (bytes)=489684992
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=236
File Output Format Counters
Bytes Written=97
Job Finished in 25.87 seconds
Estimated value of Pi is 3.12000000000000000000
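The estimate is coarse because the job sampled only 2 maps x 100 samples = 200 points. The example estimates pi as 4 x (points inside the circle) / (total points). The count of 156 inside points used below is back-calculated from the printed 3.12 and does not appear in the log; `pi_estimate` is a made-up helper name for this sketch of the arithmetic.

```shell
#!/bin/sh
# pi_estimate INSIDE TOTAL: 4 * inside / total, the ratio-of-areas estimator.
pi_estimate() {
    awk -v inside="$1" -v total="$2" 'BEGIN { printf "%.2f\n", 4 * inside / total }'
}

pi_estimate 156 200   # 2 maps x 100 samples
```

This prints 3.12. Passing larger arguments to the job, for example more samples per map, should bring the result closer to pi.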