Distributed Resource Scheduling Framework: YARN
阿新 · Published: 2018-11-06
1 Background: Why YARN
- Problems with MapReduce 1.x: the JobTracker was a single point of failure, and the load it carried made the cluster hard to scale.
- In Hadoop 1.x, MapReduce used a Master/Slave architecture: one JobTracker (JT) coordinating many TaskTrackers.
- JobTracker: responsible for both resource management and job scheduling.
- TaskTracker: periodically reports its node's health, resource usage, and task status to the JT; accepts commands from the JT, such as launching tasks.
- YARN: lets different computing frameworks share the data on one HDFS cluster and benefit from unified, cluster-wide resource scheduling.
2 YARN Architecture
http://archive-primary.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.7.0/hadoop-yarn/hadoop-yarn-site/YARN.html
- ResourceManager (RM): only one RM serves the cluster at any given time. It manages and schedules cluster resources centrally and handles client requests, such as submitting or killing a job. It also monitors the NMs: if an NM dies, the RM notifies the affected AMs so the tasks that were running on that node can be handled.
- NodeManager (NM): one per node, many per cluster. It manages its own node's resources, periodically reports resource usage to the RM, and receives and executes commands from the RM (e.g. launch a Container) as well as commands from AMs.
- ApplicationMaster (AM): one per application (an MR job, a Spark application). It manages the application: requests resources (cores, memory) from the RM and assigns them to its internal tasks, and communicates with NMs to start/stop tasks. Tasks run inside Containers, and the AM itself also runs inside a Container.
- Container: an abstraction of a task's execution environment, encapsulating resources such as CPU and memory.
- Client: submits jobs and checks their progress.
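The interaction between these components can be sketched as a toy simulation. This is purely illustrative Python, not the Hadoop API; all class and method names here are invented for the sketch. It models the flow described above: the RM allocates containers across NMs, the AM itself runs in a container, and the AM then requests containers for its tasks.

```python
# Toy model of the YARN submission flow (illustrative only, not the Hadoop API).

class Container:
    """Abstraction of an execution environment with CPU and memory."""
    def __init__(self, vcores, memory_mb):
        self.vcores, self.memory_mb = vcores, memory_mb

class NodeManager:
    """Per-node agent: launches containers on behalf of the RM/AM."""
    def __init__(self, vcores, memory_mb):
        self.free_vcores, self.free_memory_mb = vcores, memory_mb

    def launch_container(self, vcores, memory_mb):
        if vcores > self.free_vcores or memory_mb > self.free_memory_mb:
            return None  # not enough room on this node
        self.free_vcores -= vcores
        self.free_memory_mb -= memory_mb
        return Container(vcores, memory_mb)

class ResourceManager:
    """Cluster-wide scheduler: one active RM serves the whole cluster."""
    def __init__(self, node_managers):
        self.node_managers = node_managers

    def allocate(self, vcores, memory_mb):
        # First-fit over NMs; the real RM uses pluggable schedulers.
        for nm in self.node_managers:
            container = nm.launch_container(vcores, memory_mb)
            if container is not None:
                return container
        return None

class ApplicationMaster:
    """Per-application manager: requests task containers from the RM."""
    def __init__(self, rm):
        self.rm = rm
        self.task_containers = []

    def run_tasks(self, n_tasks, vcores=1, memory_mb=1024):
        for _ in range(n_tasks):
            container = self.rm.allocate(vcores, memory_mb)
            if container is not None:
                self.task_containers.append(container)
        return len(self.task_containers)

# Client submits a job: the RM first launches the AM in a container,
# then the AM requests containers for its tasks.
rm = ResourceManager([NodeManager(4, 8192), NodeManager(4, 8192)])
am_container = rm.allocate(1, 2048)  # the AM itself runs in a container
am = ApplicationMaster(rm)
launched = am.run_tasks(3)           # e.g. 2 map tasks + 1 reduce task
```

The key point the sketch mirrors is the separation of concerns that MapReduce 1.x lacked: the RM only schedules resources, while per-application logic lives in the AM.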
3 Setting Up YARN
3.1 mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
3.2 yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
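Note that the run log below shows the client connecting to the ResourceManager at 0.0.0.0:8032, the default bind address. On a multi-node cluster you would typically also set the RM hostname in yarn-site.xml; the value `node1` below is taken from this tutorial's environment:

```xml
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>node1</value>
</property>
```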
3.3 Starting YARN
[hadoop@node1 ~]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-resourcemanager-node1.out
node1: starting nodemanager, logging to /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-node1.out
Open http://node1:8088 in a browser to view the YARN web UI.
4 Submitting a MapReduce Job to YARN
Hadoop ships with example jobs under /home/hadoop/apps/hadoop-2.6.0-cdh5.7.0/share/hadoop/mapreduce2:
hadoop-mapreduce-examples-2.6.0-cdh5.7.0.jar
[hadoop@node1 mapreduce2]$ hadoop jar
RunJar jarFile [mainClass] args...
[hadoop@node1 mapreduce2]$ hadoop jar hadoop-mapreduce-examples-2.6.0-cdh5.7.0.jar
An example program must be given as the first argument.
Valid program names are:
aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
dbcount: An example job that count the pageview counts from a database.
distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
grep: A map/reduce program that counts the matches of a regex in the input.
join: A job that effects a join over sorted, equally partitioned datasets
multifilewc: A job that counts words from several files.
pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
randomwriter: A map/reduce program that writes 10GB of random data per node.
secondarysort: An example defining a secondary sort to the reduce.
sort: A map/reduce program that sorts the data written by the random writer.
sudoku: A sudoku solver.
teragen: Generate data for the terasort
terasort: Run the terasort
teravalidate: Checking results of terasort
wordcount: A map/reduce program that counts the words in the input files.
wordmean: A map/reduce program that counts the average length of the words in the input files.
wordmedian: A map/reduce program that counts the median length of the words in the input files.
wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.
[hadoop@node1 mapreduce2]$
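The wordcount program in the list above is the canonical MapReduce job. Its map → shuffle → reduce flow can be sketched in plain Python (illustrative only, not the Hadoop API) to show what the framework does between the phases:

```python
from collections import defaultdict

def map_phase(line):
    # Map: emit a (word, 1) pair for every word in an input record.
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    # Shuffle: group values by key, as the framework does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts for each word.
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["hello yarn", "hello hadoop"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = reduce_phase(shuffle(mapped))
# counts == {"hello": 2, "yarn": 1, "hadoop": 1}
```

On YARN, each phase runs as tasks in containers requested by the job's ApplicationMaster, which is what the `pi` run below demonstrates end to end.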
[hadoop@node1 mapreduce2]$ hadoop jar hadoop-mapreduce-examples-2.6.0-cdh5.7.0.jar pi
Usage: org.apache.hadoop.examples.QuasiMonteCarlo <nMaps> <nSamples>
Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|resourcemanager:port> specify a ResourceManager
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
[hadoop@node1 mapreduce2]$
[hadoop@node1 mapreduce2]$ hadoop jar hadoop-mapreduce-examples-2.6.0-cdh5.7.0.jar pi 2 3
Number of Maps = 2
Samples per Map = 3
18/10/29 22:19:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Wrote input for Map #0
Wrote input for Map #1
Starting Job
18/10/29 22:19:02 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
18/10/29 22:19:03 INFO input.FileInputFormat: Total input paths to process : 2
18/10/29 22:19:04 INFO mapreduce.JobSubmitter: number of splits:2
18/10/29 22:19:04 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1540822729980_0001
18/10/29 22:19:04 INFO impl.YarnClientImpl: Submitted application application_1540822729980_0001
18/10/29 22:19:04 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1540822729980_0001/
18/10/29 22:19:04 INFO mapreduce.Job: Running job: job_1540822729980_0001
18/10/29 22:19:16 INFO mapreduce.Job: Job job_1540822729980_0001 running in uber mode : false
18/10/29 22:19:16 INFO mapreduce.Job: map 0% reduce 0%
18/10/29 22:19:26 INFO mapreduce.Job: map 50% reduce 0%
18/10/29 22:19:27 INFO mapreduce.Job: map 100% reduce 0%
18/10/29 22:19:32 INFO mapreduce.Job: map 100% reduce 100%
18/10/29 22:19:33 INFO mapreduce.Job: Job job_1540822729980_0001 completed successfully
18/10/29 22:19:33 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=50
FILE: Number of bytes written=335472
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=522
HDFS: Number of bytes written=215
HDFS: Number of read operations=11
HDFS: Number of large read operations=0
HDFS: Number of write operations=3
Job Counters
Launched map tasks=2
Launched reduce tasks=1
Data-local map tasks=2
Total time spent by all maps in occupied slots (ms)=15859
Total time spent by all reduces in occupied slots (ms)=4321
Total time spent by all map tasks (ms)=15859
Total time spent by all reduce tasks (ms)=4321
Total vcore-seconds taken by all map tasks=15859
Total vcore-seconds taken by all reduce tasks=4321
Total megabyte-seconds taken by all map tasks=16239616
Total megabyte-seconds taken by all reduce tasks=4424704
Map-Reduce Framework
Map input records=2
Map output records=4
Map output bytes=36
Map output materialized bytes=56
Input split bytes=286
Combine input records=0
Combine output records=0
Reduce input groups=2
Reduce shuffle bytes=56
Reduce input records=4
Reduce output records=0
Spilled Records=8
Shuffled Maps =2
Failed Shuffles=0
Merged Map outputs=2
GC time elapsed (ms)=245
CPU time spent (ms)=1260
Physical memory (bytes) snapshot=458809344
Virtual memory (bytes) snapshot=8175378432
Total committed heap usage (bytes)=262033408
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=236
File Output Format Counters
Bytes Written=97
Job Finished in 30.938 seconds
Estimated value of Pi is 4.00000000000000000000
[hadoop@node1 mapreduce2]$
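The coarse result of 4.0 is expected: the job sampled only 2 maps × 3 points, and the estimator is 4 × (points inside the quarter circle) / (total points), so with all six points landing inside it returns exactly 4. The Hadoop example uses a Halton quasi-random sequence; a simpler pseudo-random sketch of the same estimator (illustrative Python, not the Hadoop code) shows how accuracy improves with sample count:

```python
import random

def estimate_pi(n_samples, seed=0):
    # Sample points in the unit square; the fraction landing inside the
    # quarter circle x^2 + y^2 <= 1 approaches pi/4, so pi ~= 4 * inside/total.
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples

print(estimate_pi(6))        # with only 6 samples the estimate is very coarse
print(estimate_pi(100_000))  # converges toward 3.14159...
```

Rerunning the Hadoop job with larger arguments, e.g. `pi 10 10000`, gives a correspondingly better estimate.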