
YARN Capacity Scheduler: A Multi-Queue Submission Walkthrough


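By default there is only one queue, default, which does not meet production needs. Queues are usually created per business module, such as login/registration or shopping cart.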

Requirements

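Requirement 1: the default queue gets 40% of total memory, with a maximum capacity of 60% of total resources (its own 40% plus up to 20% borrowed); the hive queue gets 60% of total memory, with a maximum capacity of 80% of total resources.
Requirement 2: configure queue priority.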
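
The percentages in Requirement 1 can be turned into concrete numbers with a little arithmetic. The sketch below is only an illustration: the 10240 MB cluster size and the names `QueueShares`, `shareMb`, and `capacitiesSumTo100` are made up for this example and are not part of YARN. It also checks that the capacities of sibling queues add up to 100, which the Capacity Scheduler requires at each level.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Worked arithmetic for Requirement 1; the cluster size is a hypothetical figure. */
class QueueShares {
    /** Amount of memory, in MB, that a percentage of the cluster corresponds to. */
    static long shareMb(long clusterMb, int percent) {
        return clusterMb * percent / 100;
    }

    /** Sibling queues under the same parent must have capacities that sum to 100. */
    static boolean capacitiesSumTo100(Map<String, Integer> capacities) {
        int sum = 0;
        for (int c : capacities.values()) {
            sum += c;
        }
        return sum == 100;
    }

    public static void main(String[] args) {
        long clusterMb = 10240; // hypothetical total memory, not from the article

        Map<String, Integer> capacity = new LinkedHashMap<>();
        capacity.put("default", 40);
        capacity.put("hive", 60);

        Map<String, Integer> maxCapacity = new LinkedHashMap<>();
        maxCapacity.put("default", 60);
        maxCapacity.put("hive", 80);

        System.out.println("capacities sum to 100: " + capacitiesSumTo100(capacity));
        for (String queue : capacity.keySet()) {
            long guaranteed = shareMb(clusterMb, capacity.get(queue));
            long ceiling = shareMb(clusterMb, maxCapacity.get(queue));
            System.out.println(queue + ": guaranteed=" + guaranteed + " MB, max=" + ceiling
                    + " MB, borrowable=" + (ceiling - guaranteed) + " MB");
        }
    }
}
```

With these assumed numbers, default is guaranteed 4096 MB and may grow to 6144 MB, while hive is guaranteed 6144 MB and may grow to 8192 MB.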

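Configuring a multi-queue Capacity Scheduler

The configuration lives in capacity-scheduler.xml under /opt/module/hadoop-3.1.3/etc/hadoop.

1 Edit the configuration

Editing the file directly on the server is awkward, so download it first:

[ranan@hadoop102 hadoop]$ sz capacity-scheduler.xml

Then make the following changes: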


<property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <!-- Add the hive queue -->
    <value>default,hive</value>
    <description>
      The queues at the this level (root is the root queue).
    </description>
</property>

<property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <!-- The default queue gets 40% of total memory -->
    <value>40</value>
    <description>Default queue target capacity.</description>
</property>

<!-- Added: hive queue configuration -->
<property>
    <name>yarn.scheduler.capacity.root.hive.capacity</name>
    <!-- The hive queue gets 60% of total memory -->
    <value>60</value>
    <description>Hive queue target capacity.</description>
</property>

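<!-- Added for hive: how much of the hive queue's resources one user may take when submitting jobs; 1 means a single user can use up all of the hive queue's resources -->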
<property>
    <name>yarn.scheduler.capacity.root.hive.user-limit-factor</name>
    <value>1</value>
    <description>
      hive queue user limit a percentage from 0.0 to 1.0.
    </description>
</property>

<!-- Maximum capacity: default may use at most 60% of root's resources (its own 40% plus up to 20% borrowed) -->
<property>
    <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
    <value>60</value>
    <description>
      The maximum capacity of the default queue. 
    </description>
</property>

<!-- Added -->
<property>
    <name>yarn.scheduler.capacity.root.hive.maximum-capacity</name>
    <value>80</value>
    <description>
      The maximum capacity of the hive queue. 
    </description>
</property>

<!-- Added; the queue is in the RUNNING state by default -->
<property>
   <name>yarn.scheduler.capacity.root.hive.state</name>
   <value>RUNNING</value>
   <description>
   The state of the hive queue. State can be one of RUNNING or STOPPED.
   </description>
</property>

<!-- Added: which users may submit jobs to this queue; * means all users -->
 <property>
   <name>yarn.scheduler.capacity.root.hive.acl_submit_applications</name>
   <value>*</value>
   <description>
   The ACL of who can submit jobs to the hive queue.
   </description>
</property>

<!-- Added: which users may administer this queue (administrators) -->
<property>
   <name>yarn.scheduler.capacity.root.hive.acl_administer_queue</name>
   <value>*</value>
   <description>
   The ACL of who can administer jobs on the hive queue.
   </description>
</property>

<!-- Added: which users may set the priority of applications in this queue -->
<property>
    <name>yarn.scheduler.capacity.root.hive.acl_application_max_priority</name>
    <value>*</value>
    <description>
      The ACL of who can submit applications with configured priority.
      For e.g, [user={name} group={name} max_priority={priority} default_priority={priority}]
    </description>
</property>

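<!-- Application timeout: yarn application -appId <ApplicationId> -updateLifetime <Timeout> (you choose Timeout); the application is killed when the time is up -->
<!-- Added. Timeout cannot be arbitrary: it must not exceed the value configured below. -->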
   <property>
     <name>yarn.scheduler.capacity.root.hive.maximum-application-lifetime
     </name>
     <value>-1</value>
     <description>
        Maximum lifetime of an application which is submitted to a queue
        in seconds. Any value less than or equal to zero will be considered as
        disabled.
        This will be a hard time limit for all applications in this
        queue. If positive value is configured then any application submitted
        to this queue will be killed after exceeds the configured lifetime.
        User can also specify lifetime per application basis in
        application submission context. But user lifetime will be
        overridden if it exceeds queue maximum lifetime. It is point-in-time
        configuration.
        Note : Configuring too low value will result in killing application
        sooner. This feature is applicable only for leaf queue.
     </description>
   </property>

<!-- Added: if an application does not specify a timeout, default-application-lifetime is used; -1 means no limit, so the application may run as long as it needs -->
   <property>
     <name>yarn.scheduler.capacity.root.hive.default-application-lifetime
     </name>
     <value>-1</value>
     <description>
        Default lifetime of an application which is submitted to a queue
        in seconds. Any value less than or equal to zero will be considered as
        disabled.
        If the user has not submitted application with lifetime value then this
        value will be taken. It is point-in-time configuration.
        Note : Default lifetime can't exceed maximum lifetime. This feature is
        applicable only for leaf queue.
     </description>
   </property>
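
The two lifetime properties interact, and a small sketch may make the rules easier to follow. This is one reading of the property descriptions above, not YARN's actual implementation; the class and method names (`QueueLifetime`, `effectiveLifetime`) are hypothetical.

```java
/** Sketch of how the queue lifetime settings combine, based on the property descriptions. */
class QueueLifetime {
    /**
     * @param requestedSec lifetime requested by the application, or a value <= 0 if none
     * @param defaultSec   the queue's default-application-lifetime (<= 0 means disabled)
     * @param maxSec       the queue's maximum-application-lifetime (<= 0 means disabled)
     * @return the effective lifetime in seconds, or -1 when no limit applies
     */
    static long effectiveLifetime(long requestedSec, long defaultSec, long maxSec) {
        // Fall back to the queue default when the application did not ask for a lifetime.
        long lifetime = requestedSec > 0 ? requestedSec : defaultSec;
        // The maximum is a hard limit: it caps larger values and applies when nothing else is set.
        if (maxSec > 0 && (lifetime <= 0 || lifetime > maxSec)) {
            lifetime = maxSec;
        }
        return lifetime > 0 ? lifetime : -1;
    }

    public static void main(String[] args) {
        // With both properties at -1, as configured above, nothing is limited.
        System.out.println(effectiveLifetime(0, -1, -1));
        // A request above a (hypothetical) one-hour maximum is cut down to the maximum.
        System.out.println(effectiveLifetime(7200, 300, 3600));
    }
}
```

With the values configured in this article (both -1), an application keeps whatever lifetime it requests, or runs without a limit if it requests none.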

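Notes:
In the Capacity Scheduler every queue sits under the root queue, which is why each property name above starts with yarn.scheduler.capacity.root.

Uploading and downloading with SecureCRT

sz (send) downloads files from the server in SecureCRT:
Download one file: sz filename
Download several files: sz filename1 filename2
Download every file in dir, excluding its subdirectories: sz dir/*
rz (receive) uploads a file to the server.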

2 Upload to the cluster and distribute

[ranan@hadoop102 hadoop]$ rz
[ranan@hadoop102 hadoop]$ xsync capacity-scheduler.xml

3 Restart YARN or run yarn rmadmin -refreshQueues

Either restart YARN or run yarn rmadmin -refreshQueues, which reloads the queue configuration:

[ranan@hadoop102 hadoop]$ yarn rmadmin -refreshQueues

4 Submit a job to the hive queue

Key point: -D overrides a configuration parameter at run time.

-D mapreduce.job.queuename=hive

[atguigu@hadoop102 hadoop-3.1.3]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount -D mapreduce.job.queuename=hive /input /output

The job is submitted to the hive queue; without the option it would go to the default queue.
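
To illustrate what "-D overrides a parameter at run time" means, here is a simplified sketch of separating `-D key=value` pairs from the remaining arguments. It is not Hadoop's own option parser; the class and method names are hypothetical, and it handles only the spaced `-D key=value` form used in this article.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simplified illustration of pulling -D key=value overrides out of an argument list. */
class GenericOptionSketch {
    /**
     * Collects "-D key=value" pairs into a map and appends every other
     * argument, in order, to the supplied remaining list.
     */
    static Map<String, String> extractOverrides(String[] args, List<String> remaining) {
        Map<String, String> overrides = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i++) {
            boolean isOverride = args[i].equals("-D")
                    && i + 1 < args.length
                    && args[i + 1].contains("=");
            if (isOverride) {
                String[] kv = args[++i].split("=", 2); // split on the first '=' only
                overrides.put(kv[0], kv[1]);
            } else {
                remaining.add(args[i]);
            }
        }
        return overrides;
    }

    public static void main(String[] args) {
        String[] cli = {"-D", "mapreduce.job.queuename=hive", "/input", "/output"};
        List<String> remaining = new ArrayList<>();
        Map<String, String> overrides = extractOverrides(cli, remaining);
        System.out.println("overrides: " + overrides);
        System.out.println("remaining: " + remaining);
    }
}
```

The override map plays the role of the job configuration, while the remaining arguments (/input and /output here) are what the program itself receives.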

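Submitting from a packaged jar

For a program you write yourself, you can declare the target queue in the Driver's configuration before packaging the jar.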

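import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class WcDriver {
	public static void main(String[] args) throws IOException,
			ClassNotFoundException, InterruptedException {
		Configuration conf = new Configuration();
		// Submit this job to the hive queue instead of the default queue
		conf.set("mapreduce.job.queuename", "hive");
		// 1. Get a Job instance
		Job job = Job.getInstance(conf);
		// ... (steps 2-5 omitted in the original)

		// 6. Submit the Job and wait for it to finish
		boolean b = job.waitForCompletion(true);
		System.exit(b ? 0 : 1);
	}
}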

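Job priority

With the Capacity Scheduler, higher-priority jobs get resources first when resources are tight.
By default every job has priority 0; using job priorities requires some extra configuration.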
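
The ordering idea can be sketched with a plain comparator: higher priority first, and, as an assumption made for this example, earlier submission first among jobs of equal priority. The `PriorityOrderSketch` class and its `App` type are hypothetical and only illustrate the ordering; they are not YARN classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustration of ordering pending applications by priority, then by submission order. */
class PriorityOrderSketch {
    static final class App {
        final String id;
        final int priority;
        final long submitOrder;

        App(String id, int priority, long submitOrder) {
            this.id = id;
            this.priority = priority;
            this.submitOrder = submitOrder;
        }
    }

    /** Returns application ids in the order they would be offered resources. */
    static List<String> schedulingOrder(List<App> apps) {
        List<App> sorted = new ArrayList<>(apps);
        // Highest priority first; ties are broken by who was submitted earlier (assumed FIFO).
        sorted.sort(Comparator.comparingInt((App a) -> a.priority).reversed()
                .thenComparingLong(a -> a.submitOrder));
        List<String> ids = new ArrayList<>();
        for (App a : sorted) {
            ids.add(a.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<App> apps = new ArrayList<>();
        apps.add(new App("app_1", 0, 1));
        apps.add(new App("app_2", 0, 2));
        apps.add(new App("app_3", 5, 3)); // submitted last, but with priority 5
        System.out.println(schedulingOrder(apps)); // prints [app_3, app_1, app_2]
    }
}
```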

Using job priority

The configuration lives in yarn-site.xml under /opt/module/hadoop-3.1.3/etc/hadoop.

1. Edit yarn-site.xml and add the following parameter

<property>
<name>yarn.cluster.max-application-priority</name>
<!-- Maximum application priority is 5: 0 is the lowest, 5 the highest -->
<value>5</value>
</property>
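
yarn.cluster.max-application-priority acts as a ceiling: an application submitted with a higher priority has its priority reset to the cluster maximum. A minimal sketch of that rule, with a hypothetical class and method name:

```java
/** Sketch of capping a requested priority at the cluster-wide maximum. */
class PriorityCap {
    /** A request above the cluster maximum is reduced to the maximum; others pass through. */
    static int effectivePriority(int requested, int clusterMax) {
        return Math.min(requested, clusterMax);
    }

    public static void main(String[] args) {
        int clusterMax = 5; // value of yarn.cluster.max-application-priority in this article
        System.out.println(effectivePriority(5, clusterMax)); // prints 5
        System.out.println(effectivePriority(9, clusterMax)); // prints 5
    }
}
```

So with the setting above, asking for priority 9 gives the same result as asking for 5.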

2. Distribute the configuration and restart YARN

[ranan@hadoop102 hadoop]$ xsync yarn-site.xml
# Restart YARN only
[ranan@hadoop102 hadoop-3.1.3]$ sbin/stop-yarn.sh
[ranan@hadoop102 hadoop-3.1.3]$ sbin/start-yarn.sh

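3. To simulate a resource shortage, keep submitting the following job until a newly submitted one can no longer obtain resources.

# Estimate pi with 5 map tasks and 2000000 samples per map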
[ranan@hadoop102 hadoop-3.1.3]$ hadoop jar /opt/module/hadoop-3.1.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar pi 5 2000000

4. Submit another job with a higher priority so that it runs first

-D mapreduce.job.priority=5

[ranan@hadoop102 hadoop-3.1.3]$ hadoop jar /opt/module/hadoop-3.1.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar pi -D mapreduce.job.priority=5 5 2000000

5. If a job has already been submitted to the cluster, you can also change the priority of the running application with the following command.

yarn application -appId <ApplicationID> -updatePriority <priority>

[ranan@hadoop102 hadoop-3.1.3]$ yarn application -appId application_1611133087930_0009 -updatePriority 5