Reading the YARN Source: An Analysis of the YARN Capacity Scheduler

Reposted

mob64ca13fc5fb6 2023-12-20 06:56:45


2021SC@SDUSC

Hadoop YARN Source Code Analysis (7): YARN Schedulers

1. Introduction to YARN Schedulers

1.1 FIFO Scheduler
1.2 Capacity Scheduler
1.3 Fair Scheduler
1.4 Differences Between Fair Scheduler and Capacity Scheduler

2. Configuring YARN Schedulers

2.1 FIFO Scheduler
2.2 Capacity Scheduler
2.3 Fair Scheduler Configuration

1. Introduction to YARN Schedulers

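A Hadoop cluster is a complex shared environment with limited resources, so an application's resource request usually has to wait for some time before it is satisfied. In YARN, the Scheduler is responsible for allocating resources, and YARN provides several schedulers and scheduling policies.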

1.1 FIFO Scheduler

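FIFO Scheduler (first-in, first-out scheduler) places applications in a single queue ordered by arrival time. When resources are allocated, the application at the head of the queue is served first, and the remaining applications are served in order after it. FIFO Scheduler is the simplest scheduler and needs no configuration, but it is not suitable for shared clusters: when a large application is running, the small jobs queued behind it are blocked until it finishes.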
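The blocking effect can be illustrated with a small simulation. This is only a sketch under a simplifying assumption: each job occupies the whole cluster while it runs, so jobs execute strictly one after another. The job names, arrival times, and runtimes are hypothetical.

```python
from collections import deque


def fifo_finish_times(jobs):
    """Run jobs one at a time in arrival order; return {name: finish_time}.

    jobs: list of (name, arrival_time, runtime) tuples, where runtime is the
    time the job needs when it has the whole cluster to itself.
    """
    queue = deque(sorted(jobs, key=lambda job: job[1]))  # order by arrival
    clock = 0.0
    finish = {}
    while queue:
        name, arrival, runtime = queue.popleft()
        # A job starts when it has arrived AND every earlier job has finished.
        clock = max(clock, arrival) + runtime
        finish[name] = clock
    return finish


# Hypothetical workload: a large job arrives first, a small job shortly after.
jobs = [("large", 0.0, 10.0), ("small", 1.0, 0.5)]
finish = fifo_finish_times(jobs)
print(finish)  # {'large': 10.0, 'small': 10.5}
```

Although the small job needs only 0.5 time units, it finishes at 10.5 because it cannot start before the large job completes.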


1.2 Capacity Scheduler

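Capacity Scheduler differs from FIFO Scheduler in that it has a dedicated queue for running small jobs. However, that dedicated queue holds part of the cluster's resources in reserve, so a large job takes longer to finish than it would under FIFO scheduling. Capacity Scheduler is also the default scheduler in Apache Hadoop, that is, the one used when yarn.resourcemanager.scheduler.class is not overridden in yarn-site.xml.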
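The trade-off can be made concrete with a little arithmetic. The sketch below assumes a hypothetical 80%/20% split between a large-job queue and a small-job queue, and it assumes that each queue is strictly limited to its share. Real Capacity Scheduler queues can borrow idle capacity, so the numbers only illustrate the direction of the effect.

```python
def finish_time(start, work, capacity_fraction):
    """Finish time of a job that needs `work` whole-cluster seconds and runs
    on a fixed fraction of the cluster from `start` onwards."""
    return start + work / capacity_fraction


# Hypothetical workload, measured in "whole-cluster seconds" of work.
LARGE_WORK, SMALL_WORK = 10.0, 0.5
SMALL_ARRIVAL = 1.0

# FIFO: the large job owns the cluster; the small job starts only afterwards.
fifo_large = finish_time(0.0, LARGE_WORK, 1.0)
fifo_small = finish_time(fifo_large, SMALL_WORK, 1.0)

# Dedicated queues (assumed split): 80% for large jobs, 20% for small jobs.
cap_large = finish_time(0.0, LARGE_WORK, 0.8)
cap_small = finish_time(SMALL_ARRIVAL, SMALL_WORK, 0.2)

print(f"FIFO:     large={fifo_large:.1f}  small={fifo_small:.1f}")
print(f"Capacity: large={cap_large:.1f}  small={cap_small:.1f}")
```

Under these assumptions the small job finishes much earlier with a dedicated queue (about 3.5 instead of 10.5), while the large job finishes later (about 12.5 instead of 10.0).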


1.3 Fair Scheduler

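Fair Scheduler uses configurable parameters to give all applications a fair share of resources, and it can work across multiple queues. Consider the following scenario: two users, A and B, each have their own queue. When A starts a job and B has none, A receives all of the cluster's resources. When B then starts a job, A's job keeps running, but after a short while the two jobs each hold half of the cluster. If B now starts a second job while the other jobs are still running, the new job shares queue B's resources with B's first job. Each of B's two jobs then uses a quarter of the cluster while A's job still uses half, so the resources end up shared equally between A and B.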

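Fair Scheduler adjusts resources dynamically across all running jobs, so nothing has to be reserved in advance. When a large job is submitted and it is the only job running, it receives all of the cluster's resources. When a second, small job is submitted, Fair Scheduler assigns half of the resources to it so that the two jobs share the cluster fairly.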
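The shares in the A/B scenario can be reproduced with a few lines of Python. This is a minimal sketch that assumes equal queue weights and equal sharing inside each queue, and it ignores the time the scheduler needs to rebalance. The queue and job names are hypothetical.

```python
from fractions import Fraction


def fair_shares(queues):
    """Split the cluster equally among active queues, then equally among
    the running jobs inside each queue.

    queues: {queue_name: [job_name, ...]}; queues with no jobs get nothing.
    Returns {job_name: Fraction of the whole cluster}.
    """
    active = {name: jobs for name, jobs in queues.items() if jobs}
    shares = {}
    for jobs in active.values():
        queue_share = Fraction(1, len(active))
        for job in jobs:
            shares[job] = queue_share / len(jobs)
    return shares


print(fair_shares({"A": ["a1"], "B": []}))            # only A is active
print(fair_shares({"A": ["a1"], "B": ["b1"]}))        # B submits a job
print(fair_shares({"A": ["a1"], "B": ["b1", "b2"]}))  # B submits a second job
```

The three calls yield a full share for `a1`, then one half each for `a1` and `b1`, and finally one half for `a1` with one quarter each for `b1` and `b2`.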


1.4 Differences Between Fair Scheduler and Capacity Scheduler

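Compared with Capacity Scheduler, Fair Scheduler has the following characteristics:
(1) Fair sharing of resources: resources are divided evenly; by default, every queue uses this policy.
(2) Support for resource preemption: when a queue has spare resources, the scheduler lends them to other queues; when a new application is submitted to that queue, the scheduler has to reclaim resources for it. To avoid wasting work unnecessarily, the scheduler waits first and reclaims by force only afterwards. If resources are still not returned after the waiting period, it preempts them by killing some tasks in the queues that are using more than their share, which frees those resources.
(3) Load balancing: Fair Scheduler provides a load-balancing mechanism based on the number of jobs that tries to spread tasks as evenly as possible; users can also design their own load-balancing mechanism if needed.
(4) Flexible scheduling policies: administrators can set a scheduling policy for each queue individually.
(5) Better response time for small jobs: small jobs can obtain resources and start running quickly.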
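The wait-then-reclaim policy in item (2) can be sketched as follows. This is a simplified illustration, not Fair Scheduler's actual algorithm: the function name, the single numeric resource unit, and the rule of reclaiming from the most over-allocated queue first are all assumptions made for the example.

```python
def plan_preemption(usage, fair_share, starved_queue, starved_for, timeout):
    """Return {queue: amount_to_reclaim} on behalf of a starved queue.

    usage / fair_share: {queue: resource units}. starved_for and timeout are
    in seconds. Nothing is reclaimed until the queue has waited `timeout`.
    """
    if starved_for < timeout:
        return {}  # wait first: running tasks may release resources by themselves
    needed = fair_share[starved_queue] - usage[starved_queue]
    # Queues above their fair share, largest excess first.
    over = sorted(
        ((q, usage[q] - fair_share[q]) for q in usage if usage[q] > fair_share[q]),
        key=lambda item: item[1],
        reverse=True,
    )
    plan = {}
    for queue, excess in over:
        if needed <= 0:
            break
        take = min(excess, needed)  # never push a queue below its fair share
        plan[queue] = take
        needed -= take
    return plan


# Hypothetical state: queue A has borrowed idle resources, queue C is starved.
usage = {"A": 80, "B": 20, "C": 0}
fair_share = {"A": 40, "B": 30, "C": 30}

print(plan_preemption(usage, fair_share, "C", starved_for=10, timeout=30))  # {}
print(plan_preemption(usage, fair_share, "C", starved_for=45, timeout=30))  # {'A': 30}
```

Before the timeout expires the plan is empty; afterwards only queue A, which is above its fair share, gives resources back.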

2. Configuring YARN Schedulers

2.1 FIFO Scheduler

Set the scheduler class in yarn-site.xml:

<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler</value>
</property>

2.2 Capacity Scheduler

(1) Set the scheduler class in yarn-site.xml:

<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>

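Capacity Scheduler lets multiple organizations share one cluster, with each organization receiving a portion of the cluster's computing capacity. Each organization is given its own queue, and each queue is assigned a share of the cluster's resources, so that a single cluster with several queues can serve several organizations. Within a queue, applications are scheduled with a FIFO policy.
(2) Set the scheduler parameters in capacity-scheduler.xml: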

<property>
  <name>yarn.scheduler.capacity.maximum-applications</name>
  <value>10000</value>
  <description>Maximum number of applications that can be pending and running.</description>
</property>

<property>
  <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
  <value>0.1</value>
  <description>Maximum percent of resources in the cluster which can be used to run application masters i.e. controls number of concurrent running applications.</description>
</property>

<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
  <description>The ResourceCalculator implementation to be used to compare Resources in the scheduler. The default i.e. DefaultResourceCalculator only uses Memory while DominantResourceCalculator uses dominant-resource to compare multi-dimensional resources such as Memory, CPU etc.</description>
</property>

<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>default</value>
  <description>The queues at this level (root is the root queue).</description>
</property>

<property>
  <name>yarn.scheduler.capacity.root.default.capacity</name>
  <value>100</value>
  <description>Default queue target capacity.</description>
</property>

<property>
  <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
  <value>1</value>
  <description>Default queue user limit a percentage from 0.0 to 1.0.</description>
</property>

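For Capacity Scheduler, the queue name that an application specifies must be the last component of the queue's path in the queue tree. For example, in a hierarchy that contains the queues root.prod and root.dev.eng (the configuration above defines only the default queue), prod and eng are valid queue names, whereas root.dev.eng and dev.eng are not.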
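The naming rule can be sketched in Python. The hierarchy below (prod and dev under root, eng and science under dev) is hypothetical and stands in for the `yarn.scheduler.capacity.<queue-path>.queues` properties; the helper functions are illustrative and not part of any Hadoop API.

```python
def leaf_queues(hierarchy, parent="root"):
    """Yield the full dotted path of every leaf queue under `parent`.

    hierarchy maps a queue path to the names of its child queues, mirroring
    the yarn.scheduler.capacity.<queue-path>.queues properties.
    """
    for child in hierarchy.get(parent, []):
        path = f"{parent}.{child}"
        if path in hierarchy:
            yield from leaf_queues(hierarchy, path)  # parent queue: descend
        else:
            yield path  # no children declared: this is a leaf queue


def submit_names(hierarchy):
    """Map the short name used at submission time to the full queue path."""
    return {path.rsplit(".", 1)[-1]: path for path in leaf_queues(hierarchy)}


# Hypothetical hierarchy: root -> prod, dev; dev -> eng, science.
hierarchy = {"root": ["prod", "dev"], "root.dev": ["eng", "science"]}
names = submit_names(hierarchy)

print(names)
print("eng" in names)           # True: last component of root.dev.eng
print("root.dev.eng" in names)  # False: a full path is not a valid name here
```

Only the last path component appears as a key, which matches the rule that prod and eng are accepted while root.dev.eng and dev.eng are not.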

2.3 Fair Scheduler Configuration

(1) Set the scheduler parameters in yarn-site.xml:

<!-- scheduler start -->
    <property>
        <name>yarn.resourcemanager.scheduler.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
        <description>Class name of the scheduler plug-in used by YARN; for Fair Scheduler this is org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</description>
    </property>
    <property>
        <name>yarn.scheduler.fair.allocation.file</name>
        <value>/etc/hadoop/conf/fair-scheduler.xml</value>
        <description>Path (on the local file system) of the XML file that defines the resource pools and their quotas</description>
    </property>
    <property>
        <name>yarn.scheduler.fair.preemption</name>
        <value>true</value>
        <description>Enables resource preemption. The default is false.</description>
    </property>
    <property>
        <name>yarn.scheduler.fair.user-as-default-queue</name>
        <value>true</value>
        <description>When true, an application that does not specify a resource pool is placed in a pool named after the submitting user, so pools are assigned automatically by user name. The default is true.</description>
    </property>
    <property>
        <name>yarn.scheduler.fair.allow-undeclared-pools</name>
        <value>false</value>
        <description>Whether undeclared resource pools may be created. When true, YARN automatically creates a pool that an application names but that has not been defined. When false, such a pool name is ignored and the application is placed in the default pool. The default is true.</description>
    </property>
    <!-- scheduler end -->

(2) Configure each queue's resource limits, weight, and related settings in fair-scheduler.xml:

<?xml version="1.0"?>
<allocations>
    <queue name="root">
        <aclSubmitApps></aclSubmitApps>
        <aclAdministerApps></aclAdministerApps>
        <queue name="production">
            <minResources>8192mb,8vcores</minResources>
            <maxResources>419840mb,125vcores</maxResources>
            <maxRunningApps>60</maxRunningApps>
            <schedulingMode>fair</schedulingMode>
            <weight>7.5</weight>
            <aclSubmitApps>*</aclSubmitApps>
            <aclAdministerApps>production</aclAdministerApps>
        </queue>
        <queue name="spark">
            <minResources>8192mb,8vcores</minResources>
            <maxResources>376480mb,110vcores</maxResources>
            <maxRunningApps>50</maxRunningApps>
            <schedulingMode>fair</schedulingMode>
            <weight>1</weight>
            <aclSubmitApps>*</aclSubmitApps>
            <aclAdministerApps>spark</aclAdministerApps>
        </queue>
        <queue name="default">
            <minResources>8192mb,8vcores</minResources>
            <maxResources>202400mb,20vcores</maxResources>
            <maxRunningApps>20</maxRunningApps>
            <schedulingMode>FIFO</schedulingMode>
            <weight>0.5</weight>
            <aclSubmitApps>*</aclSubmitApps>
            <aclAdministerApps>*</aclAdministerApps>
        </queue>
        <queue name="streaming">
            <minResources>8192mb,8vcores</minResources>
            <maxResources>69120mb,16vcores</maxResources>
            <maxRunningApps>20</maxRunningApps>
            <schedulingMode>fair</schedulingMode>
            <aclSubmitApps>*</aclSubmitApps>
            <weight>1</weight>
            <aclAdministerApps>streaming</aclAdministerApps>
        </queue>
    </queue>
    <user name="production">
        <!-- Per-user setting: maximum number of applications the user production may run at the same time -->
        <maxRunningApps>100</maxRunningApps>
    </user>
    <user name="default">
        <!-- Per-user setting: maximum number of applications the user default may run at the same time -->
        <maxRunningApps>10</maxRunningApps>
    </user>

    <!-- Default maximum number of applications that any user without an explicit setting may run at the same time -->
    <userMaxAppsDefault>50</userMaxAppsDefault>
    <queuePlacementPolicy>
        <rule name="specified"/> 
        <rule name="primaryGroup" create="false" />
        <rule name="secondaryGroupExistingQueue" create="false" />
        <rule name="default" queue="default"/>
    </queuePlacementPolicy>
</allocations>
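The `weight` elements determine how the cluster is divided when all queues have demand. The sketch below computes the ideal proportions for the four weights in this allocation file. It deliberately ignores `minResources`, `maxResources`, and actual demand, all of which affect the real allocation, so the percentages are only an approximation of what the scheduler aims for.

```python
from fractions import Fraction


def weighted_shares(weights):
    """Ideal share of each queue: its weight divided by the sum of weights."""
    total = sum(weights.values())
    return {queue: weight / total for queue, weight in weights.items()}


# Weights copied from the allocation file, written as exact fractions.
weights = {
    "production": Fraction(15, 2),  # 7.5
    "spark": Fraction(1),
    "default": Fraction(1, 2),      # 0.5
    "streaming": Fraction(1),
}

shares = weighted_shares(weights)
for queue, share in shares.items():
    print(f"{queue}: {float(share):.0%}")
```

The weights sum to 10, so production is entitled to roughly 75% of the cluster, spark and streaming to 10% each, and default to 5%.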

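Fair Scheduler uses a rule-based system to decide which queue an application is placed in. In the fair-scheduler.xml example above, the queuePlacementPolicy element defines a list of rules that are tried one by one until one of them matches. The first rule, specified, places the application in the queue it asked for; if the application did not name a queue, or the named queue does not exist, the rule does not match and the next rule is tried. The primaryGroup rule tries to place the application in a queue named after the user's primary Unix group; if no such queue exists, it is not created (create="false") and the next rule is tried. The secondaryGroupExistingQueue rule works the same way for the user's secondary groups. When none of the preceding rules match, the default rule applies and the application is placed in the default queue.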
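The rule chain can be sketched as a plain function. This follows the behaviour described in the text for this particular policy (no rule creates queues); it is not the scheduler's implementation, and the function name, parameters, and group names are hypothetical.

```python
def place(app_queue, primary_group, secondary_groups, existing_queues):
    """Return the queue chosen by the four rules, tried in order."""
    # Rule "specified": use the queue named by the application, if it exists.
    if app_queue and app_queue in existing_queues:
        return app_queue
    # Rule "primaryGroup" with create="false": only an existing queue matches.
    if primary_group in existing_queues:
        return primary_group
    # Rule "secondaryGroupExistingQueue": first secondary group with a queue.
    for group in secondary_groups:
        if group in existing_queues:
            return group
    # Rule "default": catch-all when nothing else matched.
    return "default"


queues = {"production", "spark", "default", "streaming"}

print(place("spark", "analysts", [], queues))                     # spark
print(place(None, "production", [], queues))                      # production
print(place("missing", "analysts", ["ops", "streaming"], queues)) # streaming
print(place(None, "analysts", ["ops"], queues))                   # default
```

Each call falls through one more rule than the previous one, ending with the catch-all default queue.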

