Hadoop Fair Scheduler and Capacity Scheduler both fail to schedule as expected

Posted by 8tntrjer on 2021-06-04 in Hadoop

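I am using CDH 5.1.0 (Hadoop 2.3.0), with 2 name nodes (2x 32 GB RAM, 2 cores) and 3 data nodes (3x 16 GB RAM, 2 cores).
I am scheduling MapReduce jobs from a single user in the default queue (there are no other users and no other queues configured).
With the Capacity Scheduler, the following happens: I can submit multiple jobs, but only two jobs execute in parallel (status "running").
With the Fair Scheduler, the following happens: I submit multiple jobs and the cluster/scheduler sets 4 jobs to status "running". These jobs stay at 5% progress forever. If a single job is killed, a new job is set to "running" at 5%, again with no further progress. Jobs only start executing once there are fewer than 4 jobs and no further jobs are submitted to the queue.
I have reconfigured the cluster several times, but I could neither increase the number of running jobs with the Capacity Scheduler nor avoid the hanging jobs with the Fair Scheduler.
My question is: how do I configure the cluster / YARN / scheduler / dynamic and static resource pools to get scheduling to work?
Here are some of the configuration parameters: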

yarn.scheduler.minimum-allocation-mb = 2GB
yarn.scheduler.maximum-allocation-mb = 12GB
yarn.scheduler.minimum-allocation-vcores = 1
yarn.scheduler.maximum-allocation-vcores = 2
yarn.nodemanager.resource.memory-mb = 12GB
yarn.nodemanager.resource.cpu-vcores  = 2
mapreduce.map.memory.mb = 12GB
mapreduce.reduce.memory.mb = 12GB
mapreduce.map.java.opts.max.heap = 9.6GB
mapreduce.reduce.java.opts.max.heap = 9.6GB
yarn.app.mapreduce.am.resource.mb = 12GB
ApplicationMaster Java Maximum Heap Size = 788MB
mapreduce.task.io.sort.mb = 1GB

I left the static and dynamic resource pools at their default (Cloudera) settings (for example, Max Running Apps is left empty).

hadoop yarn scheduler cloudera Configuration

Source: https://stackoverflow.com/questions/24991793/both-hadoop-fair-scheduler-and-capacity-scheduler-not-scheduling-as-expected

2 Answers

n53p2ov01#

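Not a solution, but a possible workaround.
At some point we discussed this issue with Christian Neundorf from MapR consulting, and he claimed there is a deadlock bug in the FairScheduler (not CDH-specific, but in standard Hadoop!).
He proposed this workaround, but I can't remember whether we ever tried it. Use it at your own risk: I make no guarantee that it actually works, and I am posting it only for those of you who are really desperate and willing to try anything to get your application working.
In yarn-site.xml (I don't know why this has to be set):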

<property>
    <name>yarn.scheduler.fair.user-as-default-queue</name>
    <value>false</value>
    <description>Disable username for default queue </description>
</property>

In fair-scheduler.xml:

<allocations>
    <queue name="default">
        <!-- set an integer here: the number of cores at your disposal minus one (or more) -->
        <maxRunningApps>number of cores - 1</maxRunningApps>
    </queue>
</allocations>
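As a rough illustration of the placeholder above, the following sketch computes a concrete maxRunningApps value and renders the allocation file. It assumes that "number of cores" means the total vcores across all NodeManagers (node count times yarn.nodemanager.resource.cpu-vcores); the answer does not say this explicitly. The function name `build_allocations` and its parameters are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_allocations(nodes: int, vcores_per_node: int, reserve: int = 1) -> str:
    """Render a fair-scheduler.xml body with maxRunningApps = total vcores - reserve.

    Assumption: "cores at your disposal" is interpreted as the sum of
    vcores over all NodeManagers. This is an interpretation, not a rule
    taken from the Hadoop documentation.
    """
    total_vcores = nodes * vcores_per_node
    # Never go below 1, otherwise no application could run at all.
    max_apps = max(1, total_vcores - reserve)

    allocations = ET.Element("allocations")
    queue = ET.SubElement(allocations, "queue", {"name": "default"})
    ET.SubElement(queue, "maxRunningApps").text = str(max_apps)
    return ET.tostring(allocations, encoding="unicode")


if __name__ == "__main__":
    # Cluster from the question: 3 data nodes with 2 vcores each.
    print(build_allocations(nodes=3, vcores_per_node=2))
```

On the cluster from the question this yields a limit of 5. If containers are sized so that memory, not vcores, limits how many fit on a node, a smaller value may be needed.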

Posted 2021-06-04

nwsw7zdq2#

Reduce these parameters:

mapreduce.map.memory.mb
mapreduce.reduce.memory.mb
yarn.app.mapreduce.am.resource.mb

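to 6 GB (and reduce the heap sizes accordingly).
With the current configuration you can only run three containers (one per node).
A YARN job needs at least two containers to run (one container for the ApplicationMaster and another for a map or reduce task). So when you launch three ApplicationMasters for three different jobs, you can easily end up in a situation where no containers are left to do the actual map/reduce processing, and the jobs hang there forever.
Also, you should limit the number of applications that can run in parallel on the cluster to 2 or 3 (since you don't have that many resources).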
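The container arithmetic in this answer can be checked with a small sketch. It is a simplification: it assumes every container (ApplicationMaster and task alike) requests the same amount of memory, and it ignores vcores and any scheduler-specific rounding of requests. The helper names `cluster_slots` and `free_task_slots` are hypothetical.

```python
GB = 1024  # work in MB, like the yarn.* and mapreduce.* memory settings


def cluster_slots(nodes: int, node_mb: int, container_mb: int) -> int:
    """Number of equally sized containers that fit in the whole cluster."""
    return nodes * (node_mb // container_mb)


def free_task_slots(nodes: int, node_mb: int, container_mb: int, running_apps: int) -> int:
    """Containers left for map/reduce tasks after each app takes one AM container."""
    slots = cluster_slots(nodes, node_mb, container_mb)
    am_slots = min(running_apps, slots)  # apps beyond the slot count get no AM
    return slots - am_slots


if __name__ == "__main__":
    # 3 NodeManagers with yarn.nodemanager.resource.memory-mb = 12 GB each.
    for container_gb in (12, 6):
        for apps in (1, 2, 3):
            free = free_task_slots(3, 12 * GB, container_gb * GB, apps)
            print(f"{container_gb} GB containers, {apps} apps: {free} task slots free")
```

With 12 GB containers, three concurrent ApplicationMasters leave zero slots for tasks, which matches the hang described above. With 6 GB containers the same three applications still leave three slots free.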

Posted 2021-06-04
