Spark Schedulers: FIFO and FAIR
阿新 • Published: 2020-07-05
Spark has two main scheduling modes: FIFO and FAIR. By default Spark uses FIFO (first in, first out): whichever job is submitted first runs first, and later jobs must wait for earlier ones to finish. FAIR (fair scheduling) mode supports grouping tasks into scheduling pools; different pools can be given different weights, and tasks run in an order determined by those weights. The mode is selected with the spark.scheduler.mode property, whose valid values are FAIR and FIFO.
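For example, the mode can be switched in a spark-defaults.conf file, or equivalently via --conf on spark-submit (a plain config fragment, shown here for illustration):

```
# conf/spark-defaults.conf  (or: spark-submit --conf spark.scheduler.mode=FAIR ...)
spark.scheduler.mode  FAIR
```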
1. Comparing pool construction
FIFO: buildPools creates no pools; the scheduling pool is left empty.
FAIR: overrides the buildPools method to read the file at the default path $SPARK_HOME/conf/fairscheduler.xml; the addTaskSetManager method then places each TaskSetManager into the corresponding child pool under rootPool.
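A fairscheduler.xml along the following lines defines the pools that buildPools constructs under rootPool; the pool names here ("production", "test") and the specific weight/minShare values are illustrative:

```xml
<?xml version="1.0"?>
<allocations>
  <pool name="production">
    <!-- Tasks inside this pool are themselves scheduled fairly. -->
    <schedulingMode>FAIR</schedulingMode>
    <!-- Relative share of resources versus other pools. -->
    <weight>2</weight>
    <!-- Minimum number of cores this pool should get before others. -->
    <minShare>3</minShare>
  </pool>
  <pool name="test">
    <schedulingMode>FIFO</schedulingMode>
    <weight>1</weight>
    <minShare>0</minShare>
  </pool>
</allocations>
```

A job can then be routed to a pool by setting the local property spark.scheduler.pool (e.g. sc.setLocalProperty("spark.scheduler.pool", "production")) on the submitting thread; jobs with no pool set go to the default pool.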
2. Comparing scheduling algorithms
FIFO:
FIFO's comparator is easy to follow: it first compares priority (the job ID), and on a tie compares stageId; the smaller value runs first.
This also makes intuitive sense: the stage with the smaller ID is generally at the bottom of the recursion, i.e. it was submitted to the scheduling pool first.
private[spark] class FIFOSchedulingAlgorithm extends SchedulingAlgorithm {
  override def comparator(s1: Schedulable, s2: Schedulable): Boolean = {
    val priority1 = s1.priority
    val priority2 = s2.priority
    var res = math.signum(priority1 - priority2)
    if (res == 0) {
      val stageId1 = s1.stageId
      val stageId2 = s2.stageId
      res = math.signum(stageId1 - stageId2)
    }
    if (res < 0) {
      true
    } else {
      false
    }
  }
}
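To see the ordering this produces, here is a minimal standalone sketch; Job is a simplified stand-in for Spark's Schedulable (only the two fields FIFO looks at), not an actual Spark type:

```scala
// Simplified stand-in for Schedulable with just the fields FIFO compares.
case class Job(priority: Int, stageId: Int, name: String)

// Same comparison logic as FIFOSchedulingAlgorithm.comparator:
// lower priority (job ID) first, then lower stageId on a tie.
def fifoLess(s1: Job, s2: Job): Boolean = {
  var res = math.signum(s1.priority - s2.priority)
  if (res == 0) {
    res = math.signum(s1.stageId - s2.stageId)
  }
  res < 0
}

val jobs = List(Job(2, 5, "b"), Job(1, 7, "a"), Job(1, 3, "c"))
val ordered = jobs.sortWith(fifoLess)
// ordered: Job(1,3,"c"), Job(1,7,"a"), Job(2,5,"b")
```

Job 1 runs ahead of job 2 regardless of stage IDs, and within job 1 the smaller stageId comes first.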
FAIR:
FAIR mode is a little more involved, but still easy to follow:
1. First check whether each pool is "needy", i.e. its runningTasks count is below its minShare; a needy pool is scheduled before a non-needy one, and if both are needy, the one with the smaller runningTasks/minShare ratio goes first.
2. Otherwise, compare the runningTasks-to-weight ratios; the smaller ratio wins, so with equal load the pool with the larger weight runs first.
3. If everything above is equal, compare the names as strings; the lexicographically smaller name runs first.
private[spark] class FairSchedulingAlgorithm extends SchedulingAlgorithm {
  override def comparator(s1: Schedulable, s2: Schedulable): Boolean = {
    val minShare1 = s1.minShare
    val minShare2 = s2.minShare
    val runningTasks1 = s1.runningTasks
    val runningTasks2 = s2.runningTasks
    val s1Needy = runningTasks1 < minShare1
    val s2Needy = runningTasks2 < minShare2
    val minShareRatio1 = runningTasks1.toDouble / math.max(minShare1, 1.0).toDouble
    val minShareRatio2 = runningTasks2.toDouble / math.max(minShare2, 1.0).toDouble
    val taskToWeightRatio1 = runningTasks1.toDouble / s1.weight.toDouble
    val taskToWeightRatio2 = runningTasks2.toDouble / s2.weight.toDouble
    var compare: Int = 0
    if (s1Needy && !s2Needy) {
      return true
    } else if (!s1Needy && s2Needy) {
      return false
    } else if (s1Needy && s2Needy) {
      compare = minShareRatio1.compareTo(minShareRatio2)
    } else {
      compare = taskToWeightRatio1.compareTo(taskToWeightRatio2)
    }
    if (compare < 0) {
      true
    } else if (compare > 0) {
      false
    } else {
      s1.name < s2.name
    }
  }
}
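A concrete walkthrough may help. The sketch below reimplements the same comparison on a simplified PoolInfo stand-in (not Spark's actual Pool class); all pool names and numbers are made up for illustration:

```scala
// Simplified stand-in for a Spark pool with just the fields FAIR compares.
case class PoolInfo(name: String, runningTasks: Int, minShare: Int, weight: Int)

// Same logic as FairSchedulingAlgorithm.comparator.
def fairLess(s1: PoolInfo, s2: PoolInfo): Boolean = {
  val s1Needy = s1.runningTasks < s1.minShare
  val s2Needy = s2.runningTasks < s2.minShare
  if (s1Needy && !s2Needy) return true
  if (!s1Needy && s2Needy) return false
  val compare =
    if (s1Needy && s2Needy) {
      // Both starved: smaller runningTasks/minShare ratio goes first.
      (s1.runningTasks.toDouble / math.max(s1.minShare, 1))
        .compareTo(s2.runningTasks.toDouble / math.max(s2.minShare, 1))
    } else {
      // Both satisfied: smaller runningTasks/weight ratio goes first.
      (s1.runningTasks.toDouble / s1.weight)
        .compareTo(s2.runningTasks.toDouble / s2.weight)
    }
  if (compare < 0) true
  else if (compare > 0) false
  else s1.name < s2.name // final tiebreak: lexicographically smaller name
}

// "starved" is below its minShare, so it is scheduled before "busy",
// even though "busy" has the larger weight.
val busy    = PoolInfo("busy",    runningTasks = 10, minShare = 2, weight = 4)
val starved = PoolInfo("starved", runningTasks = 1,  minShare = 2, weight = 1)
assert(fairLess(starved, busy))

// Both pools satisfied: the tasks/weight ratio decides.
// "heavy" has 8/4 = 2.0 versus "light"'s 3/1 = 3.0, so "heavy" goes first.
val heavy = PoolInfo("heavy", runningTasks = 8, minShare = 2, weight = 4)
val light = PoolInfo("light", runningTasks = 3, minShare = 2, weight = 1)
assert(fairLess(heavy, light))
```

Note that minShare trumps weight: a pool below its minimum share jumps the queue first, and weight only matters once every pool has its minimum.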