Spark 1.6 Source Code Reading: DAGScheduler
This post was written purely by clicking through the code; as you read, I hope you follow along in the source the same way.
The DAGScheduler mainly does the preparation work before tasks are formally handed to TaskSchedulerImpl: it creates jobs, splits the DAG of RDDs into stages, and submits those stages.
SparkContext line 525 creates the DAGScheduler:
_dagScheduler = new DAGScheduler(this)
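To see where this object comes into play, here is a minimal driver-side sketch (my own example, not Spark source): an action such as count() goes through SparkContext.runJob, which delegates to dagScheduler.runJob, and that in turn posts a JobSubmitted event onto the event loop described below.

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical demo application (not part of the Spark source).
object DagSchedulerDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("dag-scheduler-demo").setMaster("local[2]")
    val sc = new SparkContext(conf)
    // count() is an action: it calls SparkContext.runJob, which hands the job to the
    // DAGScheduler created above, and a JobSubmitted event ends up on its event loop.
    val n = sc.parallelize(1 to 100, 4).map(_ * 2).count()
    println(s"count = $n")
    sc.stop()
  }
}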
DAGScheduler line 133 onwards defines the main data structures it maintains.
They mainly track the relationship between jobIds and stageIds, the Stages and ActiveJobs themselves, and the cached locations of RDD partitions:
private[spark] val metricsSource: DAGSchedulerSource = new DAGSchedulerSource(this)

private[scheduler] val nextJobId = new AtomicInteger(0)
private[scheduler] def numTotalJobs: Int = nextJobId.get()
private val nextStageId = new AtomicInteger(0)

private[scheduler] val jobIdToStageIds = new HashMap[Int, HashSet[Int]]
private[scheduler] val stageIdToStage = new HashMap[Int, Stage]
private[scheduler] val shuffleToMapStage = new HashMap[Int, ShuffleMapStage]
private[scheduler] val jobIdToActiveJob = new HashMap[Int, ActiveJob]

// Stages we need to run whose parents aren't done
private[scheduler] val waitingStages = new HashSet[Stage]

// Stages we are running right now
private[scheduler] val runningStages = new HashSet[Stage]

// Stages that must be resubmitted due to fetch failures
private[scheduler] val failedStages = new HashSet[Stage]

private[scheduler] val activeJobs = new HashSet[ActiveJob]

/**
 * Contains the locations that each RDD's partitions are cached on. This map's keys are RDD ids
 * and its values are arrays indexed by partition numbers. Each array value is the set of
 * locations where that RDD partition is cached.
 *
 * All accesses to this map should be guarded by synchronizing on it (see SPARK-4454).
 */
private val cacheLocs = new HashMap[Int, IndexedSeq[Seq[TaskLocation]]]

// For tracking failed nodes, we use the MapOutputTracker's epoch number, which is sent with
// every task. When we detect a node failing, we note the current epoch number and failed
// executor, increment it for new tasks, and use this to ignore stray ShuffleMapTask results.
//
// TODO: Garbage collect information about failure epochs when we know there are no more
// stray messages to detect.
private val failedEpoch = new HashMap[String, Long]

private[scheduler] val outputCommitCoordinator = env.outputCommitCoordinator

// A closure serializer that we reuse.
// This is only safe because DAGScheduler runs in a single thread.
private val closureSerializer = SparkEnv.get.closureSerializer.newInstance()
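To make this bookkeeping concrete, here is a toy sketch (mine, not Spark code; the field names simply mirror the ones above) of how jobIdToStageIds plus the waiting/running stage sets could track a job made of a ShuffleMapStage and a ResultStage:

import scala.collection.mutable.{HashMap, HashSet}

// Toy model of the DAGScheduler bookkeeping above (illustration only).
object StageBookkeepingSketch {
  val jobIdToStageIds = new HashMap[Int, HashSet[Int]]
  val waitingStages   = new HashSet[Int]   // stages whose parents aren't done
  val runningStages   = new HashSet[Int]   // stages currently running

  def main(args: Array[String]): Unit = {
    // A hypothetical job 0 made of a ShuffleMapStage (id 0) and a ResultStage (id 1).
    jobIdToStageIds.getOrElseUpdate(0, new HashSet[Int]) ++= Seq(0, 1)

    // The result stage waits until its parent shuffle stage finishes.
    runningStages += 0
    waitingStages += 1

    // Parent stage 0 completes, so stage 1 moves from waiting to running.
    runningStages -= 0
    waitingStages -= 1
    runningStages += 1

    println(s"job 0 -> stages ${jobIdToStageIds(0)}, running = $runningStages")
  }
}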
DAGScheduler line 184 creates the DAGSchedulerEventProcessLoop, which is responsible for receiving and processing events:
private[scheduler] val eventProcessLoop = new DAGSchedulerEventProcessLoop(this)
DAGScheduler line 1588 is the concrete implementation of DAGSchedulerEventProcessLoop:
private[scheduler] class DAGSchedulerEventProcessLoop(dagScheduler: DAGScheduler)
  extends EventLoop[DAGSchedulerEvent]("dag-scheduler-event-loop") with Logging {
This class extends EventLoop, so let's look at the concrete implementation of EventLoop:
private val eventThread = new Thread(name) {
  // note that this is a daemon thread
  setDaemon(true)

  override def run(): Unit = {
    try {
      // loop until stopped
      while (!stopped.get) {
        // take an event from the queue and process it; block if the queue is empty
        val event = eventQueue.take()
        try {
          onReceive(event)
        } catch {
          case NonFatal(e) => {
            try {
              onError(e)
            } catch {
              case NonFatal(e) => logError("Unexpected error in " + name, e)
            }
          }
        }
      }
    } catch {
      case ie: InterruptedException => // exit even if eventQueue is not empty
      case NonFatal(e) => logError("Unexpected error in " + name, e)
    }
  }
}
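The pattern is a classic single-threaded event loop: producers call post() to put events on a blocking queue, and one daemon thread drains the queue and hands each event to onReceive. A stripped-down sketch of that pattern (my own simplification, not the real EventLoop class) looks like this:

import java.util.concurrent.LinkedBlockingDeque
import java.util.concurrent.atomic.AtomicBoolean
import scala.util.control.NonFatal

// Minimal event-loop sketch in the style of Spark's EventLoop (illustration only).
abstract class SimpleEventLoop[E](name: String) {
  private val eventQueue = new LinkedBlockingDeque[E]()
  private val stopped = new AtomicBoolean(false)

  private val eventThread = new Thread(name) {
    setDaemon(true)                          // daemon thread, like the real eventThread
    override def run(): Unit = {
      try {
        while (!stopped.get) {
          val event = eventQueue.take()      // blocks until an event is available
          try {
            onReceive(event)
          } catch {
            case NonFatal(e) => onError(e)
          }
        }
      } catch {
        case _: InterruptedException =>      // stop() interrupts the thread; exit quietly
      }
    }
  }

  def start(): Unit = eventThread.start()
  def stop(): Unit = { stopped.set(true); eventThread.interrupt() }
  def post(event: E): Unit = eventQueue.put(event)

  protected def onReceive(event: E): Unit
  protected def onError(e: Throwable): Unit = e.printStackTrace()
}

DAGSchedulerEventProcessLoop plugs into this scheme by overriding onReceive, which delegates to the doOnReceive shown next, while DAGScheduler feeds it by posting events such as JobSubmitted.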
DAGScheduler line 1605 lists the events that DAGSchedulerEventProcessLoop can handle:
private def doOnReceive(event: DAGSchedulerEvent): Unit = event match {
case JobSubmitted(jobId, rdd, func, partitions, callSite, listener, properties) =>
dagScheduler.handleJobSubmitted(jobId, rdd, func, partitions, callSite, listener, properties)
case MapStageSubmitted(jobId, dependency, callSite, listener, properties) =>
dagScheduler.handleMapStageSubmitted(jobId, dependency, callSite, listener, properties)
case StageCancelled(stageId) =>
dagScheduler.handleStageCancellation(stageId)
case JobCancelled(jobId) =>
dagScheduler.handleJobCancellation(jobId)
case JobGroupCancelled(groupId) =>
dagScheduler.handleJobGroupCancelled(groupId)
case AllJobsCancelled =>
dagScheduler.doCancelAllJobs()
case ExecutorAdded(execId, host) =>
dagScheduler.handleExecutorAdded(execId, host)
case ExecutorLost(execId) =>
dagScheduler.handleExecutorLost(execId, fetchFailed = false)
case BeginEvent(task, taskInfo) =>
dagScheduler.handleBeginEvent(task, taskInfo)
case GettingResultEvent(taskInfo) =>
dagScheduler.handleGetTaskResult(taskInfo)
case completion @ CompletionEvent(task, reason, _, _, taskInfo, taskMetrics) =>
dagScheduler.handleTaskCompletion(completion)
case TaskSetFailed(taskSet, reason, exception) =>
dagScheduler.handleTaskSetFailed(taskSet, reason, exception)
case ResubmitFailedStages =>
dagScheduler.resubmitFailedStages()
}
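Tying the pieces together, the sketch below (again an illustration, with made-up event types rather than the real DAGSchedulerEvents) wires the SimpleEventLoop from the earlier sketch to a small pattern match in the style of doOnReceive: a producer posts an event, the daemon thread takes it off the queue, and the match picks the handler.

// Toy event types and a loop subclass, mirroring DAGSchedulerEvent / doOnReceive (illustration only).
sealed trait ToyEvent
case class ToyJobSubmitted(jobId: Int) extends ToyEvent
case class ToyJobCancelled(jobId: Int) extends ToyEvent

class ToyEventProcessLoop extends SimpleEventLoop[ToyEvent]("toy-event-loop") {
  override protected def onReceive(event: ToyEvent): Unit = event match {
    case ToyJobSubmitted(id) => println(s"handle submitted job $id")
    case ToyJobCancelled(id) => println(s"handle cancelled job $id")
  }
}

object ToyEventLoopDemo {
  def main(args: Array[String]): Unit = {
    val loop = new ToyEventProcessLoop
    loop.start()
    loop.post(ToyJobSubmitted(0))
    loop.post(ToyJobCancelled(0))
    Thread.sleep(200)   // give the daemon thread time to drain the queue
    loop.stop()
  }
}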
At this point the DAGScheduler has been fully constructed.
To summarize:
DAGScheduler maintains a set of data structures that track jobs and stages.
It creates a DAGSchedulerEventProcessLoop to receive and handle the various scheduling events.